exomiser / Exomiser

A Tool to Annotate and Prioritize Exome Variants
https://exomiser.readthedocs.io
GNU Affero General Public License v3.0

Variants not being passed into Exomiser #496

Open brendan-kearney opened 1 year ago

brendan-kearney commented 1 year ago

Hello,

I have been using Exomiser for some time but have a specific issue that is causing some confusion:

I have a short list of variants (1-3) that I want to pass into Exomiser and essentially get back in the output without any meaningful filtering (most likely these variants would fail standard filters such as inheritance, quality, etc.). For these, my config file looks something like this:

analysis:
    # hg19 or hg38 - ensure that the application has been configured to run the specified assembly otherwise it will halt.
    genomeAssembly: hg38
    vcf: trio.vcf.gz
    ped: trio.ped
    proband: proband_id
    hpoIds: []
    # These are the default settings, with values representing the maximum minor allele frequency in percent (%) permitted for an
    # allele to be considered as a causative candidate under that mode of inheritance.
    # If you just want to analyse a sample under a single inheritance mode, delete/comment-out the others. For AUTOSOMAL_RECESSIVE
    # or X_RECESSIVE ensure *both* relevant HOM_ALT and COMP_HET modes are present.
    # In cases where you do not want any cut-offs applied an empty map should be used e.g. inheritanceModes: {}
    inheritanceModes: {
            AUTOSOMAL_DOMINANT: 0.1,
            AUTOSOMAL_RECESSIVE_HOM_ALT: 0.1,
            AUTOSOMAL_RECESSIVE_COMP_HET: 2.0,
            #X_DOMINANT: 0.1,
            #X_RECESSIVE_HOM_ALT: 0.1,
            #X_RECESSIVE_COMP_HET: 2.0,
            #MITOCHONDRIAL: 0.2
    }
    #FULL or PASS_ONLY
    analysisMode: PASS_ONLY
    geneScoreMode: RAW_SCORE
    #Possible frequencySources:
    #Thousand Genomes project - http://www.1000genomes.org/ (THOUSAND_GENOMES)
    #TOPMed - https://www.nhlbi.nih.gov/science/precision-medicine-activities (TOPMED)
    #UK10K - http://www.uk10k.org/ (UK10K)
    #ESP project - http://evs.gs.washington.edu/EVS/ (ESP_)
    #   ESP_AFRICAN_AMERICAN, ESP_EUROPEAN_AMERICAN, ESP_ALL,
    #ExAC project http://exac.broadinstitute.org/about (EXAC_)
    #   EXAC_AFRICAN_INC_AFRICAN_AMERICAN, EXAC_AMERICAN,
    #   EXAC_SOUTH_ASIAN, EXAC_EAST_ASIAN,
    #   EXAC_FINNISH, EXAC_NON_FINNISH_EUROPEAN,
    #   EXAC_OTHER
    #gnomAD - http://gnomad.broadinstitute.org/ (GNOMAD_E, GNOMAD_G)
    frequencySources: [
        THOUSAND_GENOMES,
        TOPMED,
        UK10K,

        ESP_AFRICAN_AMERICAN, ESP_EUROPEAN_AMERICAN, ESP_ALL,

        EXAC_AFRICAN_INC_AFRICAN_AMERICAN, EXAC_AMERICAN,
        EXAC_SOUTH_ASIAN, EXAC_EAST_ASIAN,
        EXAC_FINNISH, EXAC_NON_FINNISH_EUROPEAN,
        EXAC_OTHER,

        GNOMAD_E_AFR,
        GNOMAD_E_AMR,
#        GNOMAD_E_ASJ,
        GNOMAD_E_EAS,
        GNOMAD_E_FIN,
        GNOMAD_E_NFE,
        GNOMAD_E_OTH,
        GNOMAD_E_SAS,

        GNOMAD_G_AFR,
        GNOMAD_G_AMR,
#        GNOMAD_G_ASJ,
        GNOMAD_G_EAS,
        GNOMAD_G_FIN,
        GNOMAD_G_NFE,
        GNOMAD_G_OTH,
        GNOMAD_G_SAS
        ]
    #Possible pathogenicitySources: POLYPHEN, MUTATION_TASTER, SIFT, CADD, REMM
    #REMM is trained on non-coding regulatory regions
    #*WARNING* if you enable CADD or REMM ensure that you have downloaded and installed the CADD/REMM tabix files
    #and updated their location in the application.properties. Exomiser will not run without this.
    pathogenicitySources: [POLYPHEN, MUTATION_TASTER, SIFT, REMM, CADD, REVEL, MVP, M_CAP, MPC]
    #this is the standard exomiser order.
    #all steps are optional
    steps: [
        #intervalFilter: {interval: 'chr10:123256200-123256300'},
        # or for multiple intervals:
        intervalFilter: {intervals: ['chr10:123256200-123256201', 'chr2:14535991-14535992', 'chr14:134253-134254']},
        # or using a BED file - NOTE this should be 0-based, Exomiser otherwise uses 1-based coordinates in line with VCF
        #intervalFilter: {bed: /full/path/to/bed_file.bed},
        #genePanelFilter: {geneSymbols: ['FGFR1','FGFR2']},
        #variantEffectFilter: {remove: [NON_CODING_TRANSCRIPT_EXON_VARIANT, THREE_PRIME_UTR_EXON_VARIANT, INTERGENIC_VARIANT, CODING_TRANSCRIPT_INTRON_VARIANT, NON_CODING_TRANSCRIPT_INTRON_VARIANT, UPSTREAM_GENE_VARIANT, FIVE_PRIME_UTR_EXON_VARIANT, FIVE_PRIME_UTR_INTRON_VARIANT, DOWNSTREAM_GENE_VARIANT, REGULATORY_REGION_VARIANT, SYNONYMOUS_VARIANT, THREE_PRIME_UTR_INTRON_VARIANT]},
        #failedVariantFilter: {},
        #regulatoryFeatureFilter: {},
        #qualityFilter: {minQuality: 50.0},
        #frequencyFilter: {maxFrequency: 0.1},
        pathogenicityFilter: {keepNonPathogenic: true},
        #inheritanceFilter: {},
        #omimPrioritiser: {},
        #hiPhivePrioritiser: {runParams: 'human'},
        #phenixPrioritiser: {},
    ]
outputOptions:
    outputContributingVariantsOnly: false
    #numGenes options: 0 = all or specify a limit e.g. 500 for the first 500 results
    numGenes: 0
    # Path to the desired output directory. Will default to the 'results' subdirectory of the exomiser install directory
    outputDirectory: output/trio-out
    # Filename for the output files. Will default to {input-vcf-filename}-exomiser
    #outputFileName: Pfeiffer-hiphive-genome-PASS_ONLY
    #out-format options: HTML, JSON, TSV_GENE, TSV_VARIANT, VCF (default: HTML)
    outputFormats: [HTML, TSV_VARIANT, VCF]

My issue is that even though no active filter should exclude variants, my output files are regularly missing variants (about 50% in the samples I've tested). My question is: are there conditions that a variant must meet to be analyzed by Exomiser, even with no filters activated? I can give a more detailed overview of the VCF annotations present if needed.

Thank you for the help.

brendan-kearney commented 1 year ago

Just a follow-up with more detail on what I think the issue is:

Looking at the genotypes of the specific variants I am interested in, they follow a pattern like: Proband - 0/0, Mother - 1/1, Father - 1/0. It makes sense now why Exomiser is not analyzing these, since the proband carries no alt allele, but is there any way I can set it up to at least process them? I've tried disabling inheritance modes (the config comment says "In cases where you do not want any cut-offs applied an empty map should be used e.g. inheritanceModes: {}"), but I don't notice any difference.
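
To check up front which input variants follow this pattern, something like the following could be used. This is only a minimal sketch and not part of Exomiser: the function names `has_alt_allele` and `variants_absent_from_proband` are hypothetical, it assumes the VCF is read as plain text lines with GT as the first FORMAT field, and it treats missing (`.`) calls as carrying no alt allele.

```python
import re


def has_alt_allele(sample_field: str) -> bool:
    """Return True if the sample's genotype carries any non-reference allele.

    Assumes GT is the first colon-separated FORMAT value, e.g. '0/1:20'.
    Missing calls ('.') are treated as not carrying an alt allele.
    """
    genotype = sample_field.split(":", 1)[0]
    alleles = re.split(r"[/|]", genotype)
    return any(allele not in ("0", ".") for allele in alleles)


def variants_absent_from_proband(vcf_lines, proband_id):
    """List (chrom, pos, ref, alt) for records where the proband has no alt allele."""
    proband_column = None
    absent = []
    for raw_line in vcf_lines:
        line = raw_line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # The column header row tells us where the proband's sample sits.
            proband_column = fields.index(proband_id)
            continue
        if proband_column is None:
            raise ValueError("Data line found before the #CHROM header line")
        if not has_alt_allele(fields[proband_column]):
            absent.append((fields[0], int(fields[1]), fields[3], fields[4]))
    return absent
```

For a bgzipped file such as the `trio.vcf.gz` in the config above, the lines could be supplied with `gzip.open("trio.vcf.gz", "rt")`, passing the `proband` value from the config as `proband_id`. Any variant the function returns is homozygous reference or uncalled in the proband.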

julesjacobsen commented 1 year ago

Yes, this behavior is hard-coded to remove alleles that are not relevant to the proband. What are you trying to use Exomiser for in this case? Is it purely to annotate the variants, or to use the phenotype analysis part too?