exomiser / Exomiser

A Tool to Annotate and Prioritize Exome Variants
https://exomiser.readthedocs.io
GNU Affero General Public License v3.0

ArrayIndexOutOfBoundsException in FrequencyData.maxFrequency() #565

Closed: pzweuj closed this issue 2 months ago

pzweuj commented 3 months ago

Stack Trace

Here is the stack trace from the exception:

java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 1
        at org.monarchinitiative.exomiser.core.model.frequency.FrequencyData.frequency(FrequencyData.java:169) ~[exomiser-core-14.0.0.jar:na]
        at org.monarchinitiative.exomiser.core.model.frequency.FrequencyData.maxFrequency(FrequencyData.java:327) ~[exomiser-core-14.0.0.jar:na]

Relevant Code Snippets

An ArrayIndexOutOfBoundsException is thrown when maxFrequency() passes maxSource = -1 to frequency(), which then accesses the underlying array at index -1.

The maxFrequency() method: https://github.com/exomiser/Exomiser/blob/ffb2da103ca54a9de03ceecfc70c8ddfcebba3fd/exomiser-core/src/main/java/org/monarchinitiative/exomiser/core/model/frequency/FrequencyData.java#L326-L328

The frequency() method: https://github.com/exomiser/Exomiser/blob/ffb2da103ca54a9de03ceecfc70c8ddfcebba3fd/exomiser-core/src/main/java/org/monarchinitiative/exomiser/core/model/frequency/FrequencyData.java#L168-L170

The findMaxSource() method: https://github.com/exomiser/Exomiser/blob/ffb2da103ca54a9de03ceecfc70c8ddfcebba3fd/exomiser-core/src/main/java/org/monarchinitiative/exomiser/core/model/frequency/FrequencyData.java#L114-L125

Environment Details

Exomiser version: 14.0.0
Java version: Java(TM) SE Runtime Environment (build 18.0.2.1+1-1)

Suggested Improvement

public Frequency maxFrequency() {
    return (this.size == 0 || maxSource == -1) ? null : frequency(maxSource);
}

pzweuj commented 3 months ago

Further investigation showed that the only frequency annotation for this variant was a THOUSAND_GENOMES population frequency of 0.0000; all other sources were empty. As a result, findMaxSource returned -1.
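
The failure mode described above can be modelled with a minimal sketch. This is not the Exomiser source: the class `MaxFrequencySketch`, its one-float-per-source layout, and the strict `>` comparison inside `findMaxSource` are assumptions made for illustration. They show how a lone 0.0 frequency could leave the max-source index at -1, and how the suggested guard avoids the out-of-bounds access.

```java
// Hypothetical, simplified model of the reported failure; NOT the actual FrequencyData class.
final class MaxFrequencySketch {

    // Assumption: one frequency value per annotated source.
    private final float[] frequencies;
    private final int maxSource;

    MaxFrequencySketch(float... frequencies) {
        this.frequencies = frequencies.clone();
        this.maxSource = findMaxSource(this.frequencies);
    }

    // Assumed logic: the running max starts at 0 and is only replaced by a strictly
    // greater value, so an entry of exactly 0.0 never wins and -1 is returned.
    static int findMaxSource(float[] frequencies) {
        float max = 0f;
        int index = -1;
        for (int i = 0; i < frequencies.length; i++) {
            if (frequencies[i] > max) {
                max = frequencies[i];
                index = i;
            }
        }
        return index;
    }

    // Unguarded lookup: throws ArrayIndexOutOfBoundsException when maxSource is -1.
    Float maxFrequencyUnguarded() {
        return frequencies[maxSource];
    }

    // Guarded lookup, mirroring the check proposed in this issue: null when no max exists.
    Float maxFrequencyGuarded() {
        if (frequencies.length == 0 || maxSource == -1) {
            return null;
        }
        return frequencies[maxSource];
    }

    public static void main(String[] args) {
        MaxFrequencySketch onlyZero = new MaxFrequencySketch(0.0f);
        System.out.println("guarded: " + onlyZero.maxFrequencyGuarded());
        try {
            onlyZero.maxFrequencyUnguarded();
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unguarded: " + e.getMessage());
        }
    }
}
```

In this sketch the guard returns null, as in the suggestion above. Whether 14.0.1 returns null or handles the case differently is not stated in this thread.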

YAML

analysis:
  analysisMode: PASS_ONLY
  frequencySources:
  - GNOMAD_E_AFR
  - GNOMAD_E_AMR
  - GNOMAD_E_EAS
  - GNOMAD_E_NFE
  - GNOMAD_E_SAS
  - GNOMAD_G_AFR
  - GNOMAD_G_AMR
  - GNOMAD_G_EAS
  - GNOMAD_G_NFE
  - GNOMAD_G_SAS
  - THOUSAND_GENOMES
  genomeAssembly: hg19

outputOptions:
  numGenes: 0
  outputContributingVariantsOnly: false
  outputFormats:
  - TSV_VARIANT

Test VCF

##fileformat=VCFv4.2
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Test
11  34916640    .   T   C   28.3    PASS    .   GT:AD:DP:GQ:PL:VAF  0/1:58,36:94:28:28,0,45:0.382979

julesjacobsen commented 2 months ago

@pzweuj thanks for the feedback and example case. This is now fixed in 14.0.1