AdamaJava / adamajava

refactor(qannotate nanno): use long to store chr and position #348

Closed holmeso closed 6 months ago

holmeso commented 6 months ago

Description

Chromosome and position can be stored as a single long (rather than a ChrPosition object) when interrogating annotation files. The motivation is that some of the annotation sources are large (dbNSFP has ~84M records, gnomAD r3.1.2 ~750M records); if fewer objects are created, the garbage collector has less to do and the application should run faster.
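As a rough illustration of the idea (not the code in this PR), a contig index and a position can be packed into one primitive long. The class name `ChrPosLong`, its method names, and the 32/32 bit layout are assumptions made for this sketch; the actual encoding used by qannotate may differ.

```java
// Hypothetical helper, not the qannotate implementation: packs a contig
// index and a position into a single primitive long so that no per-record
// object needs to be allocated.
final class ChrPosLong {
    private ChrPosLong() {}

    // Assumed layout: upper 32 bits hold the contig index,
    // lower 32 bits hold the position.
    static long pack(int contigIndex, int position) {
        return ((long) contigIndex << 32) | (position & 0xFFFFFFFFL);
    }

    // Recover the contig index from the upper 32 bits.
    static int contigIndex(long packed) {
        return (int) (packed >>> 32);
    }

    // Recover the position from the lower 32 bits.
    static int position(long packed) {
        return (int) packed;
    }
}
```

With non-negative contig indices and positions, packed values compare in contig-then-position order, so they can be used directly as keys in primitive-keyed collections or sorted arrays.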

To facilitate this change, an extra parameter called chrStartsWithChr has been added to the AnnotationSources JSON file. This field accepts a boolean value and indicates whether the contigs within the annotation source file start with the string "chr" (e.g. "chr1", "chr2", etc.).
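One plausible use of such a flag, sketched here under assumptions, is to strip the "chr" prefix before converting a contig name into the numeric part of the long. The class `ContigNames` and the method `stripChrPrefix` are hypothetical names for illustration only and are not taken from the PR.

```java
// Hypothetical sketch: normalise contig names using a flag that plays the
// role of chrStartsWithChr. Not the actual qannotate code.
final class ContigNames {
    private ContigNames() {}

    // If the annotation source declares that its contigs start with "chr",
    // drop that prefix so "chr1" and "1" refer to the same contig.
    static String stripChrPrefix(String contig, boolean chrStartsWithChr) {
        if (chrStartsWithChr && contig.startsWith("chr")) {
            return contig.substring(3);
        }
        return contig;
    }
}
```

Knowing the convention up front from the configuration means the prefix does not have to be guessed for each record while reading a large annotation file.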

How Has This Been Tested?

Existing tests pass. The change has also been run and its output compared against nanno output from the regression tests.

Are WDL Updates Required?

Yes. The imports/annotate/nannoCreateJson.wdl file will need to be updated, and the chrStartsWithChr attribute will need to be added for each NannoInput.

Checklist: