Chromosome and position can now be stored as a long (rather than a ChrPosition object) when interrogating annotation files.
The motivation for this change is that some of the annotation sources are large (dbNSFP has ~84M records, gnomAD r3.1.2 has ~750M records). If the number of objects being created can be reduced, the garbage collector has less to do and applications should run faster.
To facilitate this change, an extra parameter called chrStartsWithChr has been added to the AnnotationSources json file. This field accepts a boolean value and indicates whether the contig names within the annotation source file start with the string "chr" (e.g. "chr1", "chr2", etc.).
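As a rough illustration of the idea, the sketch below packs a contig and position into a single long. It is not the code in this PR: the class name ChrPositionPacker, the contig-to-integer mapping (1-22, X=23, Y=24, M/MT=25) and the bit layout (contig in the upper 32 bits, position in the lower 32 bits) are all assumptions made for the example. It does show why a flag like chrStartsWithChr is useful, since the "chr" prefix has to be stripped before the contig can be turned into a number.

```java
// Hypothetical helper, for illustration only. The encoding used by the PR may differ.
final class ChrPositionPacker {

    private ChrPositionPacker() {}

    /** Maps a contig name to a small integer, honouring the chrStartsWithChr flag. */
    static int contigToInt(String contig, boolean chrStartsWithChr) {
        String name = contig;
        if (chrStartsWithChr) {
            if (!contig.startsWith("chr")) {
                throw new IllegalArgumentException("Expected 'chr' prefix: " + contig);
            }
            name = contig.substring(3); // drop the leading "chr"
        }
        switch (name) {
            case "X":
                return 23;
            case "Y":
                return 24;
            case "M":
            case "MT":
                return 25;
            default:
                // Throws NumberFormatException for contigs this sketch does not know about
                int n = Integer.parseInt(name);
                if (n < 1 || n > 22) {
                    throw new IllegalArgumentException("Unsupported contig: " + contig);
                }
                return n;
        }
    }

    /** Packs contig (upper 32 bits) and position (lower 32 bits) into one primitive long. */
    static long pack(String contig, int position, boolean chrStartsWithChr) {
        if (position < 0) {
            throw new IllegalArgumentException("Position must be non-negative: " + position);
        }
        return ((long) contigToInt(contig, chrStartsWithChr) << 32) | (position & 0xFFFFFFFFL);
    }

    /** Recovers the contig integer from a packed value. */
    static int contigOf(long packed) {
        return (int) (packed >>> 32);
    }

    /** Recovers the position from a packed value. */
    static int positionOf(long packed) {
        return (int) packed;
    }
}
```

With this layout, packed values compare in contig-then-position order, so they can be sorted or binary-searched as plain primitives without allocating an object per record.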
How Has This Been Tested?
Existing tests pass. The change has also been run and its output compared against nanno output from regression tests.
Are WDL Updates Required?
Yes. The imports/annotate/nannoCreateJson.wdl file will need to be updated, and the chrStartsWithChr attribute will need to be added for each NannoInput.
Checklist:
[X] My code follows the style guidelines of this project
[X] I have performed a self-review of my own code
[X] I have commented my code, particularly in hard-to-understand areas
[X] I have made corresponding changes to the documentation
[X] My changes generate no new warnings
[X] I have added tests that prove my fix is effective or that my feature works
[X] New and existing unit tests pass locally with my changes