A couple of changes to support qsignature's foray into the world of different (larger) positions files.
First, some unnecessary uses of the TabTokenizer class have been replaced with a less CPU-intensive process (according to jvisualise).
Second, a new option (maxCacheSize) has been introduced for use by the Compare class.
This option allows the user to specify the number of qsignature vcf files that should be stored in cache for the comparison.
Traditionally, all files would be stored in cache and then the comparisons performed, but when positions files are large, and therefore qsignature vcf files are also large, this method has the potential to cause OutOfMemory errors.
The downside to limiting the cache size is that it means that some files will be loaded more than once, pushing out the runtime as well as increasing the I/O impact.
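To illustrate the trade-off, here is a minimal sketch of a size-bounded cache that evicts the least recently used file first. It is not the code in the Compare class: the class name `BoundedFileCache`, the loader function, the LRU eviction policy, and the convention that a value of 0 or less means "cache everything" are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch only; not the qsignature Compare implementation. */
final class BoundedFileCache<V> {
    private final Function<String, V> loader;     // reads a file from disk (assumed)
    private final LinkedHashMap<String, V> cache;
    private int loadCount = 0;                    // how many times the loader ran

    BoundedFileCache(int maxCacheSize, Function<String, V> loader) {
        final int limit = maxCacheSize;           // <= 0 means unbounded (assumed)
        this.loader = loader;
        // accessOrder = true keeps entries ordered from least to most recently used
        this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                // Called after each put; evict once the limit is exceeded
                return limit > 0 && size() > limit;
            }
        };
    }

    V get(String file) {
        V value = cache.get(file);
        if (value == null) {
            // Cache miss: this is the repeated load that costs runtime and I/O
            value = loader.apply(file);
            loadCount++;
            cache.put(file, value);
        }
        return value;
    }

    int loadCount() { return loadCount; }

    int size() { return cache.size(); }
}
```

With a limit of 2, requesting files a, b, c and then a again triggers four loads, because a is evicted when c arrives. With no limit the same sequence triggers three loads but keeps all three files in memory.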
Type of change
[X] New feature (non-breaking change which adds functionality)
[X] This change requires a documentation update
How Has This Been Tested?
Unit tests have been extended.
Updated code has been run and compared against existing code with identical results.
Are WDL Updates Required?
No WDL updates are required. By default, the new option leaves the existing behaviour unchanged.
Checklist:
[X] My code follows the style guidelines of this project
[X] I have performed a self-review of my own code
[X] I have commented my code, particularly in hard-to-understand areas
[ ] I have made corresponding changes to the documentation
[X] My changes generate no new warnings
[X] I have added tests that prove my fix is effective or that my feature works
[X] New and existing unit tests pass locally with my changes