gks-anvil / vrs_anvil_toolkit

Extract clinical variant interpretations from VCF using GA4GH VRS IDs
MIT License

feature: use diskcache #8

Closed: bwalsh closed this 8 months ago

bwalsh commented 8 months ago

Use cases

Potential solutions

https://github.com/grantjenks/python-diskcache

quinnwai commented 8 months ago

Testing locally with 1000G patients

Anecdotally, with ~100k variants, running pytest test_vcf_to_gnomad.py for the same sample (HG00096) takes 9.5s the first time and 1s the second time, once the persistent cache is populated. Running the same command for a new sample (HG00099) also shows a nontrivial speedup: 4.5s the first time and 1.3s the second time.
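For anyone following along, here is a minimal sketch of the persistent-cache pattern behind these timings. It is not the toolkit's actual code: `slow_lookup`, `cached_lookup`, `run_sample`, the variant strings, and the cache directory name are all hypothetical. It uses `diskcache.Cache` through its dict-like interface and falls back to the standard library's `shelve` only so the snippet still runs when diskcache is not installed. One plausible reading of the HG00099 speedup is that variants shared between samples are already on disk; that is an assumption, not something measured in this thread.

```python
import os
import tempfile

try:
    from diskcache import Cache  # third-party: pip install diskcache

    def open_cache(directory):
        # diskcache creates the directory if it does not exist
        return Cache(directory)
except ImportError:
    import shelve  # stdlib stand-in so the sketch runs without diskcache

    def open_cache(directory):
        os.makedirs(directory, exist_ok=True)
        return shelve.open(os.path.join(directory, "cache"))

calls = []  # records every variant that missed the cache


def slow_lookup(variant):
    """Hypothetical stand-in for an expensive per-variant computation."""
    calls.append(variant)
    return "id-for-" + variant


def cached_lookup(cache, variant):
    """Return the stored result, computing and storing it on a miss."""
    if variant in cache:
        return cache[variant]
    result = slow_lookup(variant)
    cache[variant] = result
    return result


def run_sample(cache_dir, variants):
    """Process one sample, reopening the on-disk cache for each run."""
    cache = open_cache(cache_dir)
    try:
        return [cached_lookup(cache, v) for v in variants]
    finally:
        cache.close()


cache_dir = os.path.join(tempfile.mkdtemp(), "vrs_cache")
sample_a = ["chr1-100-A-T", "chr1-200-G-C", "chr2-300-T-A"]
sample_b = ["chr1-100-A-T", "chr2-300-T-A", "chr3-400-C-G"]  # shares two variants

run_sample(cache_dir, sample_a)
misses_first = len(calls)  # cold cache: every variant is computed
run_sample(cache_dir, sample_a)
misses_second = len(calls) - misses_first  # warm cache: nothing recomputed
run_sample(cache_dir, sample_b)
misses_new = len(calls) - misses_first - misses_second  # only unseen variants
```

Because the cache is closed and reopened between runs, the hits in the second and third runs come from disk rather than from memory, which is the same situation as rerunning pytest in a fresh process.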