wustl-oncology / analysis-wdls

Scalable genomic analysis pipelines, written in WDL
MIT License

Insufficient storage to complete VEP step for germline variant annotation #162

Closed (malachig closed 1 month ago)

malachig commented 1 month ago

We have encountered a few instances where the amount of disk allocated for the VEP annotation of germline variants is insufficient. Sometimes an out-of-disk error is written to the log. Other times the task simply fails and the VCF output is only partially complete.

The current calculations don't seem very conservative: https://github.com/wustl-oncology/analysis-wdls/blob/d3156afe5b7bb9a722d2c29289b283182c2f26cd/definitions/tools/vep.wdl#L29-L32

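For example, compression can reduce a file's size by more than half, and the VEP annotations being added can be huge, so an estimate scaled from the compressed input size can fall well short of what the step actually writes.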
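
To make the arithmetic concrete, here is a rough sketch of what a more conservative estimate might look like. It is written in Python only for illustration, not as the WDL that would go in `vep.wdl`, and every name and multiplier in it (`compression_ratio`, `annotation_growth`, `unzip_ratio`, `base_margin_gb`) is an assumption that would need to be checked against real runs. It also assumes the uncompressed VCF, the annotated output, and the unpacked cache all sit on the task disk at the same time.

```python
import math


def estimate_vep_disk_gb(vcf_gb, cache_zip_gb, reference_gb,
                         compression_ratio=5.0, annotation_growth=10.0,
                         unzip_ratio=3.0, base_margin_gb=10):
    """Hypothetical conservative disk estimate (in GB) for a VEP task.

    All default multipliers are illustrative guesses, not measured values.
    """
    # Size of the input VCF once decompressed (assumed ratio).
    vcf_uncompressed = vcf_gb * compression_ratio
    # Annotated output, assumed to be a multiple of the uncompressed input.
    annotated = vcf_uncompressed * annotation_growth
    # The zipped cache plus its unpacked contents (assumed unzip ratio).
    cache = cache_zip_gb * (1 + unzip_ratio)
    total = (vcf_gb + vcf_uncompressed + annotated
             + cache + reference_gb + base_margin_gb)
    # Round up so the request is never below the estimate.
    return math.ceil(total)


if __name__ == "__main__":
    # Example: 1 GB compressed VCF, 15 GB cache zip, 3 GB reference files.
    print(estimate_vep_disk_gb(1.0, 15.0, 3.0))
```

With these assumed multipliers the annotated output dominates the total for large germline VCFs, which is the case that seems to be failing. In WDL the same idea would presumably be expressed with `size()` and `ceil()` in the task's disk calculation.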