umccr / umccrise

:snake: DRAGEN Tumor/Normal workflow post-processing
https://umccr.github.io/umccrise/
MIT License

Compress intermediate files #90

Closed by pdiakumis 1 year ago

pdiakumis commented 2 years ago

Need to bgzip + tabix a few of the intermediate VCFs to save space (noticed a >20x size difference in some cases).
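
A minimal sketch of how the bgzip + tabix step could be scripted, assuming the `bgzip` and `tabix` command-line tools from htslib are on `PATH` and the VCFs are already coordinate-sorted (tabix requires this). The function names `bgzip_tabix_commands` and `compress_vcfs` are hypothetical and are not part of umccrise; the default is a dry run that only returns the commands it would execute.

```python
import subprocess
from pathlib import Path


def bgzip_tabix_commands(vcf):
    """Return the two commands that compress and index one plain-text VCF."""
    vcf = Path(vcf)
    gz = vcf.with_name(vcf.name + ".gz")
    return [
        ["bgzip", "-f", str(vcf)],              # replaces x.vcf with x.vcf.gz
        ["tabix", "-f", "-p", "vcf", str(gz)],  # writes x.vcf.gz.tbi next to it
    ]


def compress_vcfs(work_dir, dry_run=True):
    """Collect bgzip + tabix commands for every uncompressed *.vcf under work_dir.

    With dry_run=True nothing is executed; the commands are only returned.
    Files already ending in .vcf.gz are not matched by the *.vcf pattern.
    """
    commands = []
    for vcf in sorted(Path(work_dir).rglob("*.vcf")):
        commands.extend(bgzip_tabix_commands(vcf))
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # stop at the first failing command
    return commands
```

Calling `compress_vcfs("work/<sbj>")` with the real subject directory substituted would list the planned commands; passing `dry_run=False` would run them.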

Also need to compress/remove the FASTQs under work/<sbj>/oncoviruses/work:

total 12G
drwxr-sr-x 3 pd8253 gx8 4.0K May 31 19:19 detect_viral_reference
-rw-r--r-- 1 pd8253 gx8 2.5G May 31 19:19 step1_host_unmapped_or_mate_unmapped.namesorted.bam
-rw-r--r-- 1 pd8253 gx8 4.4G May 31 19:20 step2_host_unmapped_or_mate_unmapped.R1.fq
-rw-r--r-- 1 pd8253 gx8 4.4G May 31 19:20 step2_host_unmapped_or_mate_unmapped.R2.fq
-rw-r--r-- 1 pd8253 gx8    0 May 31 19:18 step2_host_unmapped_or_mate_unmapped.single.fq
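
A minimal sketch of the FASTQ cleanup, using only the Python standard library. The function name `gzip_fastqs` and the `drop_empty` option are hypothetical; deleting zero-byte files such as the `.single.fq` above is an assumption about what is safe to discard, not something umccrise does. The standard `gzip` module is single-threaded, so for multi-gigabyte files an external multi-threaded compressor would likely be faster.

```python
import gzip
import shutil
from pathlib import Path


def gzip_fastqs(fastq_dir, pattern="*.fq", drop_empty=True):
    """Gzip every file matching `pattern` in `fastq_dir`, then delete the original.

    Zero-byte files are deleted without compressing when drop_empty is True.
    Returns the list of .gz paths that were written.
    """
    written = []
    for fq in sorted(Path(fastq_dir).glob(pattern)):
        if drop_empty and fq.stat().st_size == 0:
            fq.unlink()
            continue
        gz = fq.with_name(fq.name + ".gz")
        # Stream in chunks so large FASTQs are never loaded into memory at once.
        with open(fq, "rb") as src, gzip.open(gz, "wb") as dst:
            shutil.copyfileobj(src, dst)
        fq.unlink()  # only reached if compression did not raise
        written.append(gz)
    return written
```

Pointing `gzip_fastqs` at the `oncoviruses/work` directory would leave `.fq.gz` files in place of the `.fq` files listed above.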

pdiakumis commented 1 year ago

We might go back and remove all work folders if it cuts down costs considerably. Closing.