mskcc / pluto-cwl

CWL workflows for helix filter scripts

High temp storage usage with samples_fillout_index_batch_workflow.cwl #106

Open stevekm opened 2 years ago

stevekm commented 2 years ago

This workflow, and by extension its subworkflows, is showing very high temporary storage usage:

https://github.com/mskcc/pluto-cwl/blob/master/cwl/samples_fillout_index_batch_workflow.cwl

My guess is that the .bam indexing step is the cause, because I believe that step copies all input .bam files to the temp staging dir.

When running the Miller project with 288 samples, the work dir took up 1TB of space before any jobs started, then dropped to about 20GB once jobs started running.

We need to keep an eye on this and see whether it can be fixed, because it will eventually break large runs where we don't have 1TB of free tmp space to rely on.

stevekm commented 1 year ago

Consider updating pluto/run-toil.sh to include a background du process that tracks disk usage while the workflow is running.
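A rough sketch of what that background tracker could look like in bash. Nothing here is taken from the actual pluto/run-toil.sh: the function name `track_disk_usage`, the variables `WORK_DIR`, `DU_LOG` and `DU_PID`, and the 60 second default interval are all made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper, not part of pluto/run-toil.sh.
# Appends "<unix timestamp><TAB><size in KiB>" to a log file at a fixed
# interval until the process is killed.
track_disk_usage () {
    local dir="$1"             # directory to measure (e.g. the work dir)
    local log="$2"             # file to append measurements to
    local interval="${3:-60}"  # seconds between measurements
    local size_kb
    while true; do
        # du -sk prints the total size in 1024-byte blocks, then a tab, then
        # the path. Errors are discarded because temp files may disappear
        # while du is walking the tree.
        size_kb="$(du -sk "$dir" 2>/dev/null | cut -f1)"
        printf '%s\t%s\n' "$(date +%s)" "${size_kb:-NA}" >> "$log"
        sleep "$interval"
    done
}

# Possible usage inside a run script (WORK_DIR and DU_LOG are assumed names):
#   track_disk_usage "$WORK_DIR" "$DU_LOG" 60 &
#   DU_PID=$!
#   trap 'kill "$DU_PID" 2>/dev/null' EXIT
#   ... start the workflow here ...
```

Starting the tracker before the workflow is launched should capture the spike that happens before any jobs start, assuming the measured directory is the one where the input files get staged. The `trap` stops the tracker when the run script exits.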