wustl-oncology / analysis-wdls

Scalable genomic analysis pipelines, written in WDL

Out of disk error during gather bam step #160

Closed: malachig closed this issue 1 month ago

malachig commented 1 month ago

During a recent run we encountered the following errors:

From the Cromwell log:

Sep 28 16:19:38 obi-immuno-pici-4 java[14038]: 2024-09-28 16:19:38,975 cromwell-system-akka.dispatchers.engine-dispatcher-8650 INFO  - WorkflowManagerActor: Workflow 42404258-3975-4423-b8ed-187fbcb2b5d1 failed (during ExecutingWorkflowState): The return code file for job doBqsr.GatherBamFiles:NA:2 was empty.
Sep 28 16:19:38 obi-immuno-pici-4 java[14038]: Check the content of stderr for potential additional information: gs://griffith-lab-test-obi/cromwell-executions/immuno/42404258-3975-4423-b8ed-187fbcb2b5d1/call-somaticExome/somaticExome/e8a5c862-fae4-43aa-b931-7046c38f42db/call-tumorAlignment/sequenceToBqsr/7ef29f30-2529-4408-ac76-77fe92c27a20/call-doBqsr/doBqsr/37408035-5783-4ad3-9c8f-839507829aaa/call-GatherBamFiles/attempt-2/stderr.

From that stderr:

To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
htsjdk.samtools.util.RuntimeIOException: java.io.IOException: No space left on device
...
cp: error writing 'tumor.bam.bai': No space left on device

The failure appears to occur in the GatherBamFiles task of bqsr.wdl: https://github.com/wustl-oncology/analysis-wdls/blob/main/definitions/tools/bqsr.wdl

Disk space for that task is requested here: https://github.com/wustl-oncology/analysis-wdls/blob/762160194d83b506d6d7fb1a57ce34c3aad50c8c/definitions/tools/bqsr.wdl#L257

malachig commented 1 month ago

Attempting a fix: increasing the disk size multiplier from 2.5 to 3.5.
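
For anyone following along, here is a rough sketch of how a multiplier-based disk request scales with input size. It is only an illustration, not the actual expression in bqsr.wdl: the function name `requested_disk_gb`, the fixed 10 GB overhead, and the 20 GB input size are all assumptions made up for this example. Only the multipliers 2.5 and 3.5 come from this issue.

```python
import math


def requested_disk_gb(input_gb: float, multiplier: float, base_gb: int = 10) -> int:
    """Hypothetical sizing rule: a fixed overhead plus a multiple of the input size.

    base_gb is an assumed overhead, not a value taken from bqsr.wdl.
    """
    return base_gb + math.ceil(input_gb * multiplier)


# Illustrative input size only; the real BAM sizes from the failed run are not known here.
input_gb = 20.0

before = requested_disk_gb(input_gb, 2.5)  # old multiplier from this issue
after = requested_disk_gb(input_gb, 3.5)   # new multiplier from this issue

print(f"old request: {before} GB, new request: {after} GB")
```

Under a rule of this shape, raising the multiplier by 1.0 adds roughly one extra copy of the input to the requested disk, which would cover a task that holds the input shards, the gathered BAM, and an additional copy of the output at the same time.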

malachig commented 1 month ago

In this test, the VM appears to request 90 GB of disk space in total.

malachig commented 1 month ago

This increase seems to have worked.