wustl-oncology / analysis-wdls

Scalable genomic analysis pipelines, written in WDL
MIT License

optimize CPU/RAM requests #33

Open chrisamiller opened 2 years ago

chrisamiller commented 2 years ago

Several related issues here:

1) If you create a custom N1 VM, it can have at most 6.5 GB of memory per vCPU. So, for example, when we select 32 GB for the mutect step, this forces us to request 6 vCPUs (32 / 6.5 ≈ 4.9, and custom machine types must use 1 or an even number of vCPUs, so we get bumped up to 6). A list of places where we've been bumped up to more cores is here: /storage1/fs1/mgriffit/Active/griffithlab/pipeline_test/gcp_wdl_test/saved_results/final_results_v1_fusions_ens95/workflow_artifacts/extra_cpu_requests.txt

2) WGS runs require more resources than exomes in many cases, but our memory/CPU values are set high enough that the largest data sets will run. Either a) provide a parameter that allows specifying WGS or Exome at the top level, or b) use the BAM size directly to estimate memory usage in some of these steps (rough sketch below).
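
For option (b), something along these lines might work. This is only a minimal sketch, not code from the pipeline: the task name, scaling factor, and memory floor are made up for illustration. It derives the memory request from the input BAM size with `size()`, then picks a vCPU count compatible with the 6.5 GB-per-vCPU limit on custom N1 machines.

```wdl
version 1.0

task mutect_sized {
  input {
    File bam
    Int gb_per_bam_gb = 2   # hypothetical scaling factor
    Int min_mem_gb = 8      # hypothetical floor for small exome BAMs
  }

  # Estimate memory from the input BAM size rather than hard-coding the WGS worst case.
  Int mem_gb = ceil(size(bam, "GB")) * gb_per_bam_gb + min_mem_gb

  # Custom N1 machines allow at most 6.5 GB per vCPU, and the vCPU count must be
  # 1 or an even number, so round the implied count up to the next valid value.
  Int cpu_needed = ceil(mem_gb / 6.5)
  Int cpu_count = if cpu_needed <= 1 then 1 else (if cpu_needed % 2 == 0 then cpu_needed else cpu_needed + 1)

  command <<<
    echo "would request ~{mem_gb} GB and ~{cpu_count} vCPUs here"
  >>>

  runtime {
    memory: mem_gb + " GB"
    cpu: cpu_count
    docker: "ubuntu:20.04"
  }
}
```

This would keep exome runs on small machines while still letting WGS inputs scale up, and it makes the extra-core bumps explicit in the task rather than something the backend does silently.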

malachig commented 2 years ago

To find these messages about CPU requirements being adjusted in the cromwell log:

journalctl -u cromwell | grep -i "GCE"

malachig commented 2 years ago

GCP Docs on custom instances: https://cloud.google.com/compute/docs/instances/creating-instance-with-custom-machine-type