Hi,
I wonder if there is an option to create worker machines without external IP addresses? I'm trying to run a large number of pipelines in GCP and I'm stuck at the external IP address quota.
Regards.
Unfortunately each worker needs an external IP to communicate back with the main runner. What I suggest is: for the `make_examples` stage, use 2 workers, each with 16 cores and 4*16=64GB of memory. Using larger workers costs you more than using many smaller workers, which is the more cost-optimized way to run DeepVariant. Unfortunately there is no perfect solution; you have to compromise on either cost or time.
Please let me know if you need help with setting the input arguments to optimize the cost based on the size of the BAM file and the type of analysis.
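For example, just the `make_examples` sizing flags under that suggestion would look like this (a sketch assuming the 4GB-per-core rule of thumb used elsewhere in this thread; all other flags stay as they are):

```bash
# Two large make_examples workers instead of many small ones:
# only 2 external IPs are consumed for this stage, and 4GB per core
# gives 4*16=64GB of RAM per worker.
--make_examples_workers 2 \
--make_examples_cores_per_worker 16 \
--make_examples_ram_per_worker_gb 64 \
```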
Thank you for the recommendations! I’ll try to run with larger worker machines.
Sure, I’d appreciate any suggestions on the run configuration. I’m working on a cannabis variants project with a Googler @allenday, and I think the goal is to optimize for smaller overall running time. We have 16,000 BAM files with sizes ranging from 60MB to 17GB, and reference FASTA files from 300MB to 1.2GB. We need to produce VCF files. From the experience of running a couple of pipelines, we selected `make_examples` worker machines with 60GB RAM and 10 CPUs, as VMs were failing with an "out of memory" error when using less RAM.
All arguments to the runner:
cmd: |
./opt/deepvariant_runner/bin/gcp_deepvariant_runner \
--project "${PROJECT_ID}" \
--zones "${ZONES}" \
--docker_image "${DOCKER_IMAGE}" \
--docker_image_gpu "${DOCKER_IMAGE_GPU}" \
--gpu \
--outfile "${OUTPUT_BUCKET}"/"${OUTPUT_FILE_NAME}" \
--staging "${OUTPUT_BUCKET}"/"${STAGING_FOLDER_NAME}" \
--model "${MODEL}" \
--ref "${INPUT_REF}" \
--bam "${INPUT_BAM}" \
--shards 512 \
--make_examples_workers 16 \
--make_examples_cores_per_worker 10 \
--make_examples_ram_per_worker_gb 60 \
--make_examples_disk_per_worker_gb 200 \
--call_variants_workers 16 \
--call_variants_cores_per_worker 8 \
--call_variants_ram_per_worker_gb 30 \
--call_variants_disk_per_worker_gb 50
With the following model and images:
MODEL=gs://deepvariant/models/DeepVariant/0.6.0/DeepVariant-inception_v3-0.6.0+cl-191676894.data-wgs_standard
IMAGE_VERSION=0.6.1
DOCKER_IMAGE=gcr.io/deepvariant-docker/deepvariant:"${IMAGE_VERSION}"
DOCKER_IMAGE_GPU=gcr.io/deepvariant-docker/deepvariant_gpu:"${IMAGE_VERSION}"
Here are a couple of small changes that will definitely make your run more efficient (see the sketch after this list for how they fit together):

- Set `--shards` to be equal to `make_examples_workers` times `make_examples_cores_per_worker`, basically one shard per core.
- Given the wide range of your BAM file sizes, split them into three buckets by size and use `--make_examples_workers 1` for all 3 groups (to save on external IPs) and `--make_examples_cores_per_worker` 4, 8, and 16 respectively for the three buckets.
- We normally recommend 4GB of RAM per core for the `make_examples` and `call_variants` steps. However, it seems for your case this was not enough and you ended up with 6GB per core.
- For the `call_variants` step you are wasting way too many resources. What we recommend in our automatic flag values (pending PR #11) for BAM files up to 200GB is 2 workers equipped with GPUs. Here I recommend 1 worker with a GPU for all your BAM sizes.
- For `call_variants` you don't need many cores because the GPU will be doing all the heavy lifting. What we recommend is to use workers with 4 cores and 4*4=16GB of RAM, equipped with a GPU, for this stage.

I just want to mention that all my experience with optimizing these flags is for human sample BAM files; I am not really sure what the density of variants in cannabis is, so you might want to apply some fine-tuning on top of what I suggested.
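To make that concrete, here is a rough sketch of a re-sized invocation (the size-bucket thresholds and the `gsutil du` lookup are illustrative assumptions on my part, not something the runner does for you; the flag names are the same ones from your invocation above):

```bash
#!/bin/bash
# Sketch only: the original invocation, re-sized per the suggestions
# above. The bucket thresholds below are illustrative assumptions.

# Look up the BAM size in GB (gsutil du prints the size in bytes).
BAM_SIZE_GB=$(( $(gsutil du "${INPUT_BAM}" | awk '{print $1}') / 1024**3 ))

# One make_examples worker per run (saves external IPs); pick cores
# per worker by size bucket: 4, 8, or 16.
if [[ "${BAM_SIZE_GB}" -lt 1 ]]; then CORES=4
elif [[ "${BAM_SIZE_GB}" -lt 8 ]]; then CORES=8
else CORES=16
fi

RAM_GB=$(( 4 * CORES ))  # 4GB per core; raise toward 6GB/core if OOM recurs
SHARDS="${CORES}"        # one shard per core, with a single worker

./opt/deepvariant_runner/bin/gcp_deepvariant_runner \
  --project "${PROJECT_ID}" \
  --zones "${ZONES}" \
  --docker_image "${DOCKER_IMAGE}" \
  --docker_image_gpu "${DOCKER_IMAGE_GPU}" \
  --gpu \
  --outfile "${OUTPUT_BUCKET}"/"${OUTPUT_FILE_NAME}" \
  --staging "${OUTPUT_BUCKET}"/"${STAGING_FOLDER_NAME}" \
  --model "${MODEL}" \
  --ref "${INPUT_REF}" \
  --bam "${INPUT_BAM}" \
  --shards "${SHARDS}" \
  --make_examples_workers 1 \
  --make_examples_cores_per_worker "${CORES}" \
  --make_examples_ram_per_worker_gb "${RAM_GB}" \
  --make_examples_disk_per_worker_gb 200 \
  --call_variants_workers 1 \
  --call_variants_cores_per_worker 4 \
  --call_variants_ram_per_worker_gb 16 \
  --call_variants_disk_per_worker_gb 50
```

With one worker per stage, each run consumes at most two external IPs (one for the `make_examples` worker and one for the GPU `call_variants` worker), which should also help with the quota issue that started this thread.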
Please let me know if there is anything else I can help with.
Thank you very much for the recommendations and the explanation of the logic behind them! I'll try to run a new setup this week.