#!/bin/bash

# set -euo pipefail <- it fails if I use this!

# Set common settings.
PROJECT_ID=ms-deepvariant
OUTPUT_BUCKET=gs://ms_bam/recover
STAGING_FOLDER_NAME=recover_tmp
OUTPUT_FILE_NAME=recover.gvcf

# Model for calling whole genome sequencing data.
MODEL=gs://deepvariant/models/DeepVariant/0.8.0/DeepVariant-inception_v3-0.8.0+data-wgs_standard
IMAGE_VERSION=0.8.0
DOCKER_IMAGE=gcr.io/deepvariant-docker/deepvariant:"${IMAGE_VERSION}"
COMMAND="/opt/deepvariant_runner/bin/gcp_deepvariant_runner \
  --project ${PROJECT_ID} \
  --zones europe-west1-* \
  --docker_image ${DOCKER_IMAGE} \
  --outfile ${OUTPUT_BUCKET}/${OUTPUT_FILE_NAME} \
  --gvcf_outfile ${OUTPUT_BUCKET}/${OUTPUT_FILE_NAME} \
  --staging ${OUTPUT_BUCKET}/${STAGING_FOLDER_NAME} \
  --model ${MODEL} \
  --bam gs://ms_bam/NoDup_FB4.bam \
  --bai gs://ms_bam/NoDup_FB4.bam.bai \
  --ref gs://ms_bam/Homo_sapiens_assembly38.fasta \
  --shards 512 \
  --make_examples_workers 32 \
  --make_examples_cores_per_worker 16 \
  --make_examples_ram_per_worker_gb 60 \
  --make_examples_disk_per_worker_gb 200 \
  --call_variants_workers 32 \
  --call_variants_cores_per_worker 32 \
  --call_variants_ram_per_worker_gb 60 \
  --call_variants_disk_per_worker_gb 50 \
  --postprocess_variants_disk_gb 200 \
  --gcsfuse"

# Run the pipeline.
gcloud alpha genomics pipelines run \
  --project "${PROJECT_ID}" \
  --service-account-scopes="https://www.googleapis.com/auth/cloud-platform" \
  --logging "${OUTPUT_BUCKET}/${STAGING_FOLDER_NAME}/runnerlogs$(date +%Y%m%d_%H%M%S).log" \
  --regions europe-west1 \
  --docker-image gcr.io/cloud-genomics-pipelines/gcp-deepvariant-runner \
  --command-line "${COMMAND}"

And I get the following error:

07:03:22 Stopped running "-c timeout=10; elapsed=0; seq \"${SHARD_START_INDEX}\" \"${SHARD_END_INDEX}\" | parallel --halt 2 \"mkdir -p ./input-gcsfused-{} && gcsfuse --implicit-dirs \"${GCS_BUCKET}\" /input-gcsfused-{}\" && seq \"${SHARD_START_INDEX}\" \"${SHARD_END_INDEX}\" | parallel --halt 2 \"until mountpoint -q /input-gcsfused-{}; do test \"${elapsed}\" -lt \"${timeout}\" || fail \"Time out waiting for gcsfuse mount points\"; sleep 1; elapsed=$((elapsed+1)); done\" && seq \"${SHARD_START_INDEX}\" \"${SHARD_END_INDEX}\" | parallel --halt 2 \"/opt/deepvariant/bin/make_examples --mode calling --examples \"${EXAMPLES}\"/examples_output.tfrecord@\"${SHARDS}\".gz --reads \"/input-gcsfused-{}/${BAM}\" --ref \"${INPUT_REF}\" --task {} --gvcf \"${GVCF}\"/gvcf_output.tfrecord@\"${SHARDS}\".gz\"": exit status 127: bash: gcsfuse: command not found.

Is it possible to identify the problem/typo?

samanvp commented 5 years ago

This is an issue with DeepVariantRunner. We will be releasing a new Docker image later this week that will resolve it.
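For readers hitting the same message: exit status 127 is the status bash reports when a command cannot be found on `PATH`, which matches the `gcsfuse: command not found` text in the log. A minimal sketch of checking for a tool up front is below; `check_tool` is a hypothetical helper name, not part of DeepVariant or the runner.

```shell
#!/bin/bash
# check_tool NAME (hypothetical helper): returns 0 if NAME resolves on PATH,
# otherwise prints a message and returns 127, the same status the shell
# uses for "command not found".
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: command not found" >&2
    return 127
  fi
}

# Example: report whether gcsfuse is available in the current environment.
check_tool gcsfuse || echo "exit status: $?"
```

The same check could be run inside a container, for example `docker run --rm --entrypoint sh IMAGE -c 'command -v gcsfuse'`, assuming Docker is available locally and the image ships `sh`; `IMAGE` is a placeholder for whichever image is being inspected.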

gunjanbaid commented 5 years ago

Thanks @samanvp!

@HagenC in the meantime, you can also run DeepVariant v0.8.0 using the Docker image or prebuilt binaries. Here are links to case studies that show how you can run using Docker or binaries. Note: we recommend running the binaries on an Ubuntu 16.04 machine.

samanvp commented 5 years ago

We just released a new Docker image located at gcr.io/cloud-lifesciences/gcp-deepvariant-runner. Please let us know if you still observe the gcsfuse issue using the latest release.
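As a rough sketch of what trying the new image might look like, the snippet below prints the `gcloud` invocation from the original post with only `--docker-image` changed to the location given above. It assumes the remaining flags still apply unchanged to the new image, which this thread does not confirm. `print_pipeline_cmd` and `RUNNER_IMAGE` are hypothetical names introduced for illustration, and the snippet only prints the command rather than running it.

```shell
#!/bin/bash
# Image location taken from the comment above.
RUNNER_IMAGE="gcr.io/cloud-lifesciences/gcp-deepvariant-runner"

# Values from the original script; override via the environment if needed.
PROJECT_ID="${PROJECT_ID:-ms-deepvariant}"
OUTPUT_BUCKET="${OUTPUT_BUCKET:-gs://ms_bam/recover}"
STAGING_FOLDER_NAME="${STAGING_FOLDER_NAME:-recover_tmp}"
# COMMAND is the gcp_deepvariant_runner command string built in the original script.
COMMAND="${COMMAND:-}"

# print_pipeline_cmd (hypothetical helper): prints the gcloud command one
# word per line so it can be reviewed before being run by hand.
print_pipeline_cmd() {
  printf '%s\n' \
    gcloud alpha genomics pipelines run \
    --project "${PROJECT_ID}" \
    --service-account-scopes="https://www.googleapis.com/auth/cloud-platform" \
    --logging "${OUTPUT_BUCKET}/${STAGING_FOLDER_NAME}/runnerlogs$(date +%Y%m%d_%H%M%S).log" \
    --regions europe-west1 \
    --docker-image "${RUNNER_IMAGE}" \
    --command-line "${COMMAND}"
}

print_pipeline_cmd
```

Printing first keeps the change reviewable: the only difference from the original invocation should be the value after `--docker-image`.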

google / deepvariant

exit status 127: bash: gcsfuse: command not found #214
