googlegenomics / pipelines-tools

Tools for developing and running pipelines with the Genomics API
Apache License 2.0

When receiving error 400 from the API, `pipelines` does not appear to print out the request body #97

Open samanvp opened 5 years ago

samanvp commented 5 years ago

While using the DeepVariant runner, if I set `-make_examples_worker=256`, some of the workers start while others fail with the following error message:

"run": starting pipeline: googleapi: got HTTP response code 400 with body: . Job args: ['pipelines', '--project', 'project-name', 'run', '--attempts', '2', '--pvm-attempts', '0', '--boot-disk-size', '50', '--output-interval', '60s', '--zones', 'us-central1-a', '--set', 'SHARDS=256', '--set', 'SHARD_START_INDEX=249', '--set', 'SHARD_END_INDEX=249', '--set', 'GCS_BUCKET=bucket-name', '--set', 'BAM=staging-convert_1000g_high_cov_alignment-16-42/cram_to_bam/converted_bam_only_mapped_reads.bam', '--name', 'high-cov-s256-w256-c1make_examples', '--vm-labels', 'dv-job-name=high-cov-s256-w256-c1make_examples', '--image', 'gcr.io/deepvariant-docker/deepvariant:0.7.2', '--output', 'gs://bucket-name/profiler/high-cov-s256-w256-c1/logs/make_examples/249', '--inputs', 'INPUT_BAI=gs://bucket-name/staging-convert_1000g_high_cov_alignment-16-42/cram_to_bam/converted_bam_only_mapped_reads.bam.bai,INPUT_REF=gs://bucket-name/refs/Homo_sapiens_assembly38.fasta,INPUT_REF_FAI=gs://bucket-name/refs/Homo_sapiens_assembly38.fasta.fai', '--outputs', 'EXAMPLES=gs://bucket-name/profiler/high-cov-s256-w256-c1/examples/0/*', '--machine-type', 'custom-1-4096', '--disk-size', '200', '--command', '\ntimeout=10;\nelapsed=0;\nseq "${SHARD_START_INDEX}" "${SHARD_END_INDEX}" | parallel --halt 2\n "mkdir -p ./input-gcsfused-{} &&\n gcsfuse --implicit-dirs "${GCS_BUCKET}" /input-gcsfused-{}" &&\nseq "${SHARD_START_INDEX}" "${SHARD_END_INDEX}" | parallel --halt 2\n "until mountpoint -q /input-gcsfused-{}; do\n test "${elapsed}" -lt "${timeout}" || fail "Time out waiting for gcsfuse mount points";\n sleep 1;\n elapsed=$((elapsed+1));\n done" &&\nseq "${SHARD_START_INDEX}" "${SHARD_END_INDEX}" | parallel --halt 2\n "/opt/deepvariant/bin/make_examples\n --mode calling\n --examples "${EXAMPLES}"/examples_output.tfrecord@"${SHARDS}".gz\n --reads "/input-gcsfused-{}/${BAM}"\n --ref "${INPUT_REF}"\n --task {}\n " # ENABLE_FUSE\n'] [02/27/2019 20:18:15 ERROR gcp_deepvariant_runner.py] For more information, 
consult the worker log at gs://bucket-name/profiler/high-cov-s256-w256-c1/logs/make_examples/249 [02/27/2019 20:18:15 ERROR gcp_deepvariant_runner.py] For more information, consult the worker log at gs://bucket-name/profiler/high-cov-s256-w256-c1/logs/make_examples/221 [02/27/2019 20:18:15 ERROR gcp_deepvariant_runner.py] For more information, consult the worker log at gs://bucket-name/profiler/high-cov-s256-w256-c1/logs/make_examples/252 ...

None of the mentioned log files exist.

kemp-google commented 5 years ago

As noted in the email thread, the log files themselves won't exist because the operation wasn't accepted by the API. The thing I want to investigate is whether the tool didn't print the 400 response body or there wasn't one. There's a separate issue that should be filed against the DV runner to not emit that log message when the operation isn't started successfully.