gatk-workflows / five-dollar-genome-analysis-pipeline

Workflows used for WGS data processing -- replaced by https://github.com/gatk-workflows/gatk4-genome-processing-pipeline
https://gatk.broadinstitute.org/hc/en-us
BSD 3-Clause "New" or "Revised" License

The flow with example files fails #25

Open mgcam opened 5 years ago

mgcam commented 5 years ago

The verifyBamID used in the Docker container seems to be verifyBamID2, which is known to be much less tolerant of low depth. With the default sample files, the workflow fails at the contamination detection stage.

moschetti commented 5 years ago

Also seeing the workflow failing on CheckContamination.

I get warnings about insufficient markers and then errors during delocalization, although I'm not sure whether the two are related. The following is from the end of CheckContamination.log:

[...]
NOTICE - Process chr22:50745507-50745507...
NOTICE - Process chr22:50774185-50774185...
NOTICE - Number of marker in Reference Matrix:99976
NOTICE - Number of marker shared with input file:331
NOTICE - Mean Depth:1.003021
NOTICE - SD Depth:0.054882
NOTICE - 330 SNP markers remained after sanity check.
WARNING - Insufficient Available markers, check input bam depth distribution in output pileup file after specifying --OutputPileup
2019/08/21 22:33:10 Starting delocalization.
2019/08/21 22:33:11 Delocalizing output /cromwell_root/NA12878_PLUMBING.preBqsr.selfSM -> gs:////work/WholeGenomeGermlineSingleSample/795f578d-3944-406f-b15e-933c4043f65e/call-UnmappedBamToAlignedBam/UnmappedBamToAlignedBam/7197d913-456d-411d-bf05-1164c1549aeb/call-CheckContamination/NA12878_PLUMBING.preBqsr.selfSM
2019/08/21 22:33:14 rm -f $HOME/.config/gcloud/gce && gsutil cp /cromwell_root/NA12878_PLUMBING.preBqsr.selfSM gs:////work/WholeGenomeGermlineSingleSample/795f578d-3944-406f-b15e-933c4043f65e/call-UnmappedBamToAlignedBam/UnmappedBamToAlignedBam/7197d913-456d-411d-bf05-1164c1549aeb/call-CheckContamination/ failed
CommandException: No URLs matched: /cromwell_root/NA12878_PLUMBING.preBqsr.selfSM
2019/08/21 22:33:14 Waiting 5 seconds and retrying
2019/08/21 22:33:20 rm -f $HOME/.config/gcloud/gce && gsutil cp /cromwell_root/NA12878_PLUMBING.preBqsr.selfSM gs:////work/WholeGenomeGermlineSingleSample/795f578d-3944-406f-b15e-933c4043f65e/call-UnmappedBamToAlignedBam/UnmappedBamToAlignedBam/7197d913-456d-411d-bf05-1164c1549aeb/call-CheckContamination/ failed
CommandException: No URLs matched: /cromwell_root/NA12878_PLUMBING.preBqsr.selfSM
2019/08/21 22:33:20 Waiting 5 seconds and retrying
2019/08/21 22:33:26 rm -f $HOME/.config/gcloud/gce && gsutil cp /cromwell_root/NA12878_PLUMBING.preBqsr.selfSM gs:////work/WholeGenomeGermlineSingleSample/795f578d-3944-406f-b15e-933c4043f65e/call-UnmappedBamToAlignedBam/UnmappedBamToAlignedBam/7197d913-456d-411d-bf05-1164c1549aeb/call-CheckContamination/ failed
CommandException: No URLs matched: /cromwell_root/NA12878_PLUMBING.preBqsr.selfSM
2019/08/21 22:33:28 Delocalizing output /cromwell_root/stdout -> gs:////work/WholeGenomeGermlineSingleSample/795f578d-3944-406f-b15e-933c4043f65e/call-UnmappedBamToAlignedBam/UnmappedBamToAlignedBam/7197d913-456d-411d-bf05-1164c1549aeb/call-CheckContamination/stdout
2019/08/21 22:33:32 Delocalizing output /cromwell_root/stderr -> gs:////work/WholeGenomeGermlineSingleSample/795f578d-3944-406f-b15e-933c4043f65e/call-UnmappedBamToAlignedBam/UnmappedBamToAlignedBam/7197d913-456d-411d-bf05-1164c1549aeb/call-CheckContamination/stderr
2019/08/21 22:33:37 Delocalizing output /cromwell_root/rc -> gs:////work/WholeGenomeGermlineSingleSample/795f578d-3944-406f-b15e-933c4043f65e/call-UnmappedBamToAlignedBam/UnmappedBamToAlignedBam/7197d913-456d-411d-bf05-1164c1549aeb/call-CheckContamination/rc

mgcam commented 5 years ago

Yes, this is exactly the error I had. Most likely, verifyBamID exits abnormally without creating the expected output, the *.selfSM file. What you see is an error copying that file back to the bucket.
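For anyone triaging the same failure, here is a rough Python sketch of that diagnosis. It only scans a CheckContamination log for the message strings quoted earlier in this thread. The function name and the returned keys are made up for illustration and are not part of the workflow or of verifyBamID.

```python
import re


def summarize_check_contamination_log(log_text):
    """Hypothetical helper: pull the key facts out of a CheckContamination log.

    Assumes the log contains the message strings quoted in this thread;
    other tool versions may word them differently.
    """
    def first_match(pattern):
        match = re.search(pattern, log_text)
        return match.group(1) if match else None

    shared = first_match(r"Number of marker shared with input file:\s*(\d+)")
    depth = first_match(r"Mean Depth:\s*(\d+(?:\.\d+)?)")

    return {
        # Markers shared between the reference matrix and the input file.
        "markers_shared": int(shared) if shared is not None else None,
        "mean_depth": float(depth) if depth is not None else None,
        # True when the tool reported too few usable markers.
        "insufficient_markers": "Insufficient Available markers" in log_text,
        # Local files gsutil could not find, i.e. outputs that were never written.
        "missing_outputs": sorted(set(re.findall(r"No URLs matched: (\S+)", log_text))),
    }


if __name__ == "__main__":
    sample = (
        "NOTICE - Number of marker shared with input file:331\n"
        "NOTICE - Mean Depth:1.003021\n"
        "WARNING - Insufficient Available markers, check input bam depth distribution\n"
        "CommandException: No URLs matched: /cromwell_root/NA12878_PLUMBING.preBqsr.selfSM\n"
    )
    print(summarize_check_contamination_log(sample))
```

If `insufficient_markers` is true and a `.selfSM` path shows up in `missing_outputs`, the delocalization errors are probably just a consequence of the tool never writing its output, not a separate bucket problem.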

ghost commented 4 years ago

Any updates on this? I get the same error.