bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

Ensemble variant calling and ploidy(?) problem #194

Closed mjafin closed 10 years ago

mjafin commented 10 years ago

Hi again, I'm trying to use the ensemble variant calling approach and am hitting the following problem:

[2013-12-02 18:03] ukapdlnx117: INFO  18:03:35,710 GenotypeConcordance - Eval or Comp Rod at position chr17:46622348 has multiple records. Resolving.
[2013-12-02 18:03] ukapdlnx117: INFO  18:03:42,132 GenotypeConcordance - Eval or Comp Rod at position chr20:29652102 has multiple records. Resolving.
[2013-12-02 18:03] ukapdlnx117: INFO  18:03:49,896 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
[2013-12-02 18:03] ukapdlnx117: INFO  18:03:49,896 HttpMethodDirector - Retrying request
[2013-12-02 18:03] ukapdlnx117: INFO  18:03:49,902 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
[2013-12-02 18:03] ukapdlnx117: INFO  18:03:49,902 HttpMethodDirector - Retrying request
[2013-12-02 18:03] ukapdlnx117: INFO  18:03:49,906 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
[2013-12-02 18:03] ukapdlnx117: INFO  18:03:49,906 HttpMethodDirector - Retrying request
[2013-12-02 18:03] ukapdlnx117: INFO  18:03:49,912 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
[2013-12-02 18:03] ukapdlnx117: INFO  18:03:49,912 HttpMethodDirector - Retrying request
[2013-12-02 18:03] ukapdlnx117: INFO  18:03:49,916 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
[2013-12-02 18:03] ukapdlnx117: INFO  18:03:49,916 HttpMethodDirector - Retrying request
[2013-12-02 18:03] ukapdlnx117: Exception in thread "main" org.broadinstitute.sting.utils.exceptions.UserException: Concordance Metrics is currently only implemented for DIPLOID genotypes, found eval ploidy: 1, comp ploidy: 1
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.gatk.walkers.variantutils.ConcordanceMetrics.update(ConcordanceMetrics.java:130)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.gatk.walkers.variantutils.GenotypeConcordance.reduce(GenotypeConcordance.java:318)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.gatk.walkers.variantutils.GenotypeConcordance.reduce(GenotypeConcordance.java:133)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociReduce.apply(TraverseLociNano.java:291)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociReduce.apply(TraverseLociNano.java:280)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:279)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
[2013-12-02 18:03] ukapdlnx117:         at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.run.broad$run_gatk$fn__1094.invoke(broad.clj:34)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.run.broad$run_gatk.invoke(broad.clj:31)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.evaluate$calc_variant_eval_metrics.doInvoke(evaluate.clj:22)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.RestFn.invoke(RestFn.java:573)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.compare$compare_two_vcf_standard.invoke(compare.clj:161)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.compare$compare_two_vcf.invoke(compare.clj:183)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.compare$variant_comparison_from_config$iter__6966__6970$fn__6971$iter__6990__6994$fn__6995.invoke(compare.clj:253)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.LazySeq.sval(LazySeq.java:42)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.LazySeq.seq(LazySeq.java:60)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.RT.seq(RT.java:484)
[2013-12-02 18:03] ukapdlnx117:         at clojure.core$seq.invoke(core.clj:133)
[2013-12-02 18:03] ukapdlnx117:         at clojure.core.protocols$seq_reduce.invoke(protocols.clj:30)
[2013-12-02 18:03] ukapdlnx117:         at clojure.core.protocols$fn__6026.invoke(protocols.clj:54)
[2013-12-02 18:03] ukapdlnx117:         at clojure.core.protocols$fn__5979$G__5974__5992.invoke(protocols.clj:13)
[2013-12-02 18:03] ukapdlnx117:         at clojure.core$reduce.invoke(core.clj:6177)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.multiple$prep_cmp_name_lookup.doInvoke(multiple.clj:39)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.RestFn.invoke(RestFn.java:410)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.compare$finalize_comparisons.invoke(compare.clj:230)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.compare$variant_comparison_from_config$iter__6966__6970$fn__6971.invoke(compare.clj:254)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.LazySeq.sval(LazySeq.java:42)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.LazySeq.seq(LazySeq.java:60)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.RT.seq(RT.java:484)
[2013-12-02 18:03] ukapdlnx117:         at clojure.core$seq.invoke(core.clj:133)
[2013-12-02 18:03] ukapdlnx117:         at clojure.core$tree_seq$walk__4647$fn__4648.invoke(core.clj:4475)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.LazySeq.sval(LazySeq.java:42)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.LazySeq.seq(LazySeq.java:60)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.LazySeq.more(LazySeq.java:96)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.RT.more(RT.java:607)
[2013-12-02 18:03] ukapdlnx117:         at clojure.core$rest.invoke(core.clj:73)
[2013-12-02 18:03] ukapdlnx117:         at clojure.core$flatten.invoke(core.clj:6478)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.compare$variant_comparison_from_config.invoke(compare.clj:251)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.ensemble$consensus_calls.invoke(ensemble.clj:87)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.ensemble$_main.doInvoke(ensemble.clj:105)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.RestFn.applyTo(RestFn.java:137)
[2013-12-02 18:03] ukapdlnx117:         at clojure.core$apply.invoke(core.clj:617)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.core$_main.doInvoke(core.clj:35)
[2013-12-02 18:03] ukapdlnx117:         at clojure.lang.RestFn.applyTo(RestFn.java:137)
[2013-12-02 18:03] ukapdlnx117:         at bcbio.variation.core.main(Unknown Source)
[2013-12-02 18:04] ukapdlnx117: INFO  18:04:01,705 ProgressMeter -   chrY:13310502        5.53e+04   90.0 s       27.1 m    100.0%        90.0 s     0.0 s
[2013-12-02 18:04] ukapdlnx117: INFO  18:04:31,710 ProgressMeter -   chrY:13310502        5.53e+04  120.0 s       36.2 m    100.0%       120.0 s     0.0 s
[2013-12-02 18:05] ukapdlnx117: INFO  18:05:01,715 ProgressMeter -   chrY:13310502        5.53e+04    2.5 m       45.2 m    100.0%         2.5 m     0.0 s
[2013-12-02 18:05] ukapdlnx117: INFO  18:05:31,720 ProgressMeter -   chrY:13310502        5.53e+04    3.0 m       54.3 m    100.0%         3.0 m     0.0 s
[2013-12-02 18:06] ukapdlnx117: INFO  18:06:01,726 ProgressMeter -   chrY:13310502        5.53e+04    3.5 m       63.3 m    100.0%         3.5 m     0.0 s
[2013-12-02 18:06] ukapdlnx117: INFO  18:06:31,731 ProgressMeter -   chrY:13310502        5.53e+04    4.0 m       72.4 m    100.0%         4.0 m     0.0 s
[2013-12-02 18:07] ukapdlnx117: INFO  18:07:01,736 ProgressMeter -   chrY:13310502        5.53e+04    4.5 m       81.4 m    100.0%         4.5 m     0.0 s
[2013-12-02 18:07] ukapdlnx117: INFO  18:07:31,742 ProgressMeter -   chrY:13310502        5.53e+04    5.0 m       90.4 m    100.0%         5.0 m     0.0 s

The last command I can see in the commands log is:

[2013-12-02 17:49] ukapdlnx117: java -Xms750m -Xmx8g -Djava.io.tmpdir=/gpfs02/ngs/oncology/analysis/bcbio-analysis/work/ensemble/ARH77_pe/tmp -jar /apps/bcbio-nextgen/0.7.5/rhel6-x64/share/java/bcbio_variation/bcbio.variation-0.1.1-standalone.jar variant-ensemble /gpfs02/ngs/oncology/analysis/bcbio-analysis/work/ensemble/ARH77_pe/config/ARH77_pe-ensemble.yaml /ngs/reference_data/genomes/Hsapiens/hg19/seq/hg19.fa /gpfs02/ngs/oncology/analysis/bcbio-analysis/work/ensemble/ARH77_pe/ARH77_pe-ensemble.vcf /gpfs02/ngs/oncology/analysis/bcbio-analysis/work/gatk/1_131128_bcbio-analysis-sort-variants-ploidyfix-combined-effects.vcf /gpfs02/ngs/oncology/analysis/bcbio-analysis/work/freebayes/1_131128_bcbio-analysis-sort-variants-ploidyfix-filter-effects.vcf /gpfs02/ngs/oncology/analysis/bcbio-analysis/work/gatk-haplotype/1_131128_bcbio-analysis-sort-variants-ploidyfix-combined-effects.vcf

Any pointers ?-)

chapmanb commented 10 years ago

Miika; Sorry, this is a known issue that I still need to fix. It's come up since we're finally trying to deal with the sex and mitochondrial chromosomes correctly. I'll try to give this some time over the next couple of days and push a fix. We need to treat NA12878 correctly by not calling on the Y chromosome, since she's female, and also avoid GATK's ConcordanceMetrics walker for the mitochondria. Thanks again for the report.
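
As a rough illustration of the idea (this is not the actual bcbio.variation fix, which lives in Clojure code), the sketch below separates VCF records with a diploid genotype from those with any other ploidy, so that only diploid calls would reach a diploid-only concordance step. The helper names `genotype_ploidy` and `split_by_ploidy` are hypothetical, and the sketch assumes tab-separated VCF text with a `GT` key in the FORMAT column.

```python
def genotype_ploidy(gt):
    """Return the number of alleles in a VCF GT string, e.g. '0/1' -> 2, '1' -> 1."""
    return len(gt.replace("|", "/").split("/"))


def split_by_ploidy(vcf_lines, sample_index=0):
    """Split VCF data lines into (diploid, other) lists based on one sample's GT.

    Hypothetical helper for illustration only. Header lines (starting
    with '#') and blank lines are skipped.
    """
    diploid, other = [], []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        format_keys = fields[8].split(":")                     # FORMAT column
        sample_values = fields[9 + sample_index].split(":")    # chosen sample column
        gt = sample_values[format_keys.index("GT")]
        if genotype_ploidy(gt) == 2:
            diploid.append(line)
        else:
            other.append(line)
    return diploid, other


# Made-up records: one diploid autosomal call and one haploid chrM call.
example = [
    "##fileformat=VCFv4.1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:30",
    "chrM\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t1:500",
]
diploid, other = split_by_ploidy(example)
print(len(diploid), len(other))
```

Records in `other` (haploid calls such as those with ploidy 1 in the error above) would need to be compared by some means other than the diploid-only metrics.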

mjafin commented 10 years ago

Oh, great, thanks Brad! This is actually one of our cell line samples but I did hit the same problem with NA12878 last week.

chapmanb commented 10 years ago

Miika; Thanks again for the report. We pushed a fix to bcbio.variation and the new version handles the non-diploid portions of the genome cleanly. If you pull the latest tools with:

bcbio_nextgen.py upgrade --tools

It should work cleanly now. Thanks again.

caddymob commented 10 years ago

Hey @chapmanb - I still have this issue. I ran bcbio_nextgen.py upgrade --tools on Jan 7, a month after this was closed.

bcbio is using GATK 2.3-9-gdcdccbb (over a year old now) and this version does not support the ploidy options. I saw #117 about adding our own GATK. Is this a better option or is there something else we missed in running the update?

I manually ran the UnifiedGenotyper command generated by bcbio using my own copy of GATK v2.7-4-g6f46d11 (and had to change the option DepthOfCoverage -> Coverage) and it works! It seems that dropping in my own GATK version will cause problems for bcbio-nextgen, since GATK option names are not stable across versions, e.g. 'DepthOfCoverage' in 2.3.9 vs 'Coverage' in v2.7.4.
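
To make the version dependence concrete, here is a minimal sketch of choosing the option name from a GATK version string. It only encodes the two data points reported in this thread (2.3-9 uses 'DepthOfCoverage', 2.7-4 uses 'Coverage'). Treating everything at or below 2.3 and at or above 2.7 the same way is an assumption, and versions in between raise an error because the thread does not say where the rename happened. The function names are hypothetical and are not part of bcbio-nextgen.

```python
import re


def gatk_major_minor(version):
    """Parse '2.3-9-gdcdccbb' or 'v2.7-4-g6f46d11' into (2, 3) or (2, 7)."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("Unrecognized GATK version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def coverage_annotation(version):
    """Pick the coverage option name for a GATK version (hypothetical helper)."""
    major_minor = gatk_major_minor(version)
    if major_minor <= (2, 3):   # assumption: older releases behave like 2.3-9
        return "DepthOfCoverage"
    if major_minor >= (2, 7):   # assumption: newer releases behave like 2.7-4
        return "Coverage"
    # The thread gives no information about releases between 2.3 and 2.7.
    raise ValueError("Option name unknown for GATK %s" % version)


print(coverage_annotation("2.3-9-gdcdccbb"))
print(coverage_annotation("v2.7-4-g6f46d11"))
```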

Any guidance? Thanks as always!

chapmanb commented 10 years ago

Jason; Unfortunately, 2.3.9 is the latest version of GATK we're allowed to automatically distribute because of their new licensing. We do have a way to install the most recent version, although it requires manual intervention:

bcbio_nextgen.py upgrade --tools --toolplus protected

That will give you instructions on downloading it and installing it in the right place. After you set this once for the installation, it will automatically update with new versions when you run --tools in the future. More info on the toolplus options is here:

https://bcbio-nextgen.readthedocs.org/en/latest/contents/installation.html#upgrade

Also, I wouldn't still expect errors with ploidy and bcbio.variation, independent of the version of GATK. What version of bcbio.variation does provenance/programs.txt report? Thanks much for the report.

caddymob commented 10 years ago

Here is the provenance/programs.txt - working on the update to protected now. Thanks, sorry I somehow missed that obvious detail!

~> cat ./provenance/programs.txt
bcbio-nextgen,0.7.5
htseq,0.5.4p5
bamtools,2.3.0
bedtools,v2.18.1
bowtie2,2.1.0
bwa,0.7.5a-r405
cufflinks,/packages/bcbio/0.7.4/bin/cufflinks)
cutadapt,
fastqc,v0.10.1
freebayes,v0.9.10-3-g47a713e-dirty
gemini,0.6.3.1
novosort,V1.02.01
novoalign,V3.02.00
samtools,0.1.19-44428cd
qualimap,v.0.7.1
tophat,v2.0.9
bcbio.variation,0.1.2
gatk,2.3-9-gdcdccbb
mutect,1.1.5
picard,1.96
rnaseqc,_v1.1.7
varscan,v2.3.6
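
For checking versions like these programmatically, a small sketch follows. It assumes the `name,version` layout shown in the listing above, one program per line. The function name `parse_programs` is hypothetical, and the sample text is a shortened copy of that listing.

```python
def parse_programs(text):
    """Parse 'name,version' lines (as in provenance/programs.txt) into a dict.

    Lines without a comma are ignored. An empty version (e.g. 'cutadapt,')
    is kept as an empty string.
    """
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "," not in line:
            continue
        name, version = line.split(",", 1)
        versions[name] = version
    return versions


# Shortened copy of the listing above.
sample = """bcbio-nextgen,0.7.5
cutadapt,
bcbio.variation,0.1.2
gatk,2.3-9-gdcdccbb
"""
versions = parse_programs(sample)
print(versions["bcbio.variation"], versions["gatk"])
```

To run it against a real file, read `provenance/programs.txt` from the run directory and pass its contents to `parse_programs`.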

jpeden1 commented 10 years ago

The update is done. I reran the whole-genome example you give in your docs, and it fails. When I first installed bcbio a couple of months ago I got both of the examples to run and work. I have been trying to get either one of them to work again for some time now. Are these examples known to no longer work? Do you have a small example that does/should work?

Here is a bit of the output from the whole-genome example:

[2014-01-13 18:22] INFO 18:22:54,049 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files [2014-01-13 18:22] INFO 18:22:54,088 GenomeAnalysisEngine - Done preparing for traversal [2014-01-13 18:22] INFO 18:22:54,088 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] [2014-01-13 18:22] INFO 18:22:54,088 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining [2014-01-13 18:22] WARN 18:22:55,031 RestStorageService - Error Response: PUT '/qnMDKyapLYOpT6oKYMo7IGWnRHArojP2.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 399, Content-MD5: R4PceRmQJtFfSxKKPnBhxA==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: 4783dc79199026d15f4b128a3e7061c4, Date: Tue, 14 Jan 2014 01:22:54 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:e4qQUYjIOn2W8UkkjXjJGUAljqo=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: B4FE7D0BFD472CAD, x-amz-id-2: xQ4lNR7nuK9fst5FdEOgp+M5+rU7YR9sLo7dV7QC4nWJQ2XuuGIq7bDIuiE2KwL9, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 18:32:54 GMT, Connection: close, Server: AmazonS3] [2014-01-13 18:22] WARN 18:22:55,206 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24600 seconds. Retrying connection. 
[2014-01-13 18:22] INFO 18:22:55,620 GATKRunReport - Uploaded run statistics report to AWS S3 [2014-01-13 18:22] INFO 18:22:55,647 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:22] INFO 18:22:55,649 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15 [2014-01-13 18:22] INFO 18:22:55,649 HelpFormatter - Copyright (c) 2010 The Broad Institute [2014-01-13 18:22] INFO 18:22:55,649 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk [2014-01-13 18:22] INFO 18:22:55,654 HelpFormatter - Program Args: -T RealignerTargetCreator -I /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/1_130327_iplat-sort-GL000220.1_0_161802-prep-prealign.bam -R /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/seq/GRCh37.fa -o /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/tx/tmpR7hUuc/1_130327_iplat-sort-GL000220.1_0_161802-prep-prealign-realign.intervals -l INFO -L GL000220.1:1-161802 --interval_set_rule INTERSECTION --known /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/variation/dbsnp_137.vcf -U LENIENT_VCF_PROCESSING --read_filter BadCigar --read_filter NotPrimaryAlignment [2014-01-13 18:22] INFO 18:22:55,655 HelpFormatter - Date/Time: 2014/01/13 18:22:55 [2014-01-13 18:22] INFO 18:22:55,655 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:22] INFO 18:22:55,656 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:22] Index BAM file: 1_130327_iplat-sort-GL000219.1_0_179198-prep.bam [2014-01-13 18:22] INFO 18:22:55,673 ArgumentTypeDescriptor - Dynamically determined type of /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/variation/dbsnp_137.vcf to be VCF [2014-01-13 18:22] INFO 18:22:55,776 GenomeAnalysisEngine - Strictness is SILENT [2014-01-13 18:22] INFO 18:22:55,989 GenomeAnalysisEngine - 
Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000 [2014-01-13 18:22] INFO 18:22:56,007 SAMDataSource$SAMReaders - Initializing SAMRecords in serial [2014-01-13 18:22] INFO 18:22:56,046 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.04 [2014-01-13 18:22] INFO 18:22:56,104 RMDTrackBuilder - Loading Tribble index from disk for file /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/variation/dbsnp_137.vcf [2014-01-13 18:22] INFO 18:22:56,375 IntervalUtils - Processing 161802 bp from intervals [2014-01-13 18:22] INFO 18:22:56,460 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files [2014-01-13 18:22] INFO 18:22:56,517 GenomeAnalysisEngine - Done preparing for traversal [2014-01-13 18:22] INFO 18:22:56,518 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] [2014-01-13 18:22] INFO 18:22:56,519 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining [2014-01-13 18:22] INFO 18:22:56,541 ProgressMeter - done 4.96e+04 9.0 s 3.3 m 99.6% 9.0 s 0.0 s [2014-01-13 18:22] INFO 18:22:56,541 ProgressMeter - Total runtime 9.85 secs, 0.16 min, 0.00 hours [2014-01-13 18:22] INFO 18:22:56,593 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:22] INFO 18:22:56,596 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15 [2014-01-13 18:22] INFO 18:22:56,597 HelpFormatter - Copyright (c) 2010 The Broad Institute [2014-01-13 18:22] INFO 18:22:56,598 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk [2014-01-13 18:22] INFO 18:22:56,602 HelpFormatter - Program Args: -T RealignerTargetCreator -I /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000192.1/1_130327_iplat-sort-GL000192.1_0_547496-prep-prealign.bam -R /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/seq/GRCh37.fa -o 
/scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000192.1/tx/tmp4sEqOW/1_130327_iplat-sort-GL000192.1_0_547496-prep-prealign-realign.intervals -l INFO -L GL000192.1:1-547496 --interval_set_rule INTERSECTION --known /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/variation/dbsnp_137.vcf -U LENIENT_VCF_PROCESSING --read_filter BadCigar --read_filter NotPrimaryAlignment [2014-01-13 18:22] INFO 18:22:56,603 HelpFormatter - Date/Time: 2014/01/13 18:22:56 [2014-01-13 18:22] INFO 18:22:56,604 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:22] INFO 18:22:56,604 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:22] INFO 18:22:56,625 ArgumentTypeDescriptor - Dynamically determined type of /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/variation/dbsnp_137.vcf to be VCF [2014-01-13 18:22] INFO 18:22:56,641 MicroScheduler - 0 reads were filtered out during the traversal out of approximately 49593 total reads (0.00%) [2014-01-13 18:22] INFO 18:22:56,642 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter [2014-01-13 18:22] INFO 18:22:56,642 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter [2014-01-13 18:22] INFO 18:22:56,643 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter [2014-01-13 18:22] INFO 18:22:56,800 GenomeAnalysisEngine - Strictness is SILENT [2014-01-13 18:22] INFO 18:22:56,933 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000 [2014-01-13 18:22] INFO 18:22:56,941 SAMDataSource$SAMReaders - Initializing SAMRecords in serial [2014-01-13 18:22] INFO 18:22:57,000 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.06 [2014-01-13 18:22] INFO 18:22:57,054 RMDTrackBuilder - Loading Tribble index from disk for file /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/variation/dbsnp_137.vcf [2014-01-13 18:22] INFO 18:22:57,291 
IntervalUtils - Processing 547496 bp from intervals [2014-01-13 18:22] INFO 18:22:57,394 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files [2014-01-13 18:22] INFO 18:22:57,452 GenomeAnalysisEngine - Done preparing for traversal [2014-01-13 18:22] INFO 18:22:57,460 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] [2014-01-13 18:22] INFO 18:22:57,461 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining [2014-01-13 18:22] WARN 18:22:57,641 RestStorageService - Error Response: PUT '/JO296pbVNeaqzWqWzHGVWqkWmSSiXVxZ.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 400, Content-MD5: 6p0KOLGkck/Wxj4CTo+Vcw==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: ea9d0a38b1a4724fd6c63e024e8f9573, Date: Tue, 14 Jan 2014 01:22:56 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:btIGyc3iBiNPyiIgz4xFUkWiLUU=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: DFB1B1AF406FC851, x-amz-id-2: OCU15huPaJ7Ftl9w17p/ocr0XH8oqd6oNNnL2eLqgzSvvNbh1Ppz+hpBHWxoF2T4, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 18:32:57 GMT, Connection: close, Server: AmazonS3] [2014-01-13 18:22] WARN 18:22:57,819 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24599 seconds. Retrying connection. 
[2014-01-13 18:22] INFO 18:22:58,228 GATKRunReport - Uploaded run statistics report to AWS S3 [2014-01-13 18:22] Index BAM file: 1_130327_iplat-sort-GL000212.1_0_186858-prep.bam [2014-01-13 18:22] INFO 18:22:59,136 ProgressMeter - done 6.69e+04 7.0 s 114.0 s 99.6% 7.0 s 0.0 s [2014-01-13 18:22] INFO 18:22:59,137 ProgressMeter - Total runtime 7.63 secs, 0.13 min, 0.00 hours [2014-01-13 18:22] INFO 18:22:59,220 MicroScheduler - 0 reads were filtered out during the traversal out of approximately 66882 total reads (0.00%) [2014-01-13 18:22] INFO 18:22:59,221 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter [2014-01-13 18:22] INFO 18:22:59,222 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter [2014-01-13 18:22] INFO 18:22:59,223 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter [2014-01-13 18:23] WARN 18:23:00,237 RestStorageService - Error Response: PUT '/VyvtTjJQSjedoknoYJfzwhMiK9xpqyU6.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 398, Content-MD5: Hgxj6rj86LQDRQsKhp4vTg==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: 1e0c63eab8fce8b403450b0a869e2f4e, Date: Tue, 14 Jan 2014 01:22:59 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:BqI1PQfIcTvhWe3WBrrO8+jlrUY=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: 0E65101EA6FFEE0A, x-amz-id-2: 74AjxHrSfgUelRzyXwkmFUxYr8znc7/kHhK6xxdPxe0Viy2GDBVFaNkFw4MLRC74, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 18:33:00 GMT, Connection: close, Server: AmazonS3] [2014-01-13 18:23] WARN 18:23:00,405 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24600 seconds. Retrying connection. 
[2014-01-13 18:23] INFO 18:23:00,773 GATKRunReport - Uploaded run statistics report to AWS S3 [2014-01-13 18:23] Index BAM file: 1_130327_iplat-sort-GL000193.1_0_189789-prep.bam [2014-01-13 18:23] INFO 18:23:00,907 ProgressMeter - done 5.47e+05 3.0 s 6.0 s 100.0% 3.0 s 0.0 s [2014-01-13 18:23] INFO 18:23:00,908 ProgressMeter - Total runtime 3.45 secs, 0.06 min, 0.00 hours [2014-01-13 18:23] INFO 18:23:00,989 MicroScheduler - 29855 reads were filtered out during the traversal out of approximately 73847 total reads (40.43%) [2014-01-13 18:23] INFO 18:23:00,990 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter [2014-01-13 18:23] INFO 18:23:00,990 MicroScheduler - -> 156 reads (0.21% of total) failing BadMateFilter [2014-01-13 18:23] INFO 18:23:00,990 MicroScheduler - -> 101 reads (0.14% of total) failing DuplicateReadFilter [2014-01-13 18:23] INFO 18:23:00,990 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter [2014-01-13 18:23] INFO 18:23:00,991 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter [2014-01-13 18:23] INFO 18:23:00,991 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter [2014-01-13 18:23] INFO 18:23:00,991 MicroScheduler - -> 29598 reads (40.08% of total) failing MappingQualityZeroFilter [2014-01-13 18:23] INFO 18:23:00,991 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter [2014-01-13 18:23] INFO 18:23:00,991 MicroScheduler - -> 0 reads (0.00% of total) failing Platform454Filter [2014-01-13 18:23] INFO 18:23:00,992 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter [2014-01-13 18:23] INFO 18:23:01,977 ProgressMeter - done 2.11e+05 7.0 s 37.0 s 100.0% 7.0 s 0.0 s [2014-01-13 18:23] INFO 18:23:01,978 ProgressMeter - Total runtime 7.89 secs, 0.13 min, 0.00 hours [2014-01-13 18:23] INFO 18:23:01,979 MicroScheduler - 115673 reads were filtered out during the traversal out of approximately 354885 total reads 
(32.59%) [2014-01-13 18:23] INFO 18:23:01,980 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter [2014-01-13 18:23] INFO 18:23:01,980 MicroScheduler - -> 19444 reads (5.48% of total) failing BadMateFilter [2014-01-13 18:23] INFO 18:23:01,980 MicroScheduler - -> 7889 reads (2.22% of total) failing DuplicateReadFilter [2014-01-13 18:23] INFO 18:23:01,980 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter [2014-01-13 18:23] INFO 18:23:01,980 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter [2014-01-13 18:23] INFO 18:23:01,981 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter [2014-01-13 18:23] INFO 18:23:01,981 MicroScheduler - -> 88340 reads (24.89% of total) failing MappingQualityZeroFilter [2014-01-13 18:23] INFO 18:23:01,981 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter [2014-01-13 18:23] INFO 18:23:01,981 MicroScheduler - -> 0 reads (0.00% of total) failing Platform454Filter [2014-01-13 18:23] INFO 18:23:01,981 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter [2014-01-13 18:23] WARN 18:23:02,394 RestStorageService - Error Response: PUT '/39e5dlGH383rPSqNlAD7zdNiUxfh4DCG.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 404, Content-MD5: xn3WNA4O48DwNNT3glpMMg==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: c67dd6340e0ee3c0f034d4f7825a4c32, Date: Tue, 14 Jan 2014 01:23:01 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:wdBGfvBqCxgjWgTZWM3lV/wsg2c=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: EDB69C1D8A302F77, x-amz-id-2: EPmb5vWKPtYJrMOc/ZYBQ2vR9pyeei8dwzZ69lMWuE1fDJzTmaknukKNiUOKHr9+, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 18:33:02 GMT, Connection: close, 
Server: AmazonS3] [2014-01-13 18:23] WARN 18:23:02,565 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24600 seconds. Retrying connection. [2014-01-13 18:23] INFO 18:23:02,963 GATKRunReport - Uploaded run statistics report to AWS S3 [2014-01-13 18:23] GATK: realign ('GL000192.1', 0, 547496) : NA12878 [2014-01-13 18:23] WARN 18:23:03,186 RestStorageService - Error Response: PUT '/uhNsZFShbauEWxsabtzBIozF0Sfgmaka.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 404, Content-MD5: byagQs2l3C5SsBqJLf+hMQ==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: 6f26a042cda5dc2e52b01a892dffa131, Date: Tue, 14 Jan 2014 01:23:02 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:jNQIV1XWKUIPZroMG6KSDThcPY4=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: 3A243EB2E975B4ED, x-amz-id-2: FGaKRKymqKEcrRv+MXnmdCfxsvNnRmZiWUzO7kCRqgXLEr0PGKbhi5IbAfbgy27s, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 18:33:02 GMT, Connection: close, Server: AmazonS3] [2014-01-13 18:23] WARN 18:23:03,358 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24600 seconds. Retrying connection. 
[2014-01-13 18:23] INFO 18:23:03,663 GATKRunReport - Uploaded run statistics report to AWS S3 [2014-01-13 18:23] GATK: realign ('GL000225.1', 0, 211173) : NA12878 [2014-01-13 18:23] INFO 18:23:05,825 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:23] INFO 18:23:05,828 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15 [2014-01-13 18:23] INFO 18:23:05,828 HelpFormatter - Copyright (c) 2010 The Broad Institute [2014-01-13 18:23] INFO 18:23:05,828 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk [2014-01-13 18:23] INFO 18:23:05,832 HelpFormatter - Program Args: -T IndelRealigner -I /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000192.1/1_130327_iplat-sort-GL000192.1_0_547496-prep-prealign.bam -R /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/seq/GRCh37.fa -targetIntervals /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000192.1/1_130327_iplat-sort-GL000192.1_0_547496-prep-prealign-realign.intervals -L GL000192.1:1-547496 -U LENIENT_VCF_PROCESSING --read_filter BadCigar --read_filter NotPrimaryAlignment -o /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000192.1/tx/tmp8RB6ko/1_130327_iplat-sort-GL000192.1_0_547496-prep.bam [2014-01-13 18:23] INFO 18:23:05,832 HelpFormatter - Date/Time: 2014/01/13 18:23:05 [2014-01-13 18:23] INFO 18:23:05,832 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:23] INFO 18:23:05,832 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:23] INFO 18:23:05,935 GenomeAnalysisEngine - Strictness is SILENT [2014-01-13 18:23] INFO 18:23:06,019 GenomeAnalysisEngine - Downsampling Settings: No downsampling [2014-01-13 18:23] INFO 18:23:06,027 SAMDataSource$SAMReaders - Initializing SAMRecords in serial [2014-01-13 18:23] INFO 18:23:06,046 
SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02 [2014-01-13 18:23] INFO 18:23:06,060 IntervalUtils - Processing 547496 bp from intervals [2014-01-13 18:23] INFO 18:23:06,125 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files [2014-01-13 18:23] INFO 18:23:06,151 GenomeAnalysisEngine - Done preparing for traversal [2014-01-13 18:23] INFO 18:23:06,152 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] [2014-01-13 18:23] INFO 18:23:06,154 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining [2014-01-13 18:23] INFO 18:23:06,187 ReadShardBalancer$1 - Loading BAM index data [2014-01-13 18:23] INFO 18:23:06,197 ReadShardBalancer$1 - Done loading BAM index data [2014-01-13 18:23] INFO 18:23:06,596 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:23] INFO 18:23:06,599 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15 [2014-01-13 18:23] INFO 18:23:06,599 HelpFormatter - Copyright (c) 2010 The Broad Institute [2014-01-13 18:23] INFO 18:23:06,600 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk [2014-01-13 18:23] INFO 18:23:06,605 HelpFormatter - Program Args: -T IndelRealigner -I /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000225.1/1_130327_iplat-sort-GL000225.1_0_211173-prep-prealign.bam -R /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/seq/GRCh37.fa -targetIntervals /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000225.1/1_130327_iplat-sort-GL000225.1_0_211173-prep-prealign-realign.intervals -L GL000225.1:1-211173 -U LENIENT_VCF_PROCESSING --read_filter BadCigar --read_filter NotPrimaryAlignment -o /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000225.1/tx/tmpEZHVHz/1_130327_iplat-sort-GL000225.1_0_211173-prep.bam [2014-01-13 18:23] INFO 18:23:06,605 HelpFormatter - Date/Time: 2014/01/13 18:23:06 
[2014-01-13 18:23] INFO 18:23:06,606 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:23] INFO 18:23:06,606 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:23] INFO 18:23:06,706 GenomeAnalysisEngine - Strictness is SILENT [2014-01-13 18:23] INFO 18:23:06,825 GenomeAnalysisEngine - Downsampling Settings: No downsampling [2014-01-13 18:23] INFO 18:23:06,834 SAMDataSource$SAMReaders - Initializing SAMRecords in serial [2014-01-13 18:23] INFO 18:23:06,859 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02 [2014-01-13 18:23] INFO 18:23:06,873 IntervalUtils - Processing 211173 bp from intervals [2014-01-13 18:23] INFO 18:23:06,946 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files [2014-01-13 18:23] INFO 18:23:06,967 GenomeAnalysisEngine - Done preparing for traversal [2014-01-13 18:23] INFO 18:23:06,968 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] [2014-01-13 18:23] INFO 18:23:06,970 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining [2014-01-13 18:23] WARN 18:23:07,977 RestStorageService - Error Response: PUT '/VztvFKMOcNNmlbK5BLfUuPvNaZZnCiuu.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 902, Content-MD5: FxAtWdbAxkTYXd/beAHHBw==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: 17102d59d6c0c644d85ddfdb7801c707, Date: Tue, 14 Jan 2014 01:23:07 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:OZzpLb8qAzkGybbu0mVIvPSZ6HA=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: 53F979583BA77CE3, x-amz-id-2: 5Xq6gvHxtE8LGA3hS7jIGNpGlDbiIRmuuKIQMr1Ufd1JEmsIferNbU2iaykVwqDI, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 
18:33:08 GMT, Connection: close, Server: AmazonS3] [2014-01-13 18:23] WARN 18:23:08,151 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24600 seconds. Retrying connection. [2014-01-13 18:23] INFO 18:23:08,605 GATKRunReport - Uploaded run statistics report to AWS S3 [2014-01-13 18:23] ##### ERROR ------------------------------------------------------------------------------------------ [2014-01-13 18:23] ##### ERROR A USER ERROR has occurred (version 2.8-1-g932cd3a): [2014-01-13 18:23] ##### ERROR [2014-01-13 18:23] ##### ERROR This means that one or more arguments or inputs in your command are incorrect. [2014-01-13 18:23] ##### ERROR The error message below tells you what is the problem. [2014-01-13 18:23] ##### ERROR [2014-01-13 18:23] ##### ERROR If the problem is an invalid argument, please check the online documentation guide [2014-01-13 18:23] ##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool. [2014-01-13 18:23] ##### ERROR [2014-01-13 18:23] ##### ERROR Visit our website and forum for extensive documentation and answers to [2014-01-13 18:23] ##### ERROR commonly asked questions http://www.broadinstitute.org/gatk [2014-01-13 18:23] ##### ERROR [2014-01-13 18:23] ##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself. [2014-01-13 18:23] ##### ERROR [2014-01-13 18:23] ##### ERROR MESSAGE: Couldn't read file /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000225.1/1_130327_iplat-sort-GL000225.1_0_211173-prep-prealign-realign.intervals because The interval file does not exist. 
[2014-01-13 18:23] ##### ERROR ------------------------------------------------------------------------------------------ [2014-01-13 18:23] Uncaught exception occurred Traceback (most recent call last): File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 22, in run _do_run(cmd, checks) File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 113, in _do_run raise subprocess.CalledProcessError(exitcode, error_msg) CalledProcessError: Command 'set -o pipefail; java -Xms750m -Xmx2500m -Djava.io.tmpdir=/scratch/jpeden/test_whole_genome/work/tmp/tmpRFN8_r -jar /packages/bcbio/0.7.4/share/java/gatk/GenomeAnalysisTK.jar -T IndelRealigner -I /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000225.1/1_130327_iplat-sort-GL000225.1_0_211173-prep-prealign.bam -R /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/seq/GRCh37.fa -targetIntervals /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000225.1/1_130327_iplat-sort-GL000225.1_0_211173-prep-prealign-realign.intervals -L GL000225.1:1-211173 -U LENIENT_VCF_PROCESSING --read_filter BadCigar --read_filter NotPrimaryAlignment -o /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000225.1/tx/tmpEZHVHz/1_130327_iplat-sort-GL000225.1_0_211173-prep.bam INFO 18:23:06,596 HelpFormatter - -------------------------------------------------------------------------------- INFO 18:23:06,599 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15 INFO 18:23:06,599 HelpFormatter - Copyright (c) 2010 The Broad Institute INFO 18:23:06,600 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk INFO 18:23:06,605 HelpFormatter - Program Args: -T IndelRealigner -I /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000225.1/1_130327_iplat-sort-GL000225.1_0_211173-prep-prealign.bam -R /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/seq/GRCh37.fa -targetIntervals 
/scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000225.1/1_130327_iplat-sort-GL000225.1_0_211173-prep-prealign-realign.intervals -L GL000225.1:1-211173 -U LENIENT_VCF_PROCESSING --read_filter BadCigar --read_filter NotPrimaryAlignment -o /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000225.1/tx/tmpEZHVHz/1_130327_iplat-sort-GL000225.1_0_211173-prep.bam INFO 18:23:06,605 HelpFormatter - Date/Time: 2014/01/13 18:23:06 INFO 18:23:06,606 HelpFormatter - -------------------------------------------------------------------------------- INFO 18:23:06,606 HelpFormatter - -------------------------------------------------------------------------------- INFO 18:23:06,706 GenomeAnalysisEngine - Strictness is SILENT INFO 18:23:06,825 GenomeAnalysisEngine - Downsampling Settings: No downsampling INFO 18:23:06,834 SAMDataSource$SAMReaders - Initializing SAMRecords in serial INFO 18:23:06,859 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02 INFO 18:23:06,873 IntervalUtils - Processing 211173 bp from intervals INFO 18:23:06,946 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files INFO 18:23:06,967 GenomeAnalysisEngine - Done preparing for traversal INFO 18:23:06,968 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] INFO 18:23:06,970 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining WARN 18:23:07,977 RestStorageService - Error Response: PUT '/VztvFKMOcNNmlbK5BLfUuPvNaZZnCiuu.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 902, Content-MD5: FxAtWdbAxkTYXd/beAHHBw==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: 17102d59d6c0c644d85ddfdb7801c707, Date: Tue, 14 Jan 2014 01:23:07 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:OZzpLb8qAzkGybbu0mVIvPSZ6HA=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], 
Response Headers: [x-amz-request-id: 53F979583BA77CE3, x-amz-id-2: 5Xq6gvHxtE8LGA3hS7jIGNpGlDbiIRmuuKIQMr1Ufd1JEmsIferNbU2iaykVwqDI, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 18:33:08 GMT, Connection: close, Server: AmazonS3] WARN 18:23:08,151 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24600 seconds. Retrying connection. INFO 18:23:08,605 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 2.8-1-g932cd3a):
ERROR
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR
ERROR MESSAGE: Couldn't read file /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000225.1/1_130327_iplat-sort-GL000225.1_0_211173-prep-prealign-realign.intervals because The interval file does not exist.
ERROR ------------------------------------------------------------------------------------------

' returned non-zero exit status 1
An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (8, 0))

[2014-01-13 18:23] INFO 18:23:09,513 ProgressMeter - done 7.34e+04 3.0 s 45.0 s 100.0% 3.0 s 0.0 s [2014-01-13 18:23] INFO 18:23:09,514 ProgressMeter - Total runtime 3.36 secs, 0.06 min, 0.00 hours [2014-01-13 18:23] INFO 18:23:09,592 MicroScheduler - 0 reads were filtered out during the traversal out of approximately 73404 total reads (0.00%) [2014-01-13 18:23] INFO 18:23:09,592 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter [2014-01-13 18:23] INFO 18:23:09,592 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter [2014-01-13 18:23] INFO 18:23:09,592 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter [2014-01-13 18:23] WARN 18:23:10,459 RestStorageService - Error Response: PUT '/icGGHQKXaAH9sfJUyWMQXQt7lyZQqt8Z.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 398, Content-MD5: 3gvv+SghZfv6vBiNnKNvnw==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: de0beff9282165fbfabc188d9ca36f9f, Date: Tue, 14 Jan 2014 01:23:09 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:YyW3u8iuFdH2Oq2UdCKluW2yw2Q=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: 1E2F78FC289F0549, x-amz-id-2: VKu3oeiWUjpqp/X8JacScPVwFlw5csMg62N9+cHrxV5ZFWMSvM5Zh9YhQLCLSA2d, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 18:33:10 GMT, Connection: close, Server: AmazonS3] [2014-01-13 18:23] WARN 18:23:10,627 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24600 seconds. Retrying connection. 
[2014-01-13 18:23] INFO 18:23:10,935 GATKRunReport - Uploaded run statistics report to AWS S3 [2014-01-13 18:23] Index BAM file: 1_130327_iplat-sort-GL000192.1_0_547496-prep.bam [2014-01-13 18:23] INFO 18:23:11,451 ProgressMeter - done 1.62e+05 14.0 s 92.0 s 100.0% 14.0 s 0.0 s [2014-01-13 18:23] INFO 18:23:11,451 ProgressMeter - Total runtime 14.93 secs, 0.25 min, 0.00 hours [2014-01-13 18:23] INFO 18:23:11,451 MicroScheduler - 480174 reads were filtered out during the traversal out of approximately 1399604 total reads (34.31%) [2014-01-13 18:23] INFO 18:23:11,452 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter [2014-01-13 18:23] INFO 18:23:11,452 MicroScheduler - -> 7077 reads (0.51% of total) failing BadMateFilter [2014-01-13 18:23] INFO 18:23:11,452 MicroScheduler - -> 41031 reads (2.93% of total) failing DuplicateReadFilter [2014-01-13 18:23] INFO 18:23:11,452 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter [2014-01-13 18:23] INFO 18:23:11,453 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter [2014-01-13 18:23] INFO 18:23:11,453 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter [2014-01-13 18:23] INFO 18:23:11,453 MicroScheduler - -> 432066 reads (30.87% of total) failing MappingQualityZeroFilter [2014-01-13 18:23] INFO 18:23:11,453 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter [2014-01-13 18:23] INFO 18:23:11,453 MicroScheduler - -> 0 reads (0.00% of total) failing Platform454Filter [2014-01-13 18:23] INFO 18:23:11,454 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter [2014-01-13 18:23] WARN 18:23:12,647 RestStorageService - Error Response: PUT '/SsgCs1i6iCRIbU9j1G7wWRozKUOIdFIm.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 402, Content-MD5: HfYOfaQDmTt2IVFdXxoCrA==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: 
1df60e7da403993b7621515d5f1a02ac, Date: Tue, 14 Jan 2014 01:23:12 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:O0XRGoVZJFldDskdKjXWqAe/HAo=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: 0158C3881E1585A5, x-amz-id-2: KeTw5AIVxlk/vSl2fxlUDWDr0BVQbGY3iIW/BRGdWHlPFpCPkvHTrDbHFsw9gZ7D, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 18:33:12 GMT, Connection: close, Server: AmazonS3] [2014-01-13 18:23] INFO 18:23:12,797 ProgressMeter - GL000199.1:34709 0.00e+00 60.0 s 99.3 w 20.4% 4.9 m 3.9 m [2014-01-13 18:23] WARN 18:23:12,820 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24599 seconds. Retrying connection. [2014-01-13 18:23] INFO 18:23:13,298 GATKRunReport - Uploaded run statistics report to AWS S3 [2014-01-13 18:23] GATK: realign ('GL000220.1', 0, 161802) : NA12878 [2014-01-13 18:23] INFO 18:23:16,074 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:23] INFO 18:23:16,076 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15 [2014-01-13 18:23] INFO 18:23:16,076 HelpFormatter - Copyright (c) 2010 The Broad Institute [2014-01-13 18:23] INFO 18:23:16,076 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk [2014-01-13 18:23] INFO 18:23:16,080 HelpFormatter - Program Args: -T IndelRealigner -I /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/1_130327_iplat-sort-GL000220.1_0_161802-prep-prealign.bam -R /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/seq/GRCh37.fa -targetIntervals /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/1_130327_iplat-sort-GL000220.1_0_161802-prep-prealign-realign.intervals -L GL000220.1:1-161802 
-U LENIENT_VCF_PROCESSING --read_filter BadCigar --read_filter NotPrimaryAlignment -o /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/tx/tmpaPXXEH/1_130327_iplat-sort-GL000220.1_0_161802-prep.bam [2014-01-13 18:23] INFO 18:23:16,081 HelpFormatter - Date/Time: 2014/01/13 18:23:16 [2014-01-13 18:23] INFO 18:23:16,081 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:23] INFO 18:23:16,081 HelpFormatter - -------------------------------------------------------------------------------- [2014-01-13 18:23] INFO 18:23:16,174 GenomeAnalysisEngine - Strictness is SILENT [2014-01-13 18:23] INFO 18:23:16,243 GenomeAnalysisEngine - Downsampling Settings: No downsampling [2014-01-13 18:23] INFO 18:23:16,251 SAMDataSource$SAMReaders - Initializing SAMRecords in serial [2014-01-13 18:23] INFO 18:23:16,271 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02 [2014-01-13 18:23] INFO 18:23:16,284 IntervalUtils - Processing 161802 bp from intervals [2014-01-13 18:23] INFO 18:23:16,344 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files [2014-01-13 18:23] INFO 18:23:16,365 GenomeAnalysisEngine - Done preparing for traversal [2014-01-13 18:23] INFO 18:23:16,365 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] [2014-01-13 18:23] INFO 18:23:16,366 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining [2014-01-13 18:23] WARN 18:23:17,237 RestStorageService - Error Response: PUT '/RXXlgQ8NSXRyJZzH0dxwjE35AEjoMJHf.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 902, Content-MD5: AmfJaZ8nicUX5P4bViGM7Q==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: 0267c9699f2789c517e4fe1b56218ced, Date: Tue, 14 Jan 2014 01:23:16 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:Ka4w8QGVkRPQYeL0smdCJT3i+UY=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 
1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: B3DBD1C15ACEB0E8, x-amz-id-2: MvMV1tuq8sb4jArQZ9aRYM4ZeND7Q1d6ACs2n1H7hZte6IylNtos6TTRtXSk94Ny, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 18:33:17 GMT, Connection: close, Server: AmazonS3] [2014-01-13 18:23] WARN 18:23:17,408 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24600 seconds. Retrying connection. [2014-01-13 18:23] INFO 18:23:17,898 GATKRunReport - Uploaded run statistics report to AWS S3 [2014-01-13 18:23] ##### ERROR ------------------------------------------------------------------------------------------ [2014-01-13 18:23] ##### ERROR A USER ERROR has occurred (version 2.8-1-g932cd3a): [2014-01-13 18:23] ##### ERROR [2014-01-13 18:23] ##### ERROR This means that one or more arguments or inputs in your command are incorrect. [2014-01-13 18:23] ##### ERROR The error message below tells you what is the problem. [2014-01-13 18:23] ##### ERROR [2014-01-13 18:23] ##### ERROR If the problem is an invalid argument, please check the online documentation guide [2014-01-13 18:23] ##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool. [2014-01-13 18:23] ##### ERROR [2014-01-13 18:23] ##### ERROR Visit our website and forum for extensive documentation and answers to [2014-01-13 18:23] ##### ERROR commonly asked questions http://www.broadinstitute.org/gatk [2014-01-13 18:23] ##### ERROR [2014-01-13 18:23] ##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself. 
[2014-01-13 18:23] ##### ERROR [2014-01-13 18:23] ##### ERROR MESSAGE: Couldn't read file /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/1_130327_iplat-sort-GL000220.1_0_161802-prep-prealign-realign.intervals because The interval file does not exist. [2014-01-13 18:23] ##### ERROR ------------------------------------------------------------------------------------------ [2014-01-13 18:23] Uncaught exception occurred Traceback (most recent call last): File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 22, in run _do_run(cmd, checks) File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 113, in _do_run raise subprocess.CalledProcessError(exitcode, error_msg) CalledProcessError: Command 'set -o pipefail; java -Xms750m -Xmx2500m -Djava.io.tmpdir=/scratch/jpeden/test_whole_genome/work/tmp/tmpHLRKUO -jar /packages/bcbio/0.7.4/share/java/gatk/GenomeAnalysisTK.jar -T IndelRealigner -I /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/1_130327_iplat-sort-GL000220.1_0_161802-prep-prealign.bam -R /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/seq/GRCh37.fa -targetIntervals /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/1_130327_iplat-sort-GL000220.1_0_161802-prep-prealign-realign.intervals -L GL000220.1:1-161802 -U LENIENT_VCF_PROCESSING --read_filter BadCigar --read_filter NotPrimaryAlignment -o /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/tx/tmpaPXXEH/1_130327_iplat-sort-GL000220.1_0_161802-prep.bam INFO 18:23:16,074 HelpFormatter - -------------------------------------------------------------------------------- INFO 18:23:16,076 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15 INFO 18:23:16,076 HelpFormatter - Copyright (c) 2010 The Broad Institute INFO 18:23:16,076 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk INFO 18:23:16,080 
HelpFormatter - Program Args: -T IndelRealigner -I /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/1_130327_iplat-sort-GL000220.1_0_161802-prep-prealign.bam -R /packages/bcbio/0.7.4/genomes/Hsapiens/GRCh37/seq/GRCh37.fa -targetIntervals /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/1_130327_iplat-sort-GL000220.1_0_161802-prep-prealign-realign.intervals -L GL000220.1:1-161802 -U LENIENT_VCF_PROCESSING --read_filter BadCigar --read_filter NotPrimaryAlignment -o /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/tx/tmpaPXXEH/1_130327_iplat-sort-GL000220.1_0_161802-prep.bam INFO 18:23:16,081 HelpFormatter - Date/Time: 2014/01/13 18:23:16 INFO 18:23:16,081 HelpFormatter - -------------------------------------------------------------------------------- INFO 18:23:16,081 HelpFormatter - -------------------------------------------------------------------------------- INFO 18:23:16,174 GenomeAnalysisEngine - Strictness is SILENT INFO 18:23:16,243 GenomeAnalysisEngine - Downsampling Settings: No downsampling INFO 18:23:16,251 SAMDataSource$SAMReaders - Initializing SAMRecords in serial INFO 18:23:16,271 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02 INFO 18:23:16,284 IntervalUtils - Processing 161802 bp from intervals INFO 18:23:16,344 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files INFO 18:23:16,365 GenomeAnalysisEngine - Done preparing for traversal INFO 18:23:16,365 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] INFO 18:23:16,366 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining WARN 18:23:17,237 RestStorageService - Error Response: PUT '/RXXlgQ8NSXRyJZzH0dxwjE35AEjoMJHf.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 902, Content-MD5: AmfJaZ8nicUX5P4bViGM7Q==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: 0267c9699f2789c517e4fe1b56218ced, Date: Tue, 14 
Jan 2014 01:23:16 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:Ka4w8QGVkRPQYeL0smdCJT3i+UY=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: B3DBD1C15ACEB0E8, x-amz-id-2: MvMV1tuq8sb4jArQZ9aRYM4ZeND7Q1d6ACs2n1H7hZte6IylNtos6TTRtXSk94Ny, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 18:33:17 GMT, Connection: close, Server: AmazonS3] WARN 18:23:17,408 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24600 seconds. Retrying connection. INFO 18:23:17,898 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 2.8-1-g932cd3a):
ERROR
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR
ERROR MESSAGE: Couldn't read file /scratch/jpeden/test_whole_genome/work/bamprep/NA12878/GL000220.1/1_130327_iplat-sort-GL000220.1_0_161802-prep-prealign-realign.intervals because The interval file does not exist.
ERROR ------------------------------------------------------------------------------------------

' returned non-zero exit status 1
An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (8, 0))

[2014-01-13 18:23] INFO 18:23:42,798 ProgressMeter - GL000199.1:57471 1.00e+05 90.0 s 15.0 m 33.8% 4.4 m 2.9 m [2014-01-13 18:24] INFO 18:24:12,799 ProgressMeter - GL000199.1:108317 5.00e+05 120.0 s 4.0 m 63.8% 3.1 m 68.0 s [2014-01-13 18:24] INFO 18:24:42,921 ProgressMeter - GL000199.1:154472 7.00e+05 2.5 m 3.6 m 90.9% 2.7 m 14.0 s [2014-01-13 18:24] INFO 18:24:52,549 ProgressMeter - done 8.91e+05 2.7 m 3.0 m 100.0% 2.7 m 0.0 s [2014-01-13 18:24] INFO 18:24:52,549 ProgressMeter - Total runtime 159.81 secs, 2.66 min, 0.04 hours [2014-01-13 18:24] INFO 18:24:52,626 MicroScheduler - 0 reads were filtered out during the traversal out of approximately 891168 total reads (0.00%) [2014-01-13 18:24] INFO 18:24:52,626 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter [2014-01-13 18:24] INFO 18:24:52,627 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter [2014-01-13 18:24] INFO 18:24:52,627 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter [2014-01-13 18:24] WARN 18:24:53,575 RestStorageService - Error Response: PUT '/iqQvsKpPNw4CnvplNt0FY2CcItGVfzYS.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 401, Content-MD5: EcqbTxXk4p3TsbtZpNFvLw==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: 11ca9b4f15e4e29dd3b1bb59a4d16f2f, Date: Tue, 14 Jan 2014 01:24:52 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:9DneksJJ8YRbZxDDcWh2KAJRFsY=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-279.el6.x86_64; amd64; en; JVM 1.7.0_45), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: 9B77F9B9BF1E22B6, x-amz-id-2: 7P0jlGT21lreOzaI5pBrc/y5K5pgw8FEco6fXFu11aBJxYFDjtp681//ynKanriu, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Mon, 13 Jan 2014 18:34:53 GMT, Connection: close, Server: AmazonS3] [2014-01-13 18:24] WARN 18:24:53,810 RestStorageService - Adjusted time offset in response to 
RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately -24599 seconds. Retrying connection.
[2014-01-13 18:24] INFO 18:24:54,321 GATKRunReport - Uploaded run statistics report to AWS S3
[2014-01-13 18:24] Index BAM file: 1_130327_iplat-sort-GL000199.1_0_169874-prep.bam
Traceback (most recent call last):
  File "/packages/bcbio/0.7.4/bin/bcbio_nextgen.py", line 54, in
    main(kwargs)
  File "/packages/bcbio/0.7.4/bin/bcbio_nextgen.py", line 38, in main
    run_main(kwargs)
  File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 44, in run_main
    fc_dir, run_info_yaml)
  File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 92, in _run_toplevel
    for xs in pipeline.run(config, config_file, run_parallel, parallel, dirs, pipeline_items):
  File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 315, in run
    samples = region.parallel_prep_region(samples, regions, run_parallel)
  File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/bcbio/pipeline/region.py", line 69, in parallel_prep_region
    "piped_bamprep", None, file_key, ["config"])
  File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/bcbio/distributed/split.py", line 77, in parallel_split_combine
    split_output = parallel_fn(parallel_name, split_args)
  File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/bcbio/distributed/messaging.py", line 51, in run_parallel
    for data in joblib.Parallel(jobr.num_jobs)(joblib.delayed(fn)(x) for x in items):
  File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 519, in call
    self.retrieve()
  File "/packages/bcbio/0.7.4/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 450, in retrieve
    raise exception_type(report)
TypeError: init() takes at least 3 arguments (2 given)

chapmanb commented 10 years ago

Thanks for the problem report. This is not expected: I re-ran a clean test case with GATK 2.8.1 locally and could not reproduce it, so we'll have to dig a little further. The key issue appears to be that this file is never generated by GATK's RealignerTargetCreator:

bamprep/NA12878/GL000225.1/1_130327_iplat-sort-GL000225.1_0_211173-prep-prealign-realign.intervals

I'm not sure why you wouldn't see an error earlier, when this file should have been created. If you remove that region's directory (rm -rf bamprep/NA12878/GL000225.1/) and then run non-distributed on a single core (bcbio_nextgen.py ../config/NA12878-illumina.yaml -n 1), does it still fail?
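To see which regions are affected before cleaning up, here is a minimal sketch that lists pre-alignment BAMs with no matching interval file next to them. It is not part of bcbio: `find_missing_intervals` is a hypothetical helper, and the naming pattern (`*-prealign.bam` alongside `*-prealign-realign.intervals`) is an assumption inferred from the paths in the log above.

```python
import os


def find_missing_intervals(bamprep_dir):
    """Return pre-alignment BAMs that lack a sibling realign intervals file.

    Assumption (inferred from the log paths, not from bcbio's code): each
    '<name>-prealign.bam' should sit next to '<name>-prealign-realign.intervals'.
    """
    missing = []
    for root, _dirs, files in os.walk(bamprep_dir):
        for fname in files:
            if fname.endswith("-prealign.bam"):
                # Swap the '.bam' suffix for the expected intervals suffix.
                expected = fname[:-len(".bam")] + "-realign.intervals"
                if expected not in files:
                    missing.append(os.path.join(root, fname))
    return sorted(missing)
```

Calling `find_missing_intervals("bamprep")` from the work directory should return the BAM paths for regions such as GL000225.1 and GL000220.1 if their interval files really are absent. An empty list would suggest the files were created but removed or moved later.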

If you want to run smaller tests, the test suite runs fairly quickly on small datasets:

https://bcbio-nextgen.readthedocs.org/en/latest/contents/testing.html#test-suite

Sorry not to have a definite answer, but hopefully one of these helps isolate the issue.