broadinstitute / picard

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
https://broadinstitute.github.io/picard/
MIT License

CollectGcBiasMetrics error in current picard version #288

Closed: tpshea2 closed this issue 9 years ago

tpshea2 commented 9 years ago

With the current Picard release (1.878, invoked as picard.jar CollectGcBiasMetrics), I get the error reported below. It occurs with both Java 1.6.0 and Java 1.8. An older Picard (1.78, invoked directly as CollectGcBiasMetrics.jar) runs successfully on the same input, which indicates that the aligned BAM file is acceptable at least to the older version.

Is this a bug in the current version, or does this particular tool have a new BAM-specific input requirement? I am using other current Picard tools on the same BAM and they do not report this error.

[Mon Sep 14 14:46:43 EDT 2015] picard.analysis.CollectGcBiasMetrics CHART_OUTPUT=read_qc/metrics_data/Smallfrag-PCRfree-Ecoli-dmc.ref_aligned.bam.gc_bias.pdf INPUT=Smallfrag-PCRfree-Ecoli-dmc.ref_aligned.bam OUTPUT=read_qc/metrics_data/Smallfrag-PCRfree-Ecoli-dmc.ref_aligned.bam.gc_bias.metrics VALIDATION_STRINGENCY=SILENT REFERENCE_SEQUENCE=/btl/projects/SSF/Development/assembly/illumina_assembly_data/andrea_ssf_training/read_analysis/aligned_bams/Escherichia_coli_K12_MG1655.fasta WINDOW_SIZE=100 MINIMUM_GENOME_FRACTION=1.0E-5 IS_BISULFITE_SEQUENCED=false METRIC_ACCUMULATION_LEVEL=[ALL_READS] ASSUME_SORTED=true STOP_AFTER=0 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Mon Sep 14 14:46:43 EDT 2015] Executing as tshea@dunkel on Linux 2.6.39 amd64; Java HotSpot(TM) 64-Bit Server VM 1.6.0_35-b10; Picard version: 1.878(89618e408692ff6288c7c880658f32f16fcbec53_1441135673) IntelDeflater
INFO 2015-09-14 14:46:48 SinglePassSamProgram Processed 1,000,000 records. Elapsed time: 00:00:05s. Time for last 1,000,000: 5s. Last read position: ecoli_K12_MG12655:597,747
INFO 2015-09-14 14:46:53 SinglePassSamProgram Processed 2,000,000 records. Elapsed time: 00:00:09s. Time for last 1,000,000: 4s. Last read position: ecoli_K12_MG12655:1,189,549
INFO 2015-09-14 14:46:57 SinglePassSamProgram Processed 3,000,000 records. Elapsed time: 00:00:13s. Time for last 1,000,000: 4s. Last read position: ecoli_K12_MG12655:1,750,372
INFO 2015-09-14 14:47:02 SinglePassSamProgram Processed 4,000,000 records. Elapsed time: 00:00:18s. Time for last 1,000,000: 4s. Last read position: ecoli_K12_MG12655:2,326,249
INFO 2015-09-14 14:47:06 SinglePassSamProgram Processed 5,000,000 records. Elapsed time: 00:00:22s. Time for last 1,000,000: 4s. Last read position: ecoli_K12_MG12655:2,930,170
INFO 2015-09-14 14:47:10 SinglePassSamProgram Processed 6,000,000 records. Elapsed time: 00:00:26s. Time for last 1,000,000: 4s. Last read position: ecoli_K12_MG12655:3,535,153
INFO 2015-09-14 14:47:14 SinglePassSamProgram Processed 7,000,000 records. Elapsed time: 00:00:31s. Time for last 1,000,000: 4s. Last read position: ecoli_K12_MG12655:4,135,983
INFO 2015-09-14 14:47:19 SinglePassSamProgram Processed 8,000,000 records. Elapsed time: 00:00:35s. Time for last 1,000,000: 4s. Last read position: */*
[Mon Sep 14 14:47:21 EDT 2015] picard.analysis.CollectGcBiasMetrics done. Elapsed time: 0.63 minutes.
Runtime.totalMemory()=2123956224
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" java.lang.NullPointerException
    at java.io.FileOutputStream.<init>(FileOutputStream.java:186)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:145)
    at java.io.FileWriter.<init>(FileWriter.java:73)
    at htsjdk.samtools.metrics.MetricsFile.write(MetricsFile.java:134)
    at picard.analysis.CollectGcBiasMetrics.finish(CollectGcBiasMetrics.java:171)
    at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:133)
    at picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:53)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:206)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

dekling commented 9 years ago

Not sure about this... Can you try setting VALIDATION_STRINGENCY=STRICT?

tpshea2 commented 9 years ago

Thank you for your assistance.

Here is the command and its output when using STRICT:

java -jar /seq/software/picard/current/bin/picard.jar CollectGcBiasMetrics CHART_OUTPUT=test.pdf INPUT=Smallfrag-PCRfree-Ecoli-dmc.ref_aligned.bam OUTPUT=test.metrics VALIDATION_STRINGENCY=STRICT REFERENCE_SEQUENCE=Escherichia_coli_K12_MG1655.fasta WINDOW_SIZE=100

[Thu Sep 17 12:23:50 EDT 2015] picard.analysis.CollectGcBiasMetrics CHART_OUTPUT=test.pdf WINDOW_SIZE=100 INPUT=Smallfrag-PCRfree-Ecoli-dmc.ref_aligned.bam OUTPUT=test.metrics VALIDATION_STRINGENCY=STRICT REFERENCE_SEQUENCE=Escherichia_coli_K12_MG1655.fasta MINIMUM_GENOME_FRACTION=1.0E-5 IS_BISULFITE_SEQUENCED=false METRIC_ACCUMULATION_LEVEL=[ALL_READS] ASSUME_SORTED=true STOP_AFTER=0 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Thu Sep 17 12:23:50 EDT 2015] Executing as tshea@stout on Linux 2.6.32-504.23.4.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.6.0_29-b11; Picard version: 1.878(89618e408692ff6288c7c880658f32f16fcbec53_1441135673) IntelDeflater
INFO 2015-09-17 12:23:59 SinglePassSamProgram Processed 1,000,000 records. Elapsed time: 00:00:07s. Time for last 1,000,000: 7s. Last read position: ecoli_K12_MG12655:597,747
INFO 2015-09-17 12:24:05 SinglePassSamProgram Processed 2,000,000 records. Elapsed time: 00:00:14s. Time for last 1,000,000: 6s. Last read position: ecoli_K12_MG12655:1,189,549
INFO 2015-09-17 12:24:12 SinglePassSamProgram Processed 3,000,000 records. Elapsed time: 00:00:20s. Time for last 1,000,000: 6s. Last read position: ecoli_K12_MG12655:1,750,372
INFO 2015-09-17 12:24:18 SinglePassSamProgram Processed 4,000,000 records. Elapsed time: 00:00:26s. Time for last 1,000,000: 6s. Last read position: ecoli_K12_MG12655:2,326,249
INFO 2015-09-17 12:24:24 SinglePassSamProgram Processed 5,000,000 records. Elapsed time: 00:00:33s. Time for last 1,000,000: 6s. Last read position: ecoli_K12_MG12655:2,930,170
INFO 2015-09-17 12:24:31 SinglePassSamProgram Processed 6,000,000 records. Elapsed time: 00:00:39s. Time for last 1,000,000: 6s. Last read position: ecoli_K12_MG12655:3,535,153
INFO 2015-09-17 12:24:37 SinglePassSamProgram Processed 7,000,000 records. Elapsed time: 00:00:46s. Time for last 1,000,000: 6s. Last read position: ecoli_K12_MG12655:4,135,983
INFO 2015-09-17 12:24:43 SinglePassSamProgram Processed 8,000,000 records. Elapsed time: 00:00:52s. Time for last 1,000,000: 6s. Last read position: */*
[Thu Sep 17 12:24:46 EDT 2015] picard.analysis.CollectGcBiasMetrics done. Elapsed time: 0.93 minutes.
Runtime.totalMemory()=2128150528
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" java.lang.NullPointerException
    at java.io.FileOutputStream.<init>(FileOutputStream.java:186)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:145)
    at java.io.FileWriter.<init>(FileWriter.java:73)
    at htsjdk.samtools.metrics.MetricsFile.write(MetricsFile.java:134)
    at picard.analysis.CollectGcBiasMetrics.finish(CollectGcBiasMetrics.java:171)
    at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:133)
    at picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:53)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:206)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)


yfarjoun commented 9 years ago

@kbergin , could you take a look?

kbergin commented 9 years ago

Hello!

In addition to specifying CHART_OUTPUT and OUTPUT, one also needs to specify a file name for SUMMARY_OUTPUT as an argument.

I think that will fix your issue. I will write clearer exception handling for this.

Thanks!

kbergin commented 9 years ago

I'm not sure why the command would work with an older version of CollectGcBiasMetrics if that is the issue, so let me know if that doesn't fix it.

kbergin commented 9 years ago

Looking back: SUMMARY_OUTPUT is actually an optional argument, and the old code checked whether it was null before trying to write to the file. That check did not get carried over in the refactoring. That's my mistake, thank you for catching it!
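
For readers following along, here is a minimal sketch of the failure mode and of the kind of null check described above. It is not Picard's actual code or the actual fix: the class name OptionalOutputSketch and both method names are hypothetical, chosen only for illustration. The one thing it relies on from the JDK is that constructing a FileWriter from a null File throws a NullPointerException, which matches the top frames of the stack trace.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

// Illustrative only: hypothetical names, not taken from the Picard code base.
public class OptionalOutputSketch {

    // Writes content only when a target file was supplied.
    // Returns true if the file was written, false if the optional output was skipped.
    static boolean writeIfRequested(File target, String content) throws IOException {
        if (target == null) {
            return false; // optional argument not given: skip instead of failing
        }
        try (FileWriter writer = new FileWriter(target)) {
            writer.write(content);
        }
        return true;
    }

    // Reproduces the unguarded case: a null File handed straight to FileWriter.
    static boolean unguardedWriteThrowsNpe() throws IOException {
        try {
            new FileWriter((File) null).close();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        File summary = File.createTempFile("summary", ".metrics");
        summary.deleteOnExit();
        System.out.println("written with a file: " + writeIfRequested(summary, "example\n"));
        System.out.println("written with null:   " + writeIfRequested(null, "example\n"));
        System.out.println("unguarded null throws NPE: " + unguardedWriteThrowsNpe());
    }
}
```

Until a fixed release is available, the workaround reported in this thread is to pass SUMMARY_OUTPUT explicitly so the file argument is never null.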

tpshea2 commented 9 years ago

Thank you for the note on usage. I see that it now does run when providing SUMMARY_OUTPUT.


kbergin commented 9 years ago

Great!

kbergin commented 9 years ago

This bug is being addressed in PR #295 and will be merged in upon code review. Will close after that.

kbergin commented 9 years ago

This has been merged in from PR #295.