samtools / htsjdk

A Java API for high-throughput sequencing data (HTS) formats.
http://samtools.github.io/htsjdk/
283 stars 242 forks source link

Improve exception message for unset VCF output type #1357

Closed clintval closed 5 years ago

clintval commented 5 years ago

Description

I, and fellow developers, hit this exception when we've accidentally used an improper VCF extension in an output file path (our most common mistake is using .gvcf). I think the exception message needs to be improved so it is clear that Picard is dynamically changing output file type based on the extension of a file path in most cases.

I think the exception bites new users and could be improved so the issue is easier to resolve.

What do you think? Happy to amend!

Current behavior:

❯ picard VcfFormatConverter I=/dev/stdin O=demo.gvcf REQUIRE_INDEX=false

[Sun Apr 14 10:14:39 PDT 2019] picard.vcf.VcfFormatConverter INPUT=/dev/stdin OUTPUT=demo.gvcf REQUIRE_INDEX=false VALIDATION_STRINGENCY=SILENT CREATE_INDEX=true    VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Sun Apr 14 10:14:39 PDT 2019] Executing as cvalentine@cvalentine.local on Mac OS X 10.14.3 x86_64; OpenJDK 64-Bit Server VM 1.8.0_152-release-1056-b12; Deflater: Intel; Inflater: Intel; Picard version: 2.10.6-SNAPSHOT
[Sun Apr 14 10:14:39 PDT 2019] picard.vcf.VcfFormatConverter done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=514850816
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" java.lang.IllegalArgumentException: Must specify file or stream output type.
        at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.build(VariantContextWriterBuilder.java:423)
        at picard.vcf.VcfFormatConverter.doWork(VcfFormatConverter.java:114)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:228)
        at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:94)
        at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:104)

Checklist

codecov-io commented 5 years ago

Codecov Report

Merging #1357 into master will increase coverage by 0.01%. The diff coverage is 100%.

@@              Coverage Diff               @@
##              master     #1357      +/-   ##
==============================================
+ Coverage     67.848%   67.858%   +0.01%     
- Complexity      8283      8284       +1     
==============================================
  Files            563       563              
  Lines          33706     33707       +1     
  Branches        5657      5657              
==============================================
+ Hits           22869     22873       +4     
+ Misses          8659      8657       -2     
+ Partials        2178      2177       -1
Impacted Files Coverage Δ Complexity Δ
...antcontext/writer/VariantContextWriterBuilder.java 82.803% <100%> (+0.11%) 60 <0> (ø) :arrow_down:
...samtools/util/AsyncBlockCompressedInputStream.java 76% <0%> (+4%) 13% <0%> (+1%) :arrow_up:
clintval commented 5 years ago

Thanks @yfarjoun! I added a list of valid VCF extensions. Here is how the message renders:

Exception in thread "main" java.lang.IllegalArgumentException: Output format type is not set, or could not be inferred from the output path. If a path was used, does it have a valid VCF extension (.vcf, .vcf.gz, .bcf)?

I chose not to include the output filepath in the exception message because there can be many valid reasons why the output type is left unspecified and it may not be because of an output path.

I chose to suggest to the user that the extension of the file path may be the issue, since that is the most common case in my experience.

For example:

  1. A VariantContextBuilder can have the output type set programmatically, overriding any file extension suffix (if supplied).

https://github.com/samtools/htsjdk/blob/e8229de66d1916dc4d9e0f8e6603fc1f0101245e/src/main/java/htsjdk/variant/variantcontext/writer/VariantContextWriterBuilder.java#L98-L101

  1. The values for outStream and outPath can be null:

https://github.com/samtools/htsjdk/blob/e8229de66d1916dc4d9e0f8e6603fc1f0101245e/src/main/java/htsjdk/variant/variantcontext/writer/VariantContextWriterBuilder.java#L172-L177

https://github.com/samtools/htsjdk/blob/e8229de66d1916dc4d9e0f8e6603fc1f0101245e/src/main/java/htsjdk/variant/variantcontext/writer/VariantContextWriterBuilder.java#L214-L219