lindenb / jvarkit

Java utilities for Bioinformatics
https://jvarkit.readthedocs.io/

Cannot use .fna reference file #181

Closed: togop closed this issue 3 years ago

togop commented 3 years ago

Subject of the issue

Running wgscoverageplotter.jar on a .bam file aligned against a reference genome in the .fna file format does not work.

Your environment

Steps to reproduce

My command is something like this:

java -jar /path/to/wgscoverageplotter.jar -dimension 1500x500 -C -1 --clip -R /home/ubuntu/ref_genomes/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna my_minimap2.coordsorted.bam --include-contig-regex "chr(\d+|X|Y|M)$" --percentile median > my_coverage.svg

The reference genome was downloaded from: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz

returns exception:

[SEVERE][WGSCoveragePlotter]Could not find dictionary next to reference file file:///home/ubuntu/ref_genomes/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
htsjdk.samtools.SAMException: Could not find dictionary next to reference file file:///home/ubuntu/ref_genomes/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
    at htsjdk.variant.utils.SAMSequenceDictionaryExtractor$TYPE$1.extractDictionary(SAMSequenceDictionaryExtractor.java:58)
    at htsjdk.variant.utils.SAMSequenceDictionaryExtractor.extractDictionary(SAMSequenceDictionaryExtractor.java:170)
    at com.github.lindenb.jvarkit.util.bio.SequenceDictionaryUtils.extractRequired(SequenceDictionaryUtils.java:178)
    at com.github.lindenb.jvarkit.tools.bam2graphics.WGSCoveragePlotter.doWork(WGSCoveragePlotter.java:289)
    at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMain(Launcher.java:796)
    at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMainWithExit(Launcher.java:959)
    at com.github.lindenb.jvarkit.tools.bam2graphics.WGSCoveragePlotter.main(WGSCoveragePlotter.java:610)
[INFO][Launcher]wgscoverageplotter Exited with failure (-1)

Expected behaviour

The program should produce the expected plots.

Actual behaviour

The program crashes with the above exception.

lindenb commented 3 years ago

Could not find dictionary next to reference file file:///home/ubuntu/ref_genomes/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna

As stated in the documentation, the reference must be associated with a *.dict file:

http://lindenb.github.io/jvarkit/WGSCoveragePlotter.html

  * -R, --reference
      Indexed fasta Reference file. This file must be indexed with samtools 
      faidx and with picard CreateSequenceDictionary
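
The two indexing steps named in the documentation could look roughly like the sketch below. The function names and the PICARD_JAR variable are hypothetical and not part of jvarkit. The sketch assumes an uncompressed FASTA, an index named <reference>.fai, and a dictionary whose name replaces the FASTA extension with .dict (for example genome.fna gives genome.dict), which is where htsjdk is expected to look.

```shell
#!/bin/sh

# Hypothetical helper: print the companion files that are missing
# for a FASTA reference (nothing is printed when both exist).
# Assumes <ref>.fai for the index and <ref without extension>.dict
# for the sequence dictionary.
missing_reference_indexes() {
    ref="$1"
    dict="${ref%.*}.dict"
    [ -f "${ref}.fai" ] || echo "${ref}.fai"
    [ -f "${dict}" ] || echo "${dict}"
}

# Hypothetical helper: build both companion files.
# Requires samtools on the PATH and PICARD_JAR pointing at a local
# picard.jar (an assumption; adapt to how picard is installed).
index_reference() {
    ref="$1"
    samtools faidx "${ref}" &&
    java -jar "${PICARD_JAR}" CreateSequenceDictionary \
        R="${ref}" O="${ref%.*}.dict"
}
```

Calling missing_reference_indexes on the reference path before launching wgscoverageplotter.jar shows whether index_reference still needs to be run.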

togop commented 3 years ago

Thanks, that solved the issue