tseemann / nullarbor

:floppy_disk: :page_with_curl: "Reads to report" for public health and clinical microbiology
GNU General Public License v2.0

[fai_fetch_seq] Error: fai_fetch failed. #294

Closed by spencer411 2 years ago

spencer411 commented 2 years ago

I get the following error when running the pipeline, and as a result no SNPs are called. I installed with conda and set everything up as described in the usage instructions. Any insight into what is going on would be appreciated.

[15:28:30] Running: bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - | bcftools annotate --remove '^INFO/TYPE,^INFO/DP,^INFO/RO,^INFO/AO,^INFO/AB,^FORMAT/GT,^FORMAT/DP,^FORMAT/RO,^FORMAT/AO,^FORMAT/QR,^FORMAT/QA,^FORMAT/GL' > snps.filt.vcf 2>> snps.log

normalize v0.5

options:     input VCF file                                  -
         [o] output VCF file                                 -
         [w] sorting window size                             10000
         [n] no fail on reference inconsistency for non SNPs false
         [q] quiet                                           false
         [d] debug                                           false
         [r] reference FASTA file                            reference/ref.fa

[fai_fetch_seq] Error: fai_fetch failed. (Seeking in a compressed, .gzi unindexed, file?)
[variant_manip.cpp:67 is_ref_consistent] failure to extract base from fasta file: NZ_CP012480.1:3058-3072
FAQ: http://genome.sph.umich.edu/wiki/Vt#1._vt_cannot_retrieve_sequences_from_my_reference_sequence_file

I can certainly provide more of the log file if that is helpful.

spencer411 commented 2 years ago

I figured it out. I think the two periods in the reference file name were causing the problem, though I am not sure why. I renamed the reference from NZ_CP012480.1.fna to NZ_CP012480_1.fna and everything ran smoothly.

Posting this in case anyone else has the same issue.
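
For anyone who wants to script the rename described above, here is a minimal Python sketch. It only automates the workaround from this thread (replacing every period in the file name except the one before the extension with an underscore). It assumes the extra period really is the trigger, which this thread reports from a single observation and does not explain. The function names `single_dot_name` and `rename_reference` are hypothetical and are not part of nullarbor, vt, or bcftools.

```python
from pathlib import Path


def single_dot_name(filename: str) -> str:
    """Return filename with every period except the last replaced by '_'.

    The last period is kept because it separates the extension.
    """
    stem, dot, ext = filename.rpartition(".")
    if not dot:
        # No period at all, so there is nothing to change.
        return filename
    return stem.replace(".", "_") + "." + ext


def rename_reference(path: str) -> Path:
    """Rename the file at `path` on disk and return the new path.

    Only the file itself is renamed; any index files that sit next to it
    under the old name are left untouched.
    """
    src = Path(path)
    dst = src.with_name(single_dot_name(src.name))
    if dst != src:
        src.rename(dst)
    return dst


if __name__ == "__main__":
    # The file name from this thread.
    print(single_dot_name("NZ_CP012480.1.fna"))  # NZ_CP012480_1.fna
```

After renaming, point the pipeline at the new file name and rerun it from the start so that nothing derived from the old name is reused.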