tseemann / snippy

:scissors: :zap: Rapid haploid variant calling and core genome alignment
GNU General Public License v2.0

.gb file not working as the reference, and with a FASTA reference the FTYPE, STRAND, NT_POS, AA_POS, EFFECT, LOCUS_TAG, GENE and PRODUCT columns are empty #546

Open absartalat opened 1 year ago

absartalat commented 1 year ago

I installed snippy v4.6.0 and SnpEff 5.1d (build 2022-04-19 15:49). When the reference file is a .gb file, I get the following error:

[19:30:07] This is snippy 4.6.0
[19:30:07] Written by Torsten Seemann
[19:30:07] Obtained from https://github.com/tseemann/snippy
[19:30:07] Detected operating system: linux
[19:30:07] Enabling bundled linux tools.
[19:30:07] Found bwa - /home/absar_ubuntu/anaconda3/envs/snippy/bin/bwa
[19:30:07] Found bcftools - /home/absar_ubuntu/anaconda3/envs/snippy/bin/bcftools
[19:30:07] Found samtools - /home/absar_ubuntu/anaconda3/envs/snippy/bin/samtools
[19:30:07] Found java - /home/absar_ubuntu/anaconda3/envs/snippy/bin/java
[19:30:07] Found snpEff - /home/absar_ubuntu/anaconda3/envs/snippy/bin/snpEff
[19:30:07] Found samclip - /home/absar_ubuntu/anaconda3/envs/snippy/bin/samclip
[19:30:07] Found seqtk - /home/absar_ubuntu/anaconda3/envs/snippy/bin/seqtk
[19:30:07] Found parallel - /home/absar_ubuntu/anaconda3/envs/snippy/bin/parallel
[19:30:07] Found freebayes - /home/absar_ubuntu/anaconda3/envs/snippy/bin/freebayes
[19:30:07] Found freebayes-parallel - /home/absar_ubuntu/anaconda3/envs/snippy/bin/freebayes-parallel
[19:30:07] Found fasta_generate_regions.py - /home/absar_ubuntu/anaconda3/envs/snippy/bin/fasta_generate_regions.py
[19:30:07] Found vcfstreamsort - /home/absar_ubuntu/anaconda3/envs/snippy/bin/vcfstreamsort
[19:30:07] Found vcfuniq - /home/absar_ubuntu/anaconda3/envs/snippy/bin/vcfuniq
[19:30:07] Found vcffirstheader - /home/absar_ubuntu/anaconda3/envs/snippy/bin/vcffirstheader
[19:30:07] Found gzip - /usr/bin/gzip
[19:30:07] Found vt - /home/absar_ubuntu/anaconda3/envs/snippy/bin/vt
[19:30:07] Found snippy-vcf_to_tab - /home/absar_ubuntu/anaconda3/envs/snippy/bin/snippy-vcf_to_tab
[19:30:07] Found snippy-vcf_report - /home/absar_ubuntu/anaconda3/envs/snippy/bin/snippy-vcf_report
[19:30:07] Checking version: samtools --version is >= 1.7 - ok, have 1.16
[19:30:07] Checking version: bcftools --version is >= 1.7 - ok, have 1.16
[19:30:07] Checking version: freebayes --version is >= 1.1 - ok, have 1.3.6
[19:30:07] Checking version: snpEff -version is >= 4.3 - ok, have 5.1
[19:30:07] Checking version: bwa is >= 0.7.12 - ok, have 0.7.17
[19:30:07] Using reference: /home/absar_ubuntu/NZ_CP042858.1.gb
[19:30:07] Treating reference as 'genbank' format.
[19:30:07] Will use 4 CPU cores.
[19:30:07] Using read file: /home/absar_ubuntu/WGS_fasta/1.fna
[19:30:07] Used --force, will overwrite existing SNP_rough1
[19:30:07] Changing working directory: SNP_rough1
[19:30:07] Creating reference folder: reference
[19:30:07] Extracting FASTA and GFF from reference.
[19:30:11] Wrote 1 sequences to ref.fa
[19:30:11] Wrote 5799 features to ref.gff
[19:30:11] Shredding /home/absar_ubuntu/WGS_fasta/1.fna into pseudo-reads
[19:30:14] Wrote 466874 fake 250bp reads (20x, stride 25bp) to fake_reads.fq
[19:30:14] Creating reference/snpeff.config
[19:30:14] Freebayes will process 7 chunks of 842934 bp, 4 chunks at a time.
[19:30:14] Using BAM RG (Read Group) ID: SNP_rough1
[19:30:14] Running: samtools faidx reference/ref.fa 2>> snps.log
[19:30:14] Running: bwa index reference/ref.fa 2>> snps.log
[19:30:16] Running: mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa 2>> snps.log
[19:30:16] Running: ln -sf reference/ref.fa . 2>> snps.log
[19:30:16] Running: ln -sf reference/ref.fa.fai . 2>> snps.log
[19:30:16] Running: mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz 2>> snps.log
[19:30:16] Running: snpEff build -c reference/snpeff.config -dataDir . -gff3 ref 2>> snps.log
[19:30:21] Error running command, check SNP_rough1/snps.log

Using a .fasta file as the reference gives empty annotation columns in the result.
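To confirm how many variants are affected, the empty columns can be counted directly. This is only a rough sketch: the function name `count_unannotated` is made up for illustration, and it assumes the result is a tab-separated file with a header row that contains an `FTYPE` column (as in the column list from the issue title).

```shell
# Hypothetical helper (not part of snippy): count data rows whose FTYPE
# column is empty in a tab-separated file with a header row.
# Assumption: the header contains a column named exactly "FTYPE".
count_unannotated() {
  awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "FTYPE") col = i; next }
    col && $col == "" { n++ }   # only counts when the FTYPE column was found
    END { print n + 0 }         # prints 0 if nothing matched
  ' "$1"
}
```

It could be called as `count_unannotated SNP_rough1/snps.tab`, where the path is an assumption based on the output directory named in the log.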

famiji commented 1 year ago

I am also running into this problem. Is there any way to solve it? If so, please let me know.

absartalat commented 1 year ago

@famiji When you download the GenBank version of the reference file from NCBI, select the 'show GI' option too. That solved the error. Hope this helps!

absartalat commented 1 year ago

@famiji If you are still getting the problem, you can try downgrading snpEff to version 5.0. First check your snpEff version:

    snpEff -version

If it is 5.1d, there are some incompatibility issues with snippy 4.6.0. Downgrading snpEff to 5.0 worked for me. Commands:

    conda activate Environment_name
    conda install snpeff=5.0

Hope this solves your issue.

stanikae commented 11 months ago

This solution worked for me, but it might be necessary to include the channel when downgrading to v5.0:

    conda install -c bioconda snpeff=5.0
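The check-then-downgrade steps from this thread can be sketched as a small shell helper. The function name `snpeff_needs_downgrade` is made up for illustration. Treating every 5.1.x release as affected is an assumption, since only 5.1d was reported here.

```shell
# Hypothetical helper: succeed (exit 0) when the given snpEff version
# string is in the 5.1 series. Only 5.1d was reported in this thread;
# matching all of 5.1* is an assumption.
snpeff_needs_downgrade() {
  case "$1" in
    5.1*) return 0 ;;
    *)    return 1 ;;
  esac
}

# Possible usage inside the conda environment that holds snippy.
# The layout of the `snpEff -version` output (version in the second
# field) is an assumption; adjust the awk field if it differs.
#   ver=$(snpEff -version 2>&1 | awk '{print $2}')
#   if snpeff_needs_downgrade "$ver"; then
#     conda install -c bioconda snpeff=5.0
#   fi
```

The version test is kept separate from the install command so that it can be run on its own before anything in the environment is changed.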