Open Rathanin-github opened 4 years ago
Hi @Rathanin-github, I also got this issue with a recent re-install of snippy v4.6.0 within the last 2 weeks.
For me, the variants disappear right at this point:
bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa -
I used conda to install snippy, and the dependency [vt
] has very recently been updated (Anaconda release log) so I suspected that was the problem. My solution was to downgrade vt to an older, stable version and do a fresh install of snippy:
conda create --name snippy-env
conda install --name snippy-env -c bioconda/label/cf201901 vt==2015.11.10
conda install --name snippy-env -c bioconda snippy
Let me know if this fix is applicable to you, or if you can find a better solution!
Also having the same problem, but using older version (4.3.5). I have zero SNPS. I am attaching the log for reference.
This seems to be the error source:
Parse error at line 5 samtools sort: truncated file. Aborting [fputs] Broken pipe
[02:16:39] This is snippy 4.3.5 [02:16:39] Written by Torsten Seemann [02:16:39] Obtained from https://github.com/tseemann/snippy
[02:16:39] Found bwa - /miniconda3/envs/snippy/bin/bwa [02:16:39] Found bcftools - /miniconda3/envs/snippy/bin/bcftools [02:16:39] Found samtools -/miniconda3/envs/snippy/bin/samtools [02:16:39] Found java - /miniconda3/envs/snippy/bin/java [02:16:39] Found snpEff - /miniconda3/envs/snippy/bin/snpEff [02:16:39] Found samclip - /miniconda3/envs/snippy/bin/samclip [02:16:39] Found seqtk - /miniconda3/envs/snippy/bin/seqtk [02:16:39] Found parallel -/miniconda3/envs/snippy/bin/parallel [02:16:39] Found freebayes -miniconda3/envs/snippy/bin/freebayes [02:16:39] Found freebayes-parallel - /miniconda3/envs/snippy/bin/freebayes-parallel [02:16:39] Found fasta_generate_regions.py - /miniconda3/envs/snippy/bin/fasta_generate_regions.py [02:16:39] Found vcfstreamsort - /miniconda3/envs/snippy/bin/vcfstreamsort [02:16:39] Found vcfuniq - /miniconda3/envs/snippy/bin/vcfuniq [02:16:39] Found vcffirstheader -/miniconda3/envs/snippy/bin/vcffirstheader [02:16:39] Found gzip - /usr/bin/gzip [02:16:39] Found vt - /miniconda3/envs/snippy/bin/vt [02:16:39] Found snippy-vcf_to_tab - /miniconda3/envs/snippy/bin/snippy-vcf_to_tab [02:16:39] Found snippy-vcf_report - /miniconda3/envs/snippy/bin/snippy-vcf_report [02:16:39] Checking version: samtools --version is >= 1.7 - ok, have 1.9 [02:16:39] Checking version: bcftools --version is >= 1.7 - ok, have 1.9 [02:16:39] Checking version: freebayes --version is >= 1.1 - ok, have 1.2 [02:16:39] Checking version: snpEff -version is >= 4.3 - ok, have 4.3 [02:16:39] Using reference: /cogseq.fa.split/wuhan.gbk [02:16:39] Treating reference as 'genbank' format. [02:16:39] Will use 8 CPU cores. [02:16:39] Using read file: /fasta_COGS/cogseq.fa.split/England2013800422020.fa [02:16:39] Creating folder: England2013800422020 [02:16:39] Changing working directory: England2013800422020 [02:16:39] Creating reference folder: reference [02:16:39] Extracting FASTA and GFF from reference. [02:16:39] Wrote 1 sequences to ref.fa [02:16:39] Wrote 12 features to ref.gff [02:16:39] Shredding /cogseq.fa.split/England2013800422020.fa into pseudo-reads [02:16:39] Wrote 2434 fake 250bp reads (20x, stride 25bp) to fake_reads.fq [02:16:39] Creating reference/snpeff.config [02:16:39] Freebayes will process 15 chunks of 2032 bp, 8 chunks at a time. [02:16:39] Using BAM RG (Read Group) ID: England2013800422020 [02:16:39] Running: samtools faidx reference/ref.fa 2>> snps.log [02:16:39] Running: bwa index reference/ref.fa 2>> snps.log [02:16:39] Running: mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa 2>> snps.log [02:16:39] Running: ln -sf reference/ref.fa . 2>> snps.log [02:16:39] Running: ln -sf reference/ref.fa.fai . 2>> snps.log [02:16:39] Running: mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz 2>> snps.log [02:16:39] Running: snpEff build -c reference/snpeff.config -dataDir . -gff3 ref 2>> snps.log [02:16:41] Running: bwa mem -Y -M -R '@RG\tID:England2013800422020\tSM:England2013800422020' -t 8 reference/ref.fa fake_reads.fq | samclip --max 10 --ref reference/ref.fa.fai | samtools sort -n -l 0 -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. --threads 8 -m 1000M | samtools fixmate -m - - | samtools sort -l 0 -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. --threads 8 -m 1000M | samtools markdup -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. -r -s - - > snps.bam 2>> snps.log [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::process] read 2434 sequences (608500 bp)... [M::mem_process_seqs] Processed 2434 reads in 0.187 CPU sec, 0.027 real sec [samclip] samclip 0.2 by Torsten Seemann (@torstenseemann) [samclip] Loading: reference/ref.fa.fai [samclip] Found 1 sequences in reference/ref.fa.fai [E::sam_parse1] CIGAR and query sequence are of different length [W::sam_read1] Parse error at line 5 samtools sort: truncated file. Aborting [fputs] Broken pipe [02:16:41] Running: samtools index snps.bam 2>> snps.log [02:16:41] Running: fasta_generate_regions.py reference/ref.fa.fai 2032 > reference/ref.txt 2>> snps.log [02:16:42] Running: freebayes-parallel reference/ref.txt 8 -p 2 -P 0 -C 10 --min-repeat-entropy 1.5 --strict-vcf -q 13 -m 60 --min-coverage 10 -F 0.05 -f reference/ref.fa snps.bam > snps.raw.vcf 2>> snps.log [02:16:42] Running: bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - | bcftools annotate --remove '^INFO/TYPE,^INFO/DP,^INFO/RO,^INFO/AO,^INFO/AB,^FORMAT/GT,^FORMAT/DP,^FORMAT/RO,^FORMAT/AO,^FORMAT/QR,^FORMAT/QA,^FORMAT/GL' > snps.filt.vcf 2>> snps.log normalize v0.5
options: input VCF file - [o] output VCF file - [w] sorting window size 10000 [m] no fail on masked reference inconsistency false [n] no fail on reference inconsistency false [q] quiet false [d] debug false [r] reference FASTA file reference/ref.fa
stats: biallelic no. left trimmed : 0 no. right trimmed : 0 no. left and right trimmed : 0 no. right trimmed and left aligned : 0 no. left aligned : 0
total no. biallelic normalized : 0
multiallelic
no. left trimmed : 0
no. right trimmed : 0
no. left and right trimmed : 0
no. right trimmed and left aligned : 0
no. left aligned : 0
total no. multiallelic normalized : 0
total no. variants normalized : 0
total no. variants observed : 0
total no. reference observed : 0
Time elapsed: 0.00s
Hi Katherine Thank you so much for your solution. I can fix my problem by:
conda create --name snippy-env conda install --name snippy-env -c bioconda/label/cf201901 vt==2015.11.10 conda install --name snippy-env -c bioconda snippy
Best regards
Rathanin
On Tue, Jul 28, 2020 at 4:42 PM Darwin Bandoy notifications@github.com wrote:
Also having the same problem, but using older version (4.3.5). I have zero SNPS. I am attaching the log for reference.
This seems to be the error source:
Parse error at line 5 samtools sort: truncated file. Aborting [fputs] Broken pipe
[02:16:39] This is snippy 4.3.5 [02:16:39] Written by Torsten Seemann [02:16:39] Obtained from https://github.com/tseemann/snippy
[02:16:39] Found bwa - /miniconda3/envs/snippy/bin/bwa [02:16:39] Found bcftools - /miniconda3/envs/snippy/bin/bcftools [02:16:39] Found samtools -/miniconda3/envs/snippy/bin/samtools [02:16:39] Found java - /miniconda3/envs/snippy/bin/java [02:16:39] Found snpEff - /miniconda3/envs/snippy/bin/snpEff [02:16:39] Found samclip - /miniconda3/envs/snippy/bin/samclip [02:16:39] Found seqtk - /miniconda3/envs/snippy/bin/seqtk [02:16:39] Found parallel -/miniconda3/envs/snippy/bin/parallel [02:16:39] Found freebayes -miniconda3/envs/snippy/bin/freebayes [02:16:39] Found freebayes-parallel - /miniconda3/envs/snippy/bin/freebayes-parallel [02:16:39] Found fasta_generate_regions.py - /miniconda3/envs/snippy/bin/fasta_generate_regions.py [02:16:39] Found vcfstreamsort - /miniconda3/envs/snippy/bin/vcfstreamsort [02:16:39] Found vcfuniq - /miniconda3/envs/snippy/bin/vcfuniq [02:16:39] Found vcffirstheader -/miniconda3/envs/snippy/bin/vcffirstheader [02:16:39] Found gzip - /usr/bin/gzip [02:16:39] Found vt - /miniconda3/envs/snippy/bin/vt [02:16:39] Found snippy-vcf_to_tab - /miniconda3/envs/snippy/bin/snippy-vcf_to_tab [02:16:39] Found snippy-vcf_report - /miniconda3/envs/snippy/bin/snippy-vcf_report [02:16:39] Checking version: samtools --version is >= 1.7 - ok, have 1.9 [02:16:39] Checking version: bcftools --version is >= 1.7 - ok, have 1.9 [02:16:39] Checking version: freebayes --version is >= 1.1 - ok, have 1.2 [02:16:39] Checking version: snpEff -version is >= 4.3 - ok, have 4.3 [02:16:39] Using reference: /cogseq.fa.split/wuhan.gbk [02:16:39] Treating reference as 'genbank' format. [02:16:39] Will use 8 CPU cores. [02:16:39] Using read file: /fasta_COGS/cogseq.fa.split/England2013800422020.fa [02:16:39] Creating folder: England2013800422020 [02:16:39] Changing working directory: England2013800422020 [02:16:39] Creating reference folder: reference [02:16:39] Extracting FASTA and GFF from reference. [02:16:39] Wrote 1 sequences to ref.fa [02:16:39] Wrote 12 features to ref.gff [02:16:39] Shredding /cogseq.fa.split/England2013800422020.fa into pseudo-reads [02:16:39] Wrote 2434 fake 250bp reads (20x, stride 25bp) to fake_reads.fq [02:16:39] Creating reference/snpeff.config [02:16:39] Freebayes will process 15 chunks of 2032 bp, 8 chunks at a time. [02:16:39] Using BAM RG (Read Group) ID: England2013800422020 [02:16:39] Running: samtools faidx reference/ref.fa 2>> snps.log [02:16:39] Running: bwa index reference/ref.fa 2>> snps.log [02:16:39] Running: mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa 2>> snps.log [02:16:39] Running: ln -sf reference/ref.fa . 2>> snps.log [02:16:39] Running: ln -sf reference/ref.fa.fai . 2>> snps.log [02:16:39] Running: mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz 2>> snps.log [02:16:39] Running: snpEff build -c reference/snpeff.config -dataDir . -gff3 ref 2>> snps.log [02:16:41] Running: bwa mem -Y -M -R '@rg https://github.com/rg\tID:England2013800422020\tSM:England2013800422020' -t 8 reference/ref.fa fake_reads.fq | samclip --max 10 --ref reference/ref.fa.fai | samtools sort -n -l 0 -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. --threads 8 -m 1000M | samtools fixmate -m - - | samtools sort -l 0 -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. --threads 8 -m 1000M | samtools markdup -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. -r -s - - > snps.bam 2>> snps.log [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::process] read 2434 sequences (608500 bp)... [M::mem_process_seqs] Processed 2434 reads in 0.187 CPU sec, 0.027 real sec [samclip] samclip 0.2 by Torsten Seemann (@torstenseemann) [samclip] Loading: reference/ref.fa.fai [samclip] Found 1 sequences in reference/ref.fa.fai [E::sam_parse1] CIGAR and query sequence are of different length [W::sam_read1] Parse error at line 5 samtools sort: truncated file. Aborting [fputs] Broken pipe [02:16:41] Running: samtools index snps.bam 2>> snps.log [02:16:41] Running: fasta_generate_regions.py reference/ref.fa.fai 2032 > reference/ref.txt 2>> snps.log [02:16:42] Running: freebayes-parallel reference/ref.txt 8 -p 2 -P 0 -C 10 --min-repeat-entropy 1.5 --strict-vcf -q 13 -m 60 --min-coverage 10 -F 0.05 -f reference/ref.fa snps.bam > snps.raw.vcf 2>> snps.log [02:16:42] Running: bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - | bcftools annotate --remove '^INFO/TYPE,^INFO/DP,^INFO/RO,^INFO/AO,^INFO/AB,^FORMAT/GT,^FORMAT/DP,^FORMAT/RO,^FORMAT/AO,^FORMAT/QR,^FORMAT/QA,^FORMAT/GL'
snps.filt.vcf 2>> snps.log normalize v0.5
options: input VCF file - [o] output VCF file - [w] sorting window size 10000 [m] no fail on masked reference inconsistency false [n] no fail on reference inconsistency false [q] quiet false [d] debug false [r] reference FASTA file reference/ref.fa
stats: biallelic no. left trimmed : 0 no. right trimmed : 0 no. left and right trimmed : 0 no. right trimmed and left aligned : 0 no. left aligned : 0
total no. biallelic normalized : 0
multiallelic no. left trimmed : 0 no. right trimmed : 0 no. left and right trimmed : 0 no. right trimmed and left aligned : 0 no. left aligned : 0
total no. multiallelic normalized : 0
total no. variants normalized : 0 total no. variants observed : 0 total no. reference observed : 0
Time elapsed: 0.00s
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tseemann/snippy/issues/410#issuecomment-664930784, or unsubscribe https://github.com/notifications/unsubscribe-auth/AM6JNDFSCQT7OD4XER3VWY3R52MRZANCNFSM4PGNE7EQ .
Hi @Rathanin-github, I also got this issue with a recent re-install of snippy v4.6.0 within the last 2 weeks.
For me, the variants disappear right at this point:
bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa -
I used conda to install snippy, and the dependency [
vt
] has very recently been updated (Anaconda release log) so I suspected that was the problem. My solution was to downgrade vt to an older, stable version and do a fresh install of snippy:conda create --name snippy-env conda install --name snippy-env -c bioconda/label/cf201901 vt==2015.11.10 conda install --name snippy-env -c bioconda snippy
Let me know if this fix is applicable to you, or if you can find a better solution!
This fix was applicable for me. Thanks!
Rolling back vt
to an older version also did it for me.
conda create -n snippy_env snippy vt=0.57721
Hello. Yes, we are aware of this bug now. Working towards fixing it. Thank you everyone.
Any updates on this?
Hi @tseemann When I update my snippy to 4.6.0, I got 0 varaint after runing. snps.csv contain nothing. Please see my snps.log below Your response is greatly apriciate !
echo snippy 4.6.0
cd /mnt/e/test4/17A
/home/nc-bioinfo/anaconda3/envs/snippy/bin/snippy --outdir mysnps17A.fa --ref K96243.fasta -R1 60-101_S1_L001_R1_001.fastq.gz -R2 60-101_S1_L001_R2_001.fastq.gz
samtools faidx reference/ref.fa
bwa index reference/ref.fa
[bwa_index] Pack FASTA... 0.05 sec [bwa_index] Construct BWT for the packed sequence... [bwa_index] 1.41 seconds elapse. [bwa_index] Update BWT... 0.03 sec [bwa_index] Pack forward-only FASTA... 0.02 sec [bwa_index] Construct SA from BWT and Occ... 0.50 sec [main] Version: 0.7.17-r1188 [main] CMD: bwa index reference/ref.fa [main] Real time: 4.973 sec; CPU: 2.000 sec
mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa
ln -sf reference/ref.fa .
ln -sf reference/ref.fa.fai .
mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz
bwa mem -Y -M -R '@RG\tID:mysnps17A.fa\tSM:mysnps17A.fa' -t 8 reference/ref.fa /mnt/e/test4/17A/60-101_S1_L001_R1_001.fastq.gz /mnt/e/test4/17A/60-101_S1_L001_R2_001.fastq.gz | samclip --max 10 --ref reference/ref.fa.fai | samtools
sort -n -l 0 -T /tmp --threads 3 -m 2000M | samtools fixmate -m --threads 3 - - | samtools sort -l 0 -T /tmp --threads 3 -m 2000M | samtools markdup -T /tmp --threads 3 -r -s - - > snps.bam
COMMAND: samtools markdup -T /tmp --threads 3 -r -s - - READ: 3462693 WRITTEN: 3370831 EXCLUDED: 87497 EXAMINED: 3375196 PAIRED: 3318838 SINGLE: 56358 DUPLICATE PAIR: 57848 DUPLICATE SINGLE: 34014 DUPLICATE PAIR OPTICAL: 0 DUPLICATE SINGLE OPTICAL: 0 DUPLICATE NON PRIMARY: 0 DUPLICATE NON PRIMARY OPTICAL: 0 DUPLICATE PRIMARY TOTAL: 91862 DUPLICATE TOTAL: 91862 ESTIMATED_LIBRARY_SIZE: 47047081
samtools index snps.bam
fasta_generate_regions.py reference/ref.fa.fai 491233 > reference/ref.txt
freebayes-parallel reference/ref.txt 8 -p 2 -P 0 -C 2 -F 0.05 --min-coverage 10 --min-repeat-entropy 1.0 -q 13 -m 60 --strict-vcf -f reference/ref.fa snps.bam > snps.raw.vcf
sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument
bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - | bcftools annotate --remove '^INFO/TYPE,^INFO/DP,^INFO/RO,^INFO/AO,^INFO/AB,^FORMAT/GT,^FOR
MAT/DP,^FORMAT/RO,^FORMAT/AO,^FORMAT/QR,^FORMAT/QA,^FORMAT/GL' > snps.filt.vcf
cp snps.filt.vcf snps.vcf
/home/nc-bioinfo/anaconda3/envs/snippy/bin/snippy-vcf_to_tab --gff reference/ref.gff --ref reference/ref.fa --vcf snps.vcf > snps.tab
Loading reference: reference/ref.fa Loaded 2 sequences. Loading features: reference/ref.gff Parsing variants: snps.vcf Converted 0 SNPs to TAB format.
/home/nc-bioinfo/anaconda3/envs/snippy/bin/snippy-vcf_extract_subs snps.filt.vcf > snps.subs.vcf
bcftools convert -Oz -o snps.vcf.gz snps.vcf
bcftools index -f snps.vcf.gz
bcftools consensus --sample mysnps17A.fa -f reference/ref.fa -o snps.consensus.fa snps.vcf.gz
Applied 0 variants
bcftools convert -Oz -o snps.subs.vcf.gz snps.subs.vcf
bcftools index -f snps.subs.vcf.gz
bcftools consensus --sample mysnps17A.fa -f reference/ref.fa -o snps.consensus.subs.fa snps.subs.vcf.gz
Applied 0 variants
rm -f snps.subs.vcf.gz snps.subs.vcf.gz.csi snps.subs.vcf.gz.tbi