Closed hidvegin closed 3 years ago
Does this work if you remove the --sam-nohead
option from the bowtie command?
Thank @BenLangmead for your answer. I tried it, what you mention but I get the very same error message:
[E::sam_hrecs_error] Missing tab at line 256025: "@ST-E00523:628:H2WJ5CCX2:2:1101:13859:1520 0 2309 41991 255 21M * 0 0 NATGAGATACGGACACCGGCC IIIIIIIIIIIIIIIIIIIII XA:i:1 MD:Z:0T19T0 NM:i:2 XM:i:2"
samtools view: failed to add PG line to the header
This code was used:
bowtie -n 1 -l 10 -m 100 -k 1 --best --strata -p 24 -f /HDD/lculinaris/smallRNA/clean_data/LC01.clean.fa.gz -x /HDD/lculinaris/bowtie_index/lculinaris --sam | samtools view -bS -o LC01_clean.bam
Hello @hidvegin
This appears to be a bug in samtools
where it thinks that the first read is a continuation of the header, or start of the header in the case of --sam-nohead
and expects there to be a TAB after @ST
but instead finds a hyphen. A temporary solution may be to remove the '@'
character from the first the read. Your resulting file should look something like this:
>ST-E00523:628:H2WJ5CCX2:2:1101:14945:1221 1:N:0:NAGCTGAA
NGAAGCGGAAGCGAGAATG
>@ST-E00523:628:H2WJ5CCX2:2:1101:17462:1221 1:N:0:NAGCTGAA
NGGGATGTAGCTCAGA
>@ST-E00523:628:H2WJ5CCX2:2:1101:21501:1221 1:N:0:NAGCTGAA
NGGGATGTAGCTCAAA
>@ST-E00523:628:H2WJ5CCX2:2:1101:2696:1309 1:N:0:NAGCTGAA
NGGGATGTAGCTCAAA
>@ST-E00523:628:H2WJ5CCX2:2:1101:6492:1309 1:N:0:NAGCTGAA
NCAACCGACTTAGAAC
>@ST-E00523:628:H2WJ5CCX2:2:1101:9191:1309 1:N:0:NAGCTGAA
NTGTGGATCTTTGACGGCAACGTAAA
>@ST-E00523:628:H2WJ5CCX2:2:1101:14407:1309 1:N:0:NAGCTGAA
NATAGACTTGAAC
>@ST-E00523:628:H2WJ5CCX2:2:1101:14955:1309 1:N:0:NAGCTGAA
NGGGATGTAGCTCAGA
>@ST-E00523:628:H2WJ5CCX2:2:1101:15747:1309 1:N:0:NAGCTGAA
NGGGATGTAGCTCAGA
Please let us know if this works for you.
Hello @ch4rr0
Thank you for your answer. I tried what you suggested to me, but it did not work. I get this error message:
[E::sam_hrecs_error] Missing tab at line 256025: "@ST-E00523:628:H2WJ5CCX2:2:1101:9526:1713 0 190533 6700 255 24M * 0 0 AAATCCGTCTGATCTGAATCGATC IIIIIIIIIIIIIIIIIIIIIIII XA:i:0 MD:Z:24 NM:i:0 XM:i:2"
samtools view: failed to add PG line to the header
Here is the command line that I used on the modified input file. I kept only the arguments that were necessary to recreate the issue:
$ ./bowtie indexes/e_coli -f in.fa -S | samtools view -bS -o out.bam
Setting the index via positional argument will be deprecated in a future release. Please use -x option instead.
# reads processed: 9
# reads with at least one alignment: 3 (33.33%)
# reads that failed to align: 6 (66.67%)
Reported 3 alignments
$ file out.bam
out.bam: gzip compressed data, extra field, original size modulo 2^32 0
It worked for me, if I use a short index file (example miRBase mature small RNAs), but when I use a 4 Gbp reference genome index file which I made with bowtie
1.3.0, I get the same error message. Also, in bowtie2
and hisat2
, the samtools
is worked for me very well with the latest version. bowtie
1.3.0 is the only one, which is not working with samtools
.
I am getting the same error when using bowtie2
when using the original input sample:
$ ./bowtie2-align-s -x example/index/lambda_virus -f ../bowtie/in.fa | samtools view -b
9 reads; of these:
9 (100.00%) were unpaired; of these:
9 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
[E::sam_hrecs_error] Missing tab at line 4: "@ST-E00523:628:H2WJ5CCX2:2:1101:14945:1221 4 * 0 0 * * 0 0 NGAAGCGGAAGCGAGAATG IIIIIIIIIIIIIIIIIIIYT:Z:UU YF:Z:SC"
samtools view: failed to add PG line to the header
$ ./bowtie2-align-s -x example/index/lambda_virus -f ../bowtie/in.fa --sam-nohead | samtools view -b
9 reads; of these:
9 (100.00%) were unpaired; of these:
9 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
[E::sam_hrecs_error] Missing tab at line 1: "@ST-E00523:628:H2WJ5CCX2:2:1101:14945:1221 4 * 0 0 * * 0 0 NGAAGCGGAAGCGAGAATG IIIIIIIIIIIIIIIIIIIYT:Z:UU YF:Z:SC"
samtools view: failed to add PG line to the header
Would it be possible to share a sample of the index?
Yes, here is the link for the index files: LINK With this index files I get the error message.
Also, I tried different small RNA libraries (14 different libraries) align to the miRBase mature index file and I recognized that for 11 libraries worked your suggestion but for 3 libraries did not work. So, it looks that the '@' character deletion in the firs read, is not working for all of libraries.
Hello,
I was not able to download the read file using the link provided however I would like to recommend the below command, which should work regardless of input:
bowtie -n 1 -l 10 -m 100 -k 1 --best --strata -p 24 -f /HDD/lculinaris/smallRNA/clean_data/LC01.clean.fa.gz -x /HDD/lculinaris/bowtie_index/lculinaris --sam | samtools view -b --no-PG
The only significant difference is the addition of the --no-PG
flag.
Hello @ch4rr0,
Thank you for your answer. With your help, samtools
worked very well and generated the bam files.
I used bowtie 1.3.0 for generate small RNA-seq aligments to reference genome. I used this parameters:
bowtie -n 1 -l 10 -m 100 -k 1 --best --strata -p 24 -f /HDD/lculinaris/smallRNA/clean_data/LC01.clean.fa.gz -x /HDD/lculinaris/bowtie_index/lculinaris --sam --sam-nohead | samtools view -bS -o LC01_clean.bam
I would like to generate bam files with
samtools view
on the fly but I get this error message:The reads in fasta file format, example:
Please help me, how could I resolve this issue.