Closed tarunaaggarwal closed 7 years ago
I encountered the very same bug. But it is not a bug of the blobtools but may very well be a bug of your samtool installation.
The problem lies in BtIO.py
line 197
reads_mapped = int(reads_mapped_re.search(output).group(1))
here the python script reads the output of the command samtools flagstat mapping_1.bam
in older versions of samtools (0.1.x) this command gives the following output:
37278 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
37278 + 0 mapped (100.00%:-nan%)
37278 + 0 paired in sequencing
18714 + 0 read1
18564 + 0 read2
28268 + 0 properly paired (75.83%:-nan%)
33716 + 0 with itself and mate mapped
3562 + 0 singletons (9.56%:-nan%)
2649 + 0 with mate mapped to a different chr
2578 + 0 with mate mapped to a different chr (mapQ>=5)
The comparator which was initialized in BtIO.py
for "secondary reads" but these are not in the output of samtools flagstat. Thus, the variable reads_mapped
can not be initialized and python throws an error.
The same command with samtools version 1.3 (as recommended on https://blobtools.readme.io/docs/getting-started ) gives the following output:
15313 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
216 + 0 supplementary
0 + 0 duplicates
15313 + 0 mapped (100.00% : N/A)
15097 + 0 paired in sequencing
7554 + 0 read1
7543 + 0 read2
13356 + 0 properly paired (88.47% : N/A)
14320 + 0 with itself and mate mapped
777 + 0 singletons (5.15% : N/A)
340 + 0 with mate mapped to a different chr
298 + 0 with mate mapped to a different chr (mapQ>=5)
See here we have the "secondary reads" and everything works just fine.
Bottom line just install samtools v1.3 perferably from this source: http://www.htslib.org/download/
Alternatively the creators of this great package could also include a fail-safe check for the installed samtools version.
Best, chris
Hi,
@chriskraus0 thanks for that excellent and detailed post!
yes, that is the root of the problem.
Will put in a version check as soon as I have some free time. That function needs some work either way...
@chriskraus0 : btw, do you have an opinion on default samflags to filter a BAM file on, currently it BAM files get filtered with
samtools view -f 1 -F 1024 -F 256 -F 2048
And maybe I should put that in the samtools flagstat
command as well so I could avoid parsing secondary-read-counts alltogether.
cheers,
dom
Thanks Chris and Dom! Appreciate your help! Samtools issue is fixed for now :-)
Hi,
I have the same issue.
I ran this command: blobtools map2cov -i /db/ana/programs/blobology/trinity_petrosia_complete_short_names.fasta -b /db/ana/programs/blobology/trinity_petrosia_complete_short_names.fasta.readsname.bowtie2.bam -o out
And this is the output:
[STATUS] : Parsing FASTA - /db/ana/programs/blobology/trinity_petrosia_complete_short_names.fasta [STATUS] : Parsing bam0 - /db/ana/programs/blobology/trinity_petrosia_complete_short_names.fasta.readsname.bowtie2.bam [STATUS] : Checking with 'samtools flagstat' Traceback (most recent call last): File "/usr/local/src/blobtools/blobtools-0.9.19.5/bloblib/map2cov.py", line 52, in
main() File "/usr/local/src/blobtools/blobtools-0.9.19.5/bloblib/map2cov.py", line 49, in main blobDb.parseCoverage(covLibObjs=cov_libs, no_base_cov=no_base_cov_flag) File "/usr/local/src/blobtools/blobtools-0.9.19.5/bloblib/BtCore.py", line 347, in parseCoverage base_cov_dict, covLib.reads_total, covLib.reads_mapped, read_cov_dict = BtIO.parseBam(covLib.f, set(self.dict_of_blobs), no_base_cov) File "/usr/local/src/blobtools/blobtools-0.9.19.5/bloblib/BtIO.py", line 384, in parseBam reads_total, reads_mapped = checkBam(infile) File "/usr/local/src/blobtools/blobtools-0.9.19.5/bloblib/BtIO.py", line 198, in checkBam reads_secondary = int(reads_secondary_re.search(output).group(1)) AttributeError: 'NoneType' object has no attribute 'group'
I created the bam file using this comman
bowtie2 -x trinity_petrosia_complete_short_names.fasta --very-fast-local -k 1 -t -p 12 --reorder --mm \ -U /db/ana/Paired-end/petrosia_PE_suff.fastq \ | samtools view -S -b -T trinity_petrosia_complete_short_names.fasta - >trinity_petrosia_complete_short_names.fasta.readsname.bowtie2.bam
I was wandering what are the numbers in the command above represent:
samtools view -f 1 -F 1024 -F 256 -F 2048
and if I can use the same numbers in my command?
Thanks for the help,
Maya
Hi Maya, this is most likely related to the version of samtools you are using (as Chris pointed out). BlobTools uses samtools flagstat to infer how many reads there are in order to display a progress bar during parsing. I just made a new release in which this problem should be fixed. Please download the newest release.
Regarding the numbers in the command:
samtools view -f 1 -F 1024 -F 256 -F 2048
these are samflags and you can get a feel for what they do on here or consulting the sam-format specification
What they mean in detail is : "Give me records of reads that are"
can use the same numbers in my command?
I am not sure how to answer this. Technically, you can. Philosophically, it depends of the question(s) you want to answer with your BAM file(s)... it is not required for BlobTools, if this is what you mean.
Hope that helps.
Can please you report back if the fix worked for you?
cheers,
dom
Hi Don,
I think I have that lastest vesrion. blobtools v0.9.19
when I changed my bam file to sam file it worked fine.
But I get the same error when I try the to filter the bam file using blobtools bamfilter \ -b /db/ana/programs/blobology/trinity_petrosia_complete_short_names.fasta.readsname.bowtie2.bam \ -i petrosia_tokeep2.txt
I would like to point out that I did not use paired end data, may that cause the problem?
cheers,
Maya
On Wed, Mar 29, 2017 at 8:13 PM, Dominik R Laetsch <notifications@github.com
wrote:
Hi Maya, this is most likely related to the version of samtools you are using (as Chris pointed out). BlobTools uses samtools flagstat to infer how many reads there are in order to display a progress bar during parsing. I just made a new release in which this problem should be fixed. Please download the newest release.
Regarding the numbers in the command:
samtools view -f 1 -F 1024 -F 256 -F 2048
these are samflags and you can get a feel for what they do on here https://broadinstitute.github.io/picard/explain-flags.html or consulting the sam-format specification https://samtools.github.io/hts-specs/SAMv1.pdf
What they mean in detail is : "Give me records of reads that are"
- paired (-f 1)
- NOT a PCR or optical duplicate (-F 1024)
- NOT a secondary alignment (-F 256)
- NOT a supplemetary alignment (-F 2048)
can use the same numbers in my command?
I am not sure how to answer this. Technically, you can. Philosophically, it depends of the question(s) you want to answer with your BAM file(s)... it is not required for BlobTools, if this is what you mean.
Hope that helps.
Can please you report back if the fix worked for you?
cheers,
dom
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/DRL/blobtools/issues/37#issuecomment-290195466, or mute the thread https://github.com/notifications/unsubscribe-auth/AMG9v5SyOhgOsNuQoIdg5fF2FOKjKxmCks5rqq1ngaJpZM4KPReO .
Hello,
I'm going through the tutorial on your site, but am running into issues. Below is the command that I'm trying and the errors that I'm getting. I'm using the new version 0.9.19.4.
Command from tutorial:
Errors:
Thanks for your help!