Closed MatthewRalston closed 8 years ago
Hi Matt,
I'm not sure I can replicate the problem. I am getting back the Bio::DB::Alignment objects I expect. Can you check the contents of reads and let me know what they are? Are you saying that fetch just spits out a file? I don't understand what you're expecting to see in reads.
$ gem list | grep bio-samtools
bio-samtools (2.3.3, 2.2.0)
$ irb
2.1.5 :001 > require 'bio-samtools'
=> true
2.1.5 :002 > bam=Bio::DB::Sam.new(:bam => '/Users/macleand/.rvm/gems/ruby-2.1.5/gems/bio-samtools-2.3.3/test/samples/small/sorted.bam', :fasta => '/Users/macleand/.rvm/gems/ruby-2.1.5/gems/bio-samtools-2.3.3/test/samples/small/test_chr.fasta')
=> #<Bio::DB::Sam:0x007fad3aa729b8 @fasta="/Users/macleand/.rvm/gems/ruby-2.1.5/gems/bio-samtools-2.3.3/test/samples/small/test_chr.fasta", @bam="/Users/macleand/.rvm/gems/ruby-2.1.5/gems/bio-samtools-2.3.3/test/samples/small/sorted.bam", @samtools="/Users/macleand/.rvm/gems/ruby-2.1.5/gems/bio-samtools-2.3.3/lib/bio/db/sam/external/samtools", @bcftools="/Users/macleand/.rvm/gems/ruby-2.1.5/gems/bio-samtools-2.3.3/lib/bio/db/sam/external/bcftools", @last_command=nil>
2.1.5 :003 > reads = []
=> []
2.1.5 :004 > bam.fetch("chr_1", 100, 150) {|x| reads << x}
=> #<Process::Status: pid 5687 exit 0>
2.1.5 :005 > reads.length
=> 2
2.1.5 :006 > puts reads
#<Bio::DB::Alignment:0x007fad3b07a8b0>
#<Bio::DB::Alignment:0x007fad3b079ed8>
=> nil
2.1.5 :007 >
I expect to see 7 reads/Alignment objects, as the samtools view command with the same arguments produces above. Instead, all the reads in the file are passed to the block. I don't observe the same problem in 2.3.2:
~/sandbox >gem list | grep bio-samtools
bio-samtools (2.3.2)
~/sandbox >irb
2.1.2 :001 > require 'bio-samtools'
=> true
2.1.2 :002 > reads=[]
=> []
2.1.2 :003 > bam=Bio::DB::Sam.new(:bam => "spec/test_files/test.bam", :fasta => "spec/test_files/test.fa")
=> #<Bio::DB::Sam:0x007f80910e6e88 @fasta="spec/test_files/test.fa", @bam="spec/test_files/test.bam", @samtools="/Users/Matthew/.rvm/gems/ruby-2.1.2@SCI/gems/bio-samtools-2.3.2/lib/bio/db/sam/external/samtools", @bcftools="/Users/Matthew/.rvm/gems/ruby-2.1.2@SCI/gems/bio-samtools-2.3.2/lib/bio/db/sam/external/bcftools", @last_command=nil>
2.1.2 :004 > bam.fetch("NC_001988.2",75,75) {|x| reads << x}
=> #<Process::Status: pid 95802 exit 0>
2.1.2 :005 > reads.size
=> 7
But when I uninstall 2.3.2 and switch to 2.3.3, I see the following:
~/sandbox >gem uninstall bio-samtools
Remove executables:
bam_consensus.rb
in addition to the gem? [Yn] y
Removing bam_consensus.rb
Successfully uninstalled bio-samtools-2.3.2
~/sandbox >gem install bio-samtools
Fetching: bio-samtools-2.3.3.gem (100%)
Building native extensions. This could take a while...
Successfully installed bio-samtools-2.3.3
1 gem installed
~/sandbox >irb
2.1.2 :001 > reads=[]
=> []
2.1.2 :002 > require 'bio-samtools'
=> true
2.1.2 :003 > bam=Bio::DB::Sam.new(:bam => "spec/test_files/test.bam", :fasta => "spec/test_files/test.fa")
=> #<Bio::DB::Sam:0x007fad42a11750 @fasta="spec/test_files/test.fa", @bam="spec/test_files/test.bam", @samtools="/Users/Matthew/.rvm/gems/ruby-2.1.2@test/gems/bio-samtools-2.3.3/lib/bio/db/sam/external/samtools", @bcftools="/Users/Matthew/.rvm/gems/ruby-2.1.2@test/gems/bio-samtools-2.3.3/lib/bio/db/sam/external/bcftools", @last_command=nil>
2.1.2 :004 > bam.fetch("NC_001988.2",75,75) {|x| reads << x}
=> #<Process::Status: pid 2067 exit 0>
2.1.2 :005 > reads.size
=> 36
2.1.2 :006 > puts reads
#<Bio::DB::Alignment:0x007fad4403f9a0>
#<Bio::DB::Alignment:0x007fad4403eed8>
#<Bio::DB::Alignment:0x007fad4403eaa0>
#<Bio::DB::Alignment:0x007fad4403dee8>
#<Bio::DB::Alignment:0x007fad4403d330>
#<Bio::DB::Alignment:0x007fad4403c778>
#<Bio::DB::Alignment:0x007fad4404fb70>
#<Bio::DB::Alignment:0x007fad4404efb8>
#<Bio::DB::Alignment:0x007fad4404e400>
#<Bio::DB::Alignment:0x007fad4404d848>
#<Bio::DB::Alignment:0x007fad4404cc90>
#<Bio::DB::Alignment:0x007fad4404c0d8>
#<Bio::DB::Alignment:0x007fad4210f4e0>
#<Bio::DB::Alignment:0x007fad4210e928>
#<Bio::DB::Alignment:0x007fad4210dd70>
#<Bio::DB::Alignment:0x007fad4210d1b8>
#<Bio::DB::Alignment:0x007fad4210c6f0>
#<Bio::DB::Alignment:0x007fad42117af0>
#<Bio::DB::Alignment:0x007fad42116f38>
#<Bio::DB::Alignment:0x007fad42116380>
#<Bio::DB::Alignment:0x007fad421157c8>
#<Bio::DB::Alignment:0x007fad42114c10>
#<Bio::DB::Alignment:0x007fad42114148>
#<Bio::DB::Alignment:0x007fad44057550>
#<Bio::DB::Alignment:0x007fad44056a88>
#<Bio::DB::Alignment:0x007fad44055ed0>
#<Bio::DB::Alignment:0x007fad44055318>
#<Bio::DB::Alignment:0x007fad44054760>
#<Bio::DB::Alignment:0x007fad4211fc50>
#<Bio::DB::Alignment:0x007fad4211f188>
#<Bio::DB::Alignment:0x007fad4211e6c0>
#<Bio::DB::Alignment:0x007fad4211db08>
#<Bio::DB::Alignment:0x007fad4211cf50>
#<Bio::DB::Alignment:0x007fad4211c398>
#<Bio::DB::Alignment:0x007fad4405f778>
#<Bio::DB::Alignment:0x007fad4405ebc0>
It's worth noting that none of the 36 items in the list (the total number of reads in the file) are of type NilClass.
Can you tell us which version of samtools you have on your system? We compile version 0.1.19, and the results may vary according to the default options in samtools. The library should be consistent, as it downloads 0.1.19 (we haven't updated to 1.x because some functionality is still missing in the new library). I had a look at the code in the unit test, and it expects 36 reads; I'm now wondering why it was displaying 7 before. So it could be one of the default quality filters (I think it is the option -A which displays all the reads). But I need to investigate more.
I'll have to get back to you when I'm home from work to provide additional specifics. @homonecloco you're not using the samtools version I have in my PATH variable, correct? You're likely using the one installed with the library which would be 0.1.19. 7 reads is the correct number as the samtools output at the top shows... The other 29 reads do not overlap with base 75 in the reference sequence I'm working with and thus 36 alignment objects is not correct. Is the correct output supposed to be 29 NilClass objects and 7 Alignment class objects? Or should it be the 7 reads that the 2.3.2 call produces? The test bam file from my gem's specs is linked at the top of this thread; the same folder has the fasta file and the index.
Hi @MatthewRalston, I just added a unit test (not released in the gem yet) and I don't seem to be able to reproduce the error. I agree with you that you should have 7 reads overlapping the position (I'm sorry for my previous message; I wasn't at the computer and was trying to find the error just from reading the code). The correct output in this case would be 7 alignment objects, with starting positions [7,10,72,72,72,72,75]. Best, Ricardo.
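For anyone following along, here is a rough sketch of what "overlapping the position" means in this discussion. It is not the bio-samtools implementation: the Read struct, the overlaps? helper, and the read coordinates are all made up for illustration, and it assumes ungapped alignments (a real check would account for the CIGAR string).

```ruby
# Hypothetical stand-in for an alignment record; this is NOT Bio::DB::Alignment.
Read = Struct.new(:pos, :length)

# True if the read covers any base in the 1-based, inclusive window [start, stop].
# Assumes an ungapped alignment, so the read spans pos..(pos + length - 1).
def overlaps?(read, start, stop)
  read_end = read.pos + read.length - 1
  read.pos <= stop && read_end >= start
end

# Made-up reads for illustration only; not taken from the test BAM file.
reads = [
  Read.new(7, 100),  # spans 7..106, covers base 75
  Read.new(72, 50),  # spans 72..121, covers base 75
  Read.new(75, 50),  # spans 75..124, covers base 75
  Read.new(76, 50),  # starts after base 75
  Read.new(10, 20)   # spans 10..29, ends before base 75
]

# Keep only the reads that overlap the single-base region 75..75.
hits = reads.select { |r| overlaps?(r, 75, 75) }
puts hits.map(&:pos).inspect
```

A fetch restricted to a region should hand the block only reads like the first three; reads that end before the region or start after it should never reach the block.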
Thanks @homonecloco I'll see if I can get your tests to pass in a fresh gemset when I get home :)
When running the fetch method or view method, the samtools command is malformed, returning all the reads in a file.
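As a point of comparison, here is a minimal sketch of how a region-restricted samtools view call is normally put together. The view_args helper is hypothetical and is not how bio-samtools builds its command; the only thing taken from samtools itself is the chr:start-end region syntax, which requires an indexed BAM file. If the region argument is dropped or mangled, samtools view prints every read in the file, which would match the behaviour described here.

```ruby
# Hypothetical helper, not part of bio-samtools.
# Builds the argument list for `samtools view <bam> <chr>:<start>-<stop>`.
def view_args(bam_path, chr, start, stop)
  region = "#{chr}:#{start}-#{stop}"
  ['samtools', 'view', bam_path, region]
end

args = view_args('spec/test_files/test.bam', 'NC_001988.2', 75, 75)
puts args.join(' ')

# To actually run it (needs samtools on the PATH and an indexed BAM):
#   system(*args)
```

Passing the arguments as an array to system avoids shell quoting problems with the region string.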