Closed spheregenomics closed 10 years ago
Question opened on Stack Overflow http://stackoverflow.com/questions/22647826/ruby-memory-issues-with-kernel-open
Could you please try "jruby -J-Xmx3g your_script.rb" to give the Java virtual machine a 3 GB heap? The default heap size may not be enough to load a 2bit file (see also the Implementation section of README.md).
Thank you for your comment. I am running MRI ruby, not JRuby. I can try to install JRuby but this is a production machine and I don't want to make too many changes. It worked fine for the past 9 months...
I am sorry for my misunderstanding. I saw the discussion on Stack Overflow. I don't know why I used Kernel.open, as in the following code:
two_bit = nil
Kernel.open(filename, 'rb') {|f| two_bit = f.read}
I should have written it like this:
two_bit = File.read(filename)
Could you try the following code in irb? irb > File.open('/home/assay/apps/assay/shared/bin/hg19/hg19.2bit'); nil (the "; nil" is necessary to avoid printing the whole 2bit file)
If this works, I will update the gem.
I tried the command out in rails console... is that acceptable?
Loading production environment (Rails 4.0.2)
1.9.3p327 :001 > include Bio
=> Object
1.9.3p327 :002 > File.open('/home/assay/apps/assay/shared/bin/hg19/hg19.2bit'); nil
=> nil
Acceptable! Kernel.open probably has trouble in some situations. I will update the Ruby-UCSC-API gem soon.
In the previous version, I used Kernel.open because of a name collision (Bio::Ucsc::File and File on the top level). I did not know that the expression "::File.open" refers explicitly to the top-level constant.
Now, the final code is:
two_bit = nil
::File.open(filename, 'rb') {|f| two_bit = f.read}
I used a block so that the file is closed immediately.
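The name collision described above can be reproduced in a few lines. This is only a sketch: the module Demo and its methods are hypothetical stand-ins for Bio::Ucsc::File, not code from the gem.

```ruby
# Hypothetical namespace standing in for Bio::Ucsc.
module Demo
  # A nested module named File shadows the top-level File inside Demo.
  module File
  end

  # Inside Demo, the bare name File resolves to Demo::File.
  def self.bare_file
    File
  end

  # A leading :: forces lookup from the top level.
  def self.top_level_file
    ::File
  end

  # Reads a file in binary mode; the block form closes the handle on exit.
  def self.slurp(path)
    data = nil
    ::File.open(path, 'rb') { |f| data = f.read }
    data
  end
end

puts Demo.bare_file       # prints "Demo::File"
puts Demo.top_level_file  # prints "File"
```

Calling File.open inside such a namespace would fail with NoMethodError, because the nested module has no open method. That is presumably why Kernel.open looked attractive as a way around it.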
[CORRECTION] You do not have to put "; nil" in irb. This problem was fixed in version 0.5.0.
I released v0.6.2 on RubyGems. Hopefully, it will fix all the problems.
Thank you for your efforts. I tried testing again in the Rails console, and received a new error. There is 6 GB of RAM free on the server.
Loading production environment (Rails 4.0.2)
1.9.3p327 :001 > include Bio
=> Object
1.9.3p327 :002 > two_bit = nil
=> nil
1.9.3p327 :003 > ::File.open('/home/assay/apps/assay/shared/bin/hg19/hg19.2bit', 'rb') {|f| two_bit = f.read}
NoMemoryError: failed to allocate memory
from (irb):3:in `read'
from (irb):3:in `block in irb_binding'
from (irb):3:in `open'
from (irb):3
from /home/assay/apps/assay/shared/bundle/ruby/1.9.1/gems/railties-4.0.2/lib/rails/commands/console.rb:90:in `start'
from /home/assay/apps/assay/shared/bundle/ruby/1.9.1/gems/railties-4.0.2/lib/rails/commands/console.rb:9:in `start'
from /home/assay/apps/assay/shared/bundle/ruby/1.9.1/gems/railties-4.0.2/lib/rails/commands.rb:62:in `<top (required)>'
from script/rails:6:in `require'
from script/rails:6:in `<main>'
Thank you for your new comment.
So far, I do not have a good solution. On my Linux box with 8 GB of RAM, the 'spec/file/twobit.rb' spec passes with both ruby-1.9.3-p125 and ruby-2.1.0. This spec uses UCSC's hg18.2bit and hg19.2bit.
I am watching the discussion at Stack Overflow and trying to find other web resources related to this issue.
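One way to sidestep a single huge allocation is to read only the bytes that are needed rather than the whole file. The following is a minimal sketch of that idea, not the gem's implementation. It assumes the 2bit header layout published by UCSC (four 32-bit fields: signature 0x1A412743, version, sequence count, reserved), and the method name twobit_sequence_count is made up for illustration. Whether the allocation size is really the cause of the NoMemoryError above is a guess, not a diagnosis.

```ruby
require 'tempfile'

TWOBIT_SIGNATURE = 0x1A412743

# Reads only the 16-byte header of a 2bit file and returns the sequence count.
# Raises ArgumentError if the file is too short or the signature does not match.
def twobit_sequence_count(path)
  header = ::File.open(path, 'rb') { |f| f.read(16) }
  raise ArgumentError, 'file too short' if header.nil? || header.bytesize < 16

  # The file may have been written on a little- or big-endian machine,
  # so try both 32-bit layouts ('V' = little-endian, 'N' = big-endian).
  %w[V4 N4].each do |format|
    signature, _version, count, _reserved = header.unpack(format)
    return count if signature == TWOBIT_SIGNATURE
  end
  raise ArgumentError, 'not a 2bit file'
end

# Demonstration with a synthetic header only (no real genome data).
demo = Tempfile.new('demo_2bit')
demo.binmode
demo.write([TWOBIT_SIGNATURE, 0, 25, 0].pack('V4'))
demo.close
puts twobit_sequence_count(demo.path)  # prints 25
```

The same seek-and-read approach could in principle be extended to the index and to individual sequence records, so that memory use depends on the requested region rather than on the file size.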
Hi misshie, I suspect there is some other problem in play here. I note that the code works fine on my laptop, and also prior to a Rails 4 upgrade worked fine for 9 months on the Ubuntu 12 production server. I think this gem is an excellent tool and will try to continue to work with it. One option is to build a clone of the server to determine what is causing this memory error.
I will first try updating ruby and then try a new server build. I will report my findings back to this page and to Stack Overflow. I did note the gem is reliant on the mysql gem, which apparently has been the cause of memory problems. I will also experiment with mysql2 and ruby-mysql. Regards, Sean
I have managed to work around this issue by wrapping the bio-ucsc-api library in an external ruby program and running it via an Open4 call. It works fine now. When run from within the Rails stack it dies.
Workaround code below:
require 'bio-ucsc'
include Bio
# ARGV[0] is the full path to the 2bit file
# expects ARGV[1] to be this:
# subsequence = "#{chrom}:#{batch_detail.chrom_start - batch_detail.forward_offset}-#{batch_detail.chrom_end + batch_detail.reverse_offset}"
seqfile = Ucsc::File::Twobit.open("#{ARGV[0]}")
extracted_seq = seqfile.subseq("#{ARGV[1]}")
puts extracted_seq
This code is called within the Rails app by this:
def self.ucscapi(subsequence)
  filepath = "#{Rails.root}/bin/hg19/hg19.2bit"
  scriptdir = "#{Rails.root}/bin/ucscapi"
  Dir.chdir(scriptdir)
  cmd = "ruby ucscapi.rb #{filepath} #{subsequence}"
  sequence = ''
  ucsc_status = Open4::popen4("sh") do |pid, stdin, stdout, stderr|
    stdin.puts "cd #{scriptdir}"
    stdin.puts "#{cmd}"
    stdin.close
    sequence = stdout.read.strip
  end
  return sequence
end
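For anyone adapting this workaround, the same child-process idea can also be written with Open3 from the standard library. This is a sketch under assumptions, not a tested replacement for the code above: the helper name run_script is hypothetical. It passes arguments as an array so no shell parses them, and it uses the chdir spawn option instead of changing the working directory of the Rails process.

```ruby
require 'open3'
require 'rbconfig'

# Runs a Ruby script in a child process and returns its stripped standard output.
# script: file name of the script, resolved relative to dir
# args:   array of command-line arguments, passed without shell interpolation
# dir:    working directory for the child process only
def run_script(script, args, dir)
  out, status = Open3.capture2(RbConfig.ruby, script, *args, chdir: dir)
  raise "#{script} exited with status #{status.exitstatus}" unless status.success?
  out.strip
end
```

In the Rails method above, the call would look something like run_script('ucscapi.rb', [filepath, subsequence], scriptdir).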
I have been using this gem for some time with no problems. Suddenly it cannot allocate memory. I tried upgrading to 0.6.1 but it did not make a difference.
Rails 4.0.2 ruby 1.9.3p327 (2012-11-10 revision 37606) [i686-linux]