fog / fog

The Ruby cloud services library.
http://fog.github.io
MIT License

aws/models/storage/files breaks in JRuby with "service and directory are required for this operation" #2928

Closed alexanderdean closed 9 years ago

alexanderdean commented 10 years ago

I am the author of Sluice which uses fog to perform lots of file operations on S3.

It works fine in Ruby, but in JRuby file ops involving Arrays of Fog::Storage::AWS::Files are erroring out with "error: org.jruby.embed.EvalFailedException: (ArgumentError) service and directory are required for this operation".

Here are full steps to reproduce:

vagrant@precise64:/tmp$ mkdir sluice-test
vagrant@precise64:/tmp$ cd sluice-test/
vagrant@precise64:/tmp/sluice-test$ rvm use jruby
Using /home/vagrant/.rvm/gems/jruby-1.7.11
vagrant@precise64:/tmp/sluice-test$ jruby -S gem install sluice
Successfully installed sluice-0.2.0
1 gem installed

vagrant@precise64:/tmp/sluice-test$ irb
jruby-1.7.11 :001 > require 'sluice'
 => true

jruby-1.7.11 :002 > s3 = Sluice::Storage::S3::new_fog_s3_from(
jruby-1.7.11 :003 >       "eu-west-1",
jruby-1.7.11 :004 >       "xxx",
jruby-1.7.11 :005 >       "yyy")
 => #<Fog::Storage::AWS::Real:2150 @aws_credentials_expire_at=nil @connection=#
<Fog::XML::Connection:0x29aa48fe @excon=#<Excon::Connection:10d0 @socket_key="https://s3-eu-west-1.amazonaws.com:443" @data={:chunk_size=>1048576, :ciphers=>"HIGH:!SSLv2:!aNULL:!eNULL:!3DES", :connect_timeout=>60, :debug_request=>false, :debug_response=>true, :headers=>{"User-Agent"=>"fog/1.22.0"}, :idempotent=>false, :instrumentor_name=>"excon", :middlewares=>[Excon::Middleware::ResponseParser, Excon::Middleware::Expects, Excon::Middleware::Idempotent, Excon::Middleware::Instrumentor, Excon::Middleware::Mock], :mock=>false, :nonblock=>true, :omit_default_port=>false, :persistent=>false, :read_timeout=>60, :retry_limit=>4, :ssl_verify_peer=>true, :tcp_nodelay=>false, :uri_parser=>URI, :write_timeout=>60, :host=>"s3-eu-west-1.amazonaws.com", :path=>"", :port=>443, :query=>nil, :scheme=>"https", :user=>nil, :password=>nil}>> @host="s3-eu-west-1.amazonaws.com" @aws_access_key_id="xxx" @persistent=false @port=443 @use_iam_profile=nil @region="eu-west-1" @path_style=false @endpoint=nil @scheme="https" @connection_options={:debug_response=>true, :headers=>{"User-Agent"=>"fog/1.22.0"}, :persistent=>false} @aws_session_token=nil>

jruby-1.7.11 :006 > in_l = Sluice::Storage::S3::Location.new("s3n://test-bucket/from/")
 => #<Sluice::Storage::S3::Location:0xed92dbb @dir="from", @bucket="test-bucket", @s3_location="s3n://test-bucket/from/">

jruby-1.7.11 :007 > out_l = Sluice::Storage::S3::Location.new("s3n://test-bucket/to/")
 => #<Sluice::Storage::S3::Location:0x295687d9 @dir="to", @bucket="test-bucket", @s3_location="s3n://test-bucket/to/">

jruby-1.7.11 :008 > files_moved = Sluice::Storage::S3::move_files(s3, in_l, out_l, '.+', false, false)
  moving files from s3n://test-bucket/from/ to s3n://test-bucket/to/
ArgumentError: service and directory are required for this operation
ArgumentError: service and directory are required for this operation
       requires at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-core-1.22.0/lib/fog/core/attributes.rb:188
            all at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-1.22.0/lib/fog/aws/models/storage/files.rb:23
      lazy_load at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-core-1.22.0/lib/fog/core/collection.rb:139
           size at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-core-1.22.0/lib/fog/core/collection.rb:22
  process_files at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/sluice-0.2.0/lib/sluice/storage/s3/s3.rb:466
    synchronize at org/jruby/ext/thread/Mutex.java:149
  process_files at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/sluice-0.2.0/lib/sluice/storage/s3/s3.rb:437
           loop at org/jruby/RubyKernel.java:1521
  process_files at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/sluice-0.2.0/lib/sluice/storage/s3/s3.rb:428
ArgumentError: service and directory are required for this operation
       requires at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-core-1.22.0/lib/fog/core/attributes.rb:188
            all at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-1.22.0/lib/fog/aws/models/storage/files.rb:23
      lazy_load at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-core-1.22.0/lib/fog/core/collection.rb:139
           size at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-core-1.22.0/lib/fog/core/collection.rb:22
  process_files at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/sluice-0.2.0/lib/sluice/storage/s3/s3.rb:466
    synchronize at org/jruby/ext/thread/Mutex.java:149
  process_files at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/sluice-0.2.0/lib/sluice/storage/s3/s3.rb:437
           loop at org/jruby/RubyKernel.java:1521
  process_files at /home/vagrant/.rvm/gems/jruby-1.7.11/gems/sluice-0.2.0/lib/sluice/storage/s3/s3.rb:428

The exact same code works fine in Ruby.

Any idea why JRuby would be struggling with Arrays of Fog Files in a way that vanilla Ruby does not?

alexanderdean commented 10 years ago

Update, I've managed to reproduce the error without Sluice involved:

jruby-1.7.11 :008 > files_to_process = []
 => []
jruby-1.7.11 :009 > files_to_process.size
 => 0
jruby-1.7.11 :011 > files_to_process = my_fog.directories.get(in_l.bucket, :prefix => in_l.dir).files.all({})
 =>   <Fog::Storage::AWS::Files
<snip>
jruby-1.7.11 :013 > file = files_to_process.pop
 =>   <Fog::Storage::AWS::File
<snip>
jruby-1.7.11 :025 > files_to_process.size
 => 5
jruby-1.7.11 :026 > files_to_process = files_to_process.reverse
ArgumentError: service and directory are required for this operation
    from /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-core-1.22.0/lib/fog/core/attributes.rb:188:in `requires'
    from /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-1.22.0/lib/fog/aws/models/storage/files.rb:23:in `all'
    from /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-core-1.22.0/lib/fog/core/collection.rb:139:in `lazy_load'
    from /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-core-1.22.0/lib/fog/core/collection.rb:22:in `empty?'
    from /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-core-1.22.0/lib/fog/core/collection.rb:84:in `inspect'
    from /home/vagrant/.rvm/gems/jruby-1.7.11/gems/formatador-0.2.4/lib/formatador.rb:92:in `indent'
    from /home/vagrant/.rvm/gems/jruby-1.7.11/gems/fog-core-1.22.0/lib/fog/core/collection.rb:77:in `inspect'
    from org/jruby/RubyProc.java:271:in `call'
    from org/jruby/RubyKernel.java:1521:in `loop'
    from org/jruby/RubyKernel.java:1284:in `catch'
    from org/jruby/RubyKernel.java:1284:in `catch'
    from /home/vagrant/.rvm/rubies/jruby-1.7.11/bin/irb:13:in `(root)'

alexanderdean commented 10 years ago

Update 2, this works:

jruby-1.7.11 :030 > files_to_process = files_to_process.to_a.reverse
 => [          <Fog::Storage::AWS::File
<snip>

My conclusion is that the abstraction that layers Fog Files on top of a Ruby Array (via a Fog Collection) is leaky, and the leak is only evident in JRuby.
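
The `to_a` workaround from Update 2 can be sketched in isolation. This is only an illustration: `FakeFiles` and `plain_reverse` are hypothetical names, not part of fog or Sluice, and the overridden `inspect` merely imitates a collection method that needs remote state. The one fact it relies on is that `Array#to_a`, called on an instance of an Array subclass, returns a new plain Array.

```ruby
# Hypothetical stand-in for a collection class that subclasses Array.
# It is NOT fog's implementation; it only imitates a method that needs state.
class FakeFiles < Array
  def inspect
    raise ArgumentError, "service and directory are required for this operation"
  end
end

# Hypothetical helper: snapshot the collection into a plain Array first,
# then do ordinary Array work on the snapshot.
def plain_reverse(collection)
  snapshot = collection.to_a # Array#to_a on a subclass returns a plain Array
  snapshot.reverse
end

files = FakeFiles.new(%w[a.txt b.txt c.txt])
reversed = plain_reverse(files)

puts reversed.class   # the result is a plain Array, not a FakeFiles
puts reversed.inspect # uses Array#inspect, so the override is never called
```

Once the snapshot is a plain Array, none of the collection's overridden methods are involved in later operations, which may be why the `to_a.reverse` form avoids the error.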

geemus commented 10 years ago

Is this evident in other operations or just reverse? It seems to be an odd interaction with whether or not @loaded (and other instance variables, for that matter) are set. Are you doing operations across threads, or anywhere else where this state might be off?

alexanderdean commented 10 years ago

Originally I was doing a lot inside of threads, but then I was able to reproduce it just in JRuby irb (in Update 1). All a bit odd.

geemus commented 10 years ago

Indeed. It must be interacting strangely with the lazy loading stuff, but then trying to fetch the data in a context in which the credentials and connection are no longer in scope. Haven't heard of others running into this. Since you see it in irb, perhaps it is a broader JRuby-related issue. How isolated an example can you get that produces the behavior in irb?
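
The failure mode described here can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `LazyFiles`, `load_all`, and `demo_lost_state` are hypothetical names and this is not fog's code. It shows only that a lazily loaded Array subclass raises this kind of error when an instance ends up holding elements without the instance variables it needs; it does not establish what JRuby actually does differently.

```ruby
# Hypothetical lazily loaded collection; a sketch, not fog's implementation.
class LazyFiles < Array
  def initialize(service = nil, directory = nil)
    super()
    @service = service     # a callable standing in for the connection
    @directory = directory
    @loaded = false
  end

  # Reading the size triggers a load on first access.
  def size
    lazy_load
    super
  end

  # Fetch the contents; fails when the required state is missing.
  def load_all
    unless @service && @directory
      raise ArgumentError, "service and directory are required for this operation"
    end
    replace(@service.call(@directory))
    @loaded = true
    self
  end

  private

  def lazy_load
    load_all unless @loaded
  end
end

# Returns [size of a properly built collection, error message from an orphan].
def demo_lost_state
  fake_service = ->(dir) { ["#{dir}/a", "#{dir}/b"] }
  files = LazyFiles.new(fake_service, "from")
  loaded_size = files.size

  # An instance holding the same elements but none of the original state.
  orphan = LazyFiles.new
  orphan.concat(files.to_a)
  message = begin
    orphan.size
    nil
  rescue ArgumentError => e
    e.message
  end

  [loaded_size, message]
end

p demo_lost_state
```

In this sketch the orphan already holds its elements, yet asking for its size still attempts a reload because `@loaded` is false, and the reload fails for lack of a service and directory.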

plribeiro3000 commented 9 years ago

Closing due to inactivity. Please create a new issue at fog-aws if this still needs to be tackled.

Thanks!