appoxy / aws

Amazon Web Services (AWS) Ruby Gem
https://rubygems.org/gems/aws

deal with HEAD requests on objects which fail when a redirect is returned #113

Closed marios closed 12 years ago

marios commented 12 years ago

Hi Travis,

Please consider this minor addition. It builds on my previous pull request 110, so the scenario is the same: you have a bucket in a particular location (e.g. ap-southeast-1) and you are trying to GET that bucket whilst talking to a different S3 endpoint, such as eu-west-1. Pull request 110 dealt with GET /bucket: AWS responds with a PermanentRedirect, which the client can parse in order to make a new request to the correct endpoint.

However, in the case of HEAD /bucket/object (i.e. get details about an object but not the object data itself), you don't get the XML body containing the correct endpoint, since this is a HEAD request. AWS also does not set the 'Location' header, so there is no way to identify the correct endpoint. I posted a question on the AWS forum but haven't heard anything so far: https://forums.aws.amazon.com/thread.jspa?messageID=340398
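To make the difference between the two cases concrete, here is a minimal sketch of pulling the endpoint out of a PermanentRedirect body. `redirect_endpoint` is a hypothetical helper, not something in the aws gem; it assumes the error XML has the shape AWS returns for GET /bucket, and it returns nil when there is no body, which is all a HEAD response gives you.

```ruby
require 'rexml/document'

# Hypothetical helper, not part of the aws gem: extract the <Endpoint>
# value from an S3 PermanentRedirect error body.
# Returns nil when the body is missing or empty (as with a HEAD response)
# or when the XML carries no <Endpoint> element.
def redirect_endpoint(body)
  return nil if body.nil? || body.strip.empty?

  root = REXML::Document.new(body).root
  return nil if root.nil?

  node = root.elements['Endpoint']
  node ? node.text : nil
end

# Abbreviated version of a redirect body returned for GET /bucket.
get_body = '<?xml version="1.0" encoding="UTF-8"?>' \
           '<Error><Code>PermanentRedirect</Code>' \
           '<Bucket>marios-eu-bucket-2</Bucket>' \
           '<Endpoint>marios-eu-bucket-2.s3-external-3.amazonaws.com</Endpoint></Error>'

puts redirect_endpoint(get_body).inspect  # endpoint recovered from the body
puts redirect_endpoint(nil).inspect       # HEAD case: nothing to parse
```

With a GET the caller can retry against the returned host; with a HEAD the helper has nothing to work with, which is the gap this request is about.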

Now, to GET the object, you have to first GET the bucket:

    client = Aws::S3.new("KEY", "SECRET", {:server=>"s3-ap-southeast-1.amazonaws.com", :connection_mode=>:per_thread})
    bucket = client.bucket("marios-test-bucket.foo")
    blob = bucket.key(opts[:id], true)  # opts[:id] holds the object key; true makes the gem issue a HEAD for it

The best solution I could come up with is to save the endpoint after a successful GET on the bucket, and use that endpoint when doing the HEAD against the object.
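The idea can be sketched roughly as follows. This is not the code from the pull request, only an illustration under assumptions: `EndpointCache`, `remember` and `host_for` are hypothetical names, and the hosts are example values.

```ruby
# Hypothetical sketch, not the code from this pull request: remember which
# host answered a successful GET for each bucket, and reuse it for HEAD.
class EndpointCache
  def initialize(default_host)
    @default_host = default_host
    @hosts = {}
  end

  # Call after a successful GET on the bucket.
  def remember(bucket_name, host)
    @hosts[bucket_name] = host
  end

  # Host to use for later requests against the bucket
  # (e.g. HEAD /bucket/object); falls back to the default host.
  def host_for(bucket_name)
    @hosts.fetch(bucket_name, @default_host)
  end
end

cache = EndpointCache.new("s3.amazonaws.com")
cache.remember("marios-eu-test-bucket", "s3-eu-west-1.amazonaws.com")
puts cache.host_for("marios-eu-test-bucket")
puts cache.host_for("some-other-bucket")
```

The sketch keeps one plain Hash and is not thread-safe; with :connection_mode=>:per_thread a real version would presumably need one cache per thread or a lock around it.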

Many thanks for considering this request, marios

treeder commented 12 years ago

Hi @marios, I don't get why you would be talking to a different S3 endpoint. Wouldn't you know beforehand which region your bucket is in?

marios commented 12 years ago

Yes, you do. I completely missed the ability to query for 'location' in the S3 API: http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTBucketGETlocation.html. So once I GET the bucket itself, I can call bucket.location, determine the endpoint, re-initialise the Aws::S3 client and make the request. As always, thanks for your time and apologies for the noise. If you are interested in why I came to this point, see the examples below. Please go ahead and close this request.
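As a rough sketch of the "determine the endpoint" step: `endpoint_for_location` below is a hypothetical helper, not part of the aws gem. It assumes the location value is empty for the default (US Standard) region, "EU" for eu-west-1, and otherwise a region name that fits the dash-style `s3-<region>.amazonaws.com` hostnames used in this thread.

```ruby
# Hypothetical helper, not part of the aws gem: map the value returned by
# GET Bucket location to a region-specific S3 hostname.
def endpoint_for_location(location)
  loc = location.to_s.strip
  return 's3.amazonaws.com' if loc.empty?                      # default region reports no location
  return 's3-eu-west-1.amazonaws.com' if loc.casecmp('EU').zero? # legacy name for eu-west-1

  "s3-#{loc}.amazonaws.com"                                    # assumed dash-style naming
end

puts endpoint_for_location('EU')
puts endpoint_for_location('ap-southeast-1')
puts endpoint_for_location(nil)

# With the gem, usage might look like this (not run here, needs credentials):
#   host   = endpoint_for_location(bucket.location)
#   client = Aws::S3.new("KEY", "SECRET", {:server => host, :connection_mode => :per_thread})
```

This keeps the re-initialisation in the calling code rather than in the gem.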

Create a bucket in EU:

    eu_client = Aws::S3.new("KEY", "PASS", {:server=>"s3-eu-west-1.amazonaws.com", :connection_mode=>:per_thread})
    eu_bucket = eu_client.bucket("marios-eu-bucket-2", true, nil, {:location=>"eu"})

Retrieve the EU bucket and list keys, using another endpoint:

    client = Aws::S3.new("KEY", "PASS", {:server=>"s3-us-west-1.amazonaws.com", :connection_mode=>:per_thread})
    bucket = client.bucket("marios-eu-bucket-2")
    bucket.keys

PROBLEM (but this is what was fixed in https://github.com/appoxy/aws/pull/110):

    irb(main):040:0> bucket.keys
    W, [2012-05-24T17:13:05.603882 #3908]  WARN -- : Rightscale::HttpConnection : request failure count: 1, exception: #<Errno::EPIPE: Broken pipe>
    I, [2012-05-24T17:13:05.604102 #3908]  INFO -- : Opening new HTTPS connection to marios-eu-bucket-2.s3-us-west-1.amazonaws.com:443
    I, [2012-05-24T17:13:07.129148 #3908]  INFO -- : ##### Aws::S3Interface redirect requested: 301 Moved Permanently #####
    I, [2012-05-24T17:13:07.129230 #3908]  INFO -- : ##### New location:  #####
    E, [2012-05-24T17:13:07.129363 #3908] ERROR -- : #<URI::InvalidURIError: bad URI(is not URI?): >
    /usr/lib/ruby/1.8/uri/common.rb:436:in `split'/usr/lib/ruby/1.8/uri/common.rb:485:in `parse'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/errors.rb:222:in `check'/usr/lib/ruby/1.8/benchmark.rb:293:in `measure'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/awsbase/benchmark_fix.rb:30:in `add!'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/errors.rb:209:in `check'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:566:in `request_info_impl'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:316:in `request_info2'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:338:in `request_info3'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:182:in `request_info'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:345:in `incrementally_list_bucket'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:124:in `keys_and_service'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:109:in `keys'(irb):40:in `irb_binding'/usr/lib/ruby/1.8/irb/workspace.rb:52:in `irb_binding'/usr/lib/ruby/1.8/irb/workspace.rb:52.join('
    ')}
    E, [2012-05-24T17:13:07.129424 #3908] ERROR -- : Request was:  /
    E, [2012-05-24T17:13:07.129467 #3908] ERROR -- : Response was: 301 -- Moved Permanently -- <?xml version="1.0" encoding="UTF-8"?>
    <Error><Code>PermanentRedirect</Code><Message>The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.</Message><RequestId>938D78A1BF3F107C</RequestId><Bucket>marios-eu-bucket-2</Bucket><HostId>DxugjFF5zHWF+3DGgi1UEEmhk6Q360PslvBj6LiulQT3SM+cOZ5EjOpIeuODoGHF</HostId><Endpoint>marios-eu-bucket-2.s3-external-3.amazonaws.com</Endpoint></Error>
    URI::InvalidURIError: bad URI(is not URI?): 
            from /usr/lib/ruby/1.8/uri/common.rb:436:in `split'
            from /usr/lib/ruby/1.8/uri/common.rb:485:in `parse'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/errors.rb:222:in `check'
            from /usr/lib/ruby/1.8/benchmark.rb:293:in `measure'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/awsbase/benchmark_fix.rb:30:in `add!'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/errors.rb:209:in `check'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:566:in `request_info_impl'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:316:in `request_info2'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:338:in `request_info3'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:182:in `request_info'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:345:in `incrementally_list_bucket'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:124:in `keys_and_service'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:109:in `keys'
            from (irb):40

After applying the fix from pull request 110, retrieve the EU bucket and GET a specific key, using another endpoint (even default):

    client = Aws::S3.new("KEY", "PASS", {:server=>"s3.amazonaws.com", :connection_mode=>:per_thread})
    bucket = client.bucket("marios-eu-test-bucket")
    blob = bucket.key("marios-eu-test-blob", true)

PROBLEM: REXML blows up as it tries to parse the body, which is empty because this was a HEAD request.

The solution is to check bucket.location and re-initialize the client accordingly. If the fix attached here were more elegant I'd argue harder for it, but since it's so hackish I don't think it's a good idea to include it. The 'fix' basically pushes the re-initialization of the client endpoint into the aws gem rather than leaving it to whatever is using the gem.

Trace for the last example:

    I, [2012-05-24T16:25:27.839806 #3741]  INFO -- : Closing HTTPS connection to s3.amazonaws.com:443
    I, [2012-05-24T16:25:27.840114 #3741]  INFO -- : Opening new HTTPS connection to marios-eu-test-bucket.s3.amazonaws.com:443
    I, [2012-05-24T16:25:29.924931 #3741]  INFO -- : ##### Aws::S3Interface redirect requested: 307 Temporary Redirect #####
    I, [2012-05-24T16:25:29.925157 #3741]  INFO -- : ##### New location: https://marios-eu-test-bucket.s3-external-3.amazonaws.com/?prefix=marios-eu-test-blob #####
    I, [2012-05-24T16:25:29.925988 #3741]  INFO -- : ##### Retry #1 is being performed due to a redirect.  ####
    I, [2012-05-24T16:25:29.926519 #3741]  INFO -- : Closing HTTPS connection to marios-eu-test-bucket.s3.amazonaws.com:443
    I, [2012-05-24T16:25:29.926759 #3741]  INFO -- : Opening new HTTPS connection to marios-eu-test-bucket.s3-external-3.amazonaws.com:443
    I, [2012-05-24T16:25:31.839903 #3741]  INFO -- : Closing HTTPS connection to marios-eu-test-bucket.s3-external-3.amazonaws.com:443
    I, [2012-05-24T16:25:31.840217 #3741]  INFO -- : Opening new HTTPS connection to marios-eu-test-bucket.s3.amazonaws.com:443
    E, [2012-05-24T16:25:32.985431 #3741] ERROR -- : #<RuntimeError: NilClass is not a valid input stream.  It must walk 
    like either a String, an IO, or a Source.>
    /usr/lib/ruby/1.8/rexml/source.rb:21:in `create_from'/usr/lib/ruby/1.8/rexml/parsers/baseparser.rb:133:in `stream='/usr/lib/ruby/1.8/rexml/parsers/baseparser.rb:110:in `initialize'/usr/lib/ruby/1.8/rexml/parsers/streamparser.rb:6:in `new'/usr/lib/ruby/1.8/rexml/parsers/streamparser.rb:6:in `initialize'/usr/lib/ruby/1.8/rexml/document.rb:201:in `new'/usr/lib/ruby/1.8/rexml/document.rb:201:in `parse_stream'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/parsers.rb:136:in `parse'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/parsers.rb:188:in `parse'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/errors.rb:211:in `check'/usr/lib/ruby/1.8/benchmark.rb:293:in `measure'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/awsbase/benchmark_fix.rb:30:in `add!'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/errors.rb:209:in `check'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:566:in `request_info_impl'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:316:in `request_info2'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:338:in `request_info3'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:182:in `request_info'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:636:in `head'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/key.rb:228:in `head'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:129:in `keys_and_service'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:126:in `each'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:126:in `keys_and_service'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:352:in `incrementally_list_bucket'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:124:in `keys_and_service'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:109:in `keys'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:153:in `key'(irb):5:in `irb_binding'/usr/lib/ruby/1.8/irb/workspace.rb:52:in `irb_binding':0.join('
    ')}
    E, [2012-05-24T16:25:32.985676 #3741] ERROR -- : Request was:  /marios-eu-test-blob
    E, [2012-05-24T16:25:32.985799 #3741] ERROR -- : Response was: 307 -- Temporary Redirect -- 
    E, [2012-05-24T16:25:32.985999 #3741] ERROR -- : #<RuntimeError: NilClass is not a valid input stream.  It must walk 
    like either a String, an IO, or a Source.>
    /usr/lib/ruby/1.8/rexml/source.rb:21:in `create_from'/usr/lib/ruby/1.8/rexml/parsers/baseparser.rb:133:in `stream='/usr/lib/ruby/1.8/rexml/parsers/baseparser.rb:110:in `initialize'/usr/lib/ruby/1.8/rexml/parsers/streamparser.rb:6:in `new'/usr/lib/ruby/1.8/rexml/parsers/streamparser.rb:6:in `initialize'/usr/lib/ruby/1.8/rexml/document.rb:201:in `new'/usr/lib/ruby/1.8/rexml/document.rb:201:in `parse_stream'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/parsers.rb:136:in `parse'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/parsers.rb:188:in `parse'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/errors.rb:211:in `check'/usr/lib/ruby/1.8/benchmark.rb:293:in `measure'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/awsbase/benchmark_fix.rb:30:in `add!'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/errors.rb:209:in `check'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:566:in `request_info_impl'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:316:in `request_info2'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:338:in `request_info3'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:182:in `request_info'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:636:in `head'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/key.rb:228:in `head'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:129:in `keys_and_service'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:126:in `each'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:126:in `keys_and_service'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:352:in `incrementally_list_bucket'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:124:in `keys_and_service'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:109:in `keys'/usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:153:in `key'(irb):5:in `irb_binding'/usr/lib/ruby/1.8/irb/workspace.rb:52:in `irb_binding':0.join('
    ')}
    E, [2012-05-24T16:25:32.986167 #3741] ERROR -- : Request was:  /marios-eu-test-blob
    E, [2012-05-24T16:25:32.986287 #3741] ERROR -- : Response was: 307 -- Temporary Redirect -- 
    RuntimeError: NilClass is not a valid input stream.  It must walk 
    like either a String, an IO, or a Source.
            from /usr/lib/ruby/1.8/rexml/source.rb:21:in `create_from'
            from /usr/lib/ruby/1.8/rexml/parsers/baseparser.rb:133:in `stream='
            from /usr/lib/ruby/1.8/rexml/parsers/baseparser.rb:110:in `initialize'
            from /usr/lib/ruby/1.8/rexml/parsers/streamparser.rb:6:in `new'
            from /usr/lib/ruby/1.8/rexml/parsers/streamparser.rb:6:in `initialize'
            from /usr/lib/ruby/1.8/rexml/document.rb:201:in `new'
            from /usr/lib/ruby/1.8/rexml/document.rb:201:in `parse_stream'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/parsers.rb:136:in `parse'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/parsers.rb:188:in `parse'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/errors.rb:211:in `check'
            from /usr/lib/ruby/1.8/benchmark.rb:293:in `measure'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/awsbase/benchmark_fix.rb:30:in `add!'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/errors.rb:209:in `check'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:566:in `request_info_impl'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:316:in `request_info2'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/ses/../awsbase/awsbase.rb:338:in `request_info3'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:182:in `request_info'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:636:in `head'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/key.rb:228:in `head'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:129:in `keys_and_service'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:126:in `each'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:126:in `keys_and_service'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/s3_interface.rb:352:in `incrementally_list_bucket'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:124:in `keys_and_service'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:109:in `keys'
            from /usr/lib/ruby/gems/1.8/gems/aws-2.5.6/lib/s3/bucket.rb:153:in `key'
            from (irb):5
            from :0
    irb(main):006:0>