fog / fog-aws

Module for the 'fog' gem to support Amazon Web Services http://aws.amazon.com/
MIT License
296 stars 353 forks source link

Fog::AWS::Storage don't retry client errors #691

Closed rahim closed 9 months ago

rahim commented 9 months ago

Resolves #690

Excon's default configuration retries HTTP 4xx class errors, which we generally can't expect to be resolved by retry. Combined with relatively recent retry config added in https://github.com/fog/fog-aws/issues/674 we were seeing S3 calls to non-existent objects take multiple seconds when made via fog-aws.

See the issue for detailed explanation of the underlying cause.

Though this resolves the problem, I question whether it's the right approach:

rahim commented 9 months ago

we create something of a foot gun, because someone who chooses to just configure eg retry_limit (or any other arbitrary connection option) will inadvertently set retry_errors back to Excon's default.

Across fog-aws we have the pattern @connection_options = options[:connection_options] || {}

lib/fog/aws/efs.rb:75:          @connection_options = options[:connection_options] || {}
lib/fog/aws/elb.rb:133:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/beanstalk.rb:68:          @connection_options = options[:connection_options] || {}
lib/fog/aws/cdn.rb:148:          @connection_options = options[:connection_options] || {}
lib/fog/aws/cloud_watch.rb:90:          @connection_options = options[:connection_options] || {}
lib/fog/aws/sts.rb:81:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/redshift.rb:81:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/cloud_formation.rb:66:          @connection_options = options[:connection_options] || {}
lib/fog/aws/support.rb:90:          @connection_options = options[:connection_options] || {}
lib/fog/aws/data_pipeline.rb:107:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/dns.rb:100:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/ses.rb:55:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/emr.rb:63:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/glacier.rb:187:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/lambda.rb:85:          @connection_options = options[:connection_options] || {}
lib/fog/aws/auto_scaling.rb:96:          @connection_options = options[:connection_options] || {}
lib/fog/aws/kinesis.rb:39:          @connection_options = options[:connection_options] || {}
lib/fog/aws/kms.rb:94:          @connection_options = options[:connection_options] || {}
lib/fog/aws/storage.rb:549:          @connection_options     = options[:connection_options] || { retry_limit: 5, retry_interval: 1 }
lib/fog/aws/iam.rb:264:          @connection_options = options[:connection_options] || {}
lib/fog/aws/sns.rb:83:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/sqs.rb:81:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/ecs.rb:62:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/dynamodb.rb:80:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/rds.rb:217:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/federation.rb:39:          @connection_options = options[:connection_options] || {}
lib/fog/aws/simpledb.rb:74:          @connection_options     = options[:connection_options] || {}
lib/fog/aws/compute.rb:554:          @connection_options           = options[:connection_options] || {}

That generally makes sense; if connection options aren't set, ensure we pass an empty hash to Excon.

For the custom retry settings that have been applied only to Storage, we'd be better to merge any specified connection options over the defaults. I'll make that change.

geemus commented 9 months ago

I hope I covered it relatively clearly in the other issues, but just in case I'll briefly say that the idea (if I recall) was to make it easier to deal with the eventually consistent nature of AWS. ie if you create an object and then want to address it, you'd like that to just work even if it takes a little bit longer. That said, cases like what you point out show how that was inadvertently applied much too broadly and by seeking to make that case easier it introduces a lot of lag for other use cases. I'm not yet sure what the best approach or solution really is, but tried to leave some ideas and ask some questions across the issues, so I hope we'll be able to talk through them and figure something out. That being said, I'd like to chat a bit more before proceeding with this or similar patches to make sure we've considered the options and impacts thoroughly. Thanks.

geemus commented 9 months ago

@rahim did you need a release with this?

rajyan commented 9 months ago

Thank you very much for looking into this issue and thank you for your swift responses!

I’m not the author of this PR, but I think this issue reported in carrierwave seems related https://github.com/carrierwaveuploader/carrierwave/issues/2697 so I would be grateful if you could release it.

geemus commented 9 months ago

@rajyan Thanks, I hadn't seen that issue, though I'm not certain it will totally fix it either (it should hopefully at least help).

Released in v3.21.0.