Retrying shoryuken message failure in middleware

venkateshan-niladri commented 2 years ago

My Shoryuken middleware is failing with this error:

can't add a new key into hash during iteration :: 
["/data/helpkit/shared/bundler_gems/ruby/2.3.0/gems/activerecord-3.2.22.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:375:in `[]='", 
"/data/helpkit/shared/bundler_gems/ruby/2.3.0/gems/activerecord-3.2.22.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:375:in `establish_connection'", 
"/data/helpkit/shared/bundler_gems/ruby/2.3.0/gems/active_record_shards-3.2.1/lib/active_record_shards/connection_switcher.rb:194:in `establish_shard_connection'", 
"/data/helpkit/shared/bundler_gems/ruby/2.3.0/gems/active_record_shards-3.2.1/lib/active_record_shards/connection_switcher.rb:162:in `switch_connection'"]

We are raising StandardError in such cases. Will the message be retried? If not, do we need to manually enqueue the message back to the SQS queue?

We are using auto_delete: true in the Shoryuken worker.

matt-taylor commented 2 years ago

Hey @venkateshan-niladri. Can you give a little more detail about your middleware and what version of Shoryuken and/or Rails (if used) you are using?

venkateshan-niladri commented 2 years ago

Hi @matt-taylor. We use the middleware to wrap the call to the Shoryuken worker. It logs that the worker process has started, along with the thread details, class name, job ID, DB shard, etc.

Shoryuken version: 5.0.4
Rails: 3.2.22.5
Ruby: 2.3.8

Our middleware looks roughly like this:

module Middleware
  module Shoryuken
    module Server
      class JobDetailsLogger
        LOG_FILE = 'xx.log'
        LOG_PATH = "#{Rails.root}/xx/#{LOG_FILE}"

        def call(worker_instance, queue, sqs_msg, body)
          started_at = Time.zone.now
          Rails.logger.info { "started at #{started_at}" }
          begin
            yield
          rescue StandardError => e
            Rails.logger.error "Error in Shoryuken #{e.message} :: #{e.backtrace[0..3].inspect}"
            raise
          ensure
            # Elapsed wall-clock time in seconds, logged below in milliseconds.
            elapsed = (Time.zone.now - started_at).round(2)
            Rails.logger.info { "completed in: #{elapsed * 1000} ms" }
          end
        end
      end
    end
  end
end

This middleware is throwing the exception as mentioned in the issue description.

matt-taylor commented 2 years ago

Hi @venkateshan-niladri. To answer your exact question: it depends on where the failing middleware sits in the middleware stack. If it is placed after the ExponentialBackoffRetry middleware, the message will go through the timeout retry policy. If it is placed before, the message will be lost and you should see this error message in your logs: https://github.com/ruby-shoryuken/shoryuken/blob/v5.0.4/lib/shoryuken/processor.rb#L25
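To illustrate why the position matters, here is a minimal plain-Ruby sketch of a nested middleware chain. It is a toy model, not Shoryuken's actual implementation: `ToyChain`, `ToyRetry`, and `ToyFailingLogger` are hypothetical names, and "scheduling a retry" is reduced to recording the error message. The only assumption it encodes is that each middleware wraps everything placed after it, so a retry middleware can only rescue errors raised further down the chain.

```ruby
# Toy model of a nested middleware chain (hypothetical, NOT Shoryuken's code).
# Each middleware wraps everything that comes after it in the chain.
class ToyChain
  def initialize(*middlewares)
    @middlewares = middlewares
  end

  # Call the middlewares in order; the worker block runs at the innermost level.
  def invoke(&worker)
    stack = @middlewares.dup
    step = lambda do
      if stack.empty?
        worker.call
      else
        stack.shift.call(&step)
      end
    end
    step.call
  end
end

# Stand-in for a retry middleware: rescues errors raised by anything it wraps.
class ToyRetry
  attr_reader :scheduled

  def initialize
    @scheduled = []
  end

  def call
    yield
  rescue StandardError => e
    @scheduled << e.message # stand-in for "schedule a retry"
  end
end

# Stand-in for a middleware that fails before it ever reaches the worker.
class ToyFailingLogger
  def call
    raise StandardError, "failure inside the logging middleware"
  end
end

# Case 1: the failing middleware sits AFTER the retry middleware.
retry_after = ToyRetry.new
ToyChain.new(retry_after, ToyFailingLogger.new).invoke { :work }

# Case 2: the failing middleware sits BEFORE the retry middleware.
retry_before = ToyRetry.new
escaped = nil
begin
  ToyChain.new(ToyFailingLogger.new, retry_before).invoke { :work }
rescue StandardError => e
  escaped = e # the error bypasses the retry middleware entirely
end

puts "case 1 retries scheduled: #{retry_after.scheduled.size}"
puts "case 2 retries scheduled: #{retry_before.scheduled.size}, escaped: #{escaped.class}"
```

In the first case the retry stand-in sees the error; in the second it never runs, so the error propagates to whatever called the chain.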

Unfortunately, Rails 3 is no longer supported by the Rails community, so I think we can be of only minimal help here.

From what I see, the error is caused by the Active Record connection pool. Active Record pool issues can vary drastically between Rails versions. One thing to note is that Shoryuken uses concurrent-ruby to process messages concurrently. Modifying database connections in the middleware might cause issues, since the connections should stay unchanged once the boot sequence has finished.
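For context, Ruby raises this exact error whenever a key is added to a Hash while that Hash is being iterated. The sketch below reproduces the message in a single thread and then shows one common way to guard a shared registry with a Mutex. It is only an illustration: `pools` and `GuardedRegistry` are hypothetical stand-ins, not Active Record's or active_record_shards' internals, and whether a lock like this fits the real connection code depends on the Rails version.

```ruby
# Hypothetical stand-in for a shared map of connection pools.
pools = { "shard_0" => :pool_a }

error_message = nil
begin
  pools.each_key do |_name|
    pools["shard_1"] = :pool_b # adding a key while the hash is being iterated
  end
rescue RuntimeError => e
  error_message = e.message
end
puts error_message

# One possible guard: serialize writes and hand out snapshots for iteration.
class GuardedRegistry
  def initialize
    @pools = {}
    @lock = Mutex.new
  end

  def register(name, pool)
    @lock.synchronize { @pools[name] = pool }
  end

  # Returns a copy of the keys, so callers never iterate the live hash.
  def names
    @lock.synchronize { @pools.keys }
  end
end

registry = GuardedRegistry.new
threads = 4.times.map do |i|
  Thread.new do
    50.times do |j|
      registry.register("shard_#{i}_#{j}", :pool)
      registry.names.each { |name| name } # iterate a snapshot, not the live hash
    end
  end
end
threads.each(&:join)
puts registry.names.size
```

With several worker threads, the same error can appear when one thread writes to a shared hash while another thread is iterating over it, which would be consistent with the `establish_connection` frames in the backtrace.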

Also, I am not seeing how this raises in the middleware. To help you further:

1. Can you give me the full stack trace?
2. Did you recently update any gems, including concurrent-ruby or aws-sdk-core?

github-actions[bot] commented 2 years ago

This issue is now marked as stale because it hasn't seen activity for a while. Add a comment or it will be closed soon.

github-actions[bot] commented 2 years ago

This issue was closed because it hasn't seen activity for a while.

ruby-shoryuken / shoryuken

Retrying shoryuken message failure in middleware #711