sidekiq / sidekiq

Simple, efficient background processing for Ruby
https://sidekiq.org
Other
13.13k stars 2.41k forks source link

Timeout::Error Waited 1 sec #4318

Closed KidA001 closed 4 years ago

KidA001 commented 4 years ago

Ruby version: 2.6.3 Rails 5.1 Sidekiq 6.0.0

I am running a rake task that stays alive and receives messages from PubSub and immediately sends them to a background job to be processed via ActiveJob. We manage both of these in our Procfile:

worker: bundle exec sidekiq
pubsub: bundle exec rake run_analytics_queue_processor

and run our application with bundle exec foreman start

We have config.active_job.queue_adapter = :sidekiq set in our application.rb

Our sidekiq initializer looks like:

url = if CREDENTIALS.include?(:redis_url)
        CREDENTIALS[:redis_url]
      else
        'redis://127.0.0.1:6379/0'
      end

Sidekiq.configure_server do |config|
  config.redis = { url: url, id: nil, timeout: 5  }
end

Sidekiq.configure_client do |config|
  config.redis = { url: url, id: nil, size: 12, timeout: 5 }
end

Our puma config

workers 2
threads 6, 6

preload_app!

rackup      DefaultRackup

port        ENV.fetch('PORT') { 3000 }

environment ENV.fetch('RAILS_ENV') { 'local' }

on_worker_boot do
  begin
    ActiveRecord::Base.connection.disconnect!
  rescue StandardError
    ActiveRecord::ConnectionNotEstablished
  end
  ActiveRecord::Base.establish_connection
end

plugin :tmp_restart

We get bursts of a few thousand messages in a minute. Most of them get processed, but at some point we start receiving Timeout::Error: Waited 1 sec when calling perform_later.

I have spent hours adjusting the pool size and timeout to no avail. What am I missing and what in my config is incorrect? Any help greatly appreciated

mperham commented 4 years ago

Have you read the Using Redis wiki page?

On Oct 7, 2019, at 20:28, Brian Vogelgesang notifications@github.com wrote:

 Ruby version: 2.6.3 Rails 5.1 Sidekiq 6.0.0

I am running a rake task that stays alive and receives messages from PubSub and immediately sends them to a background job to be processed via ActiveJob. We manage both of these in our Procfile:

worker: bundle exec sidekiq pubsub: bundle exec rake run_analytics_queue_processor and run our application with bundle exec foreman start

We have config.active_job.queue_adapter = :sidekiq set in our application.rb

Our sidekiq initializer looks like:

url = if CREDENTIALS.include?(:redis_url) CREDENTIALS[:redis_url] else 'redis://127.0.0.1:6379/0' end

Sidekiq.configure_server do |config| config.redis = { url: url, id: nil, timeout: 5 } end

Sidekiq.configure_client do |config| config.redis = { url: url, id: nil, size: 12, timeout: 5 } end Our puma config

workers 2 threads 6, 6

preload_app!

rackup DefaultRackup

port ENV.fetch('PORT') { 3000 }

environment ENV.fetch('RAILS_ENV') { 'local' }

on_worker_boot do begin ActiveRecord::Base.connection.disconnect! rescue StandardError ActiveRecord::ConnectionNotEstablished end ActiveRecord::Base.establish_connection end

plugin :tmp_restart We get bursts of a few thousand messages in a minute. Most of them get processed, but at some point we start receiving Timeout::Error: Waited 1 sec when calling perform_later.

I have spent hours adjusting the pool size and timeout to no avail. What am I missing and what in my config is incorrect? Any help greatly appreciated

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

KidA001 commented 4 years ago

@mperham I have. I spent the last couple days troubleshooting this before coming here. I tried setting network_timeout but that didn't change. My redis instance is set to 1GB and we only use about 60-100mb. I have been watching the connections and see them spike to around 80.

I thought the high-traffic might be causing a latency issue, but setting timeout and network_timeout doesn't seem to be changing anything. The error message continues to says waited 1 sec regardless of what I set it to so I'm a little perplexed.

According to the redis gem docs setting timeout should set all 3 read/write/network

mperham commented 4 years ago

Have you gone thru the slowlog to see if there are operations which are spiking latency?

On Oct 7, 2019, at 21:23, Brian Vogelgesang notifications@github.com wrote:

 @mperham I have. I spent the last couple days troubleshooting this before coming here. I tried setting network_timeout but that didn't change. My redis instance is set to 1GB and we only use about 60-100mb. I have been watching the connections and see them spike to around 80.

I thought the high-traffic might be causing a latency issue, but setting timeout and network_timeout doesn't seem to be changing anything. The error message continues to says waited 1 sec regardless of what I set it to I'm a little perplexed.

According to the redis gem docs setting timeout should set all 3 read/write/network

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

KidA001 commented 4 years ago

@mperham we provisioned redis through google's memory store via terraform. It is unclear to me at the moment how I can get access to the redis-cli to check slowlog. I'm digging through docs to see where I can SSH in and do this, thank you.

In the meantime I would love to bandaid the timeout issue, it's being thrown when I call perform_later - Do you have any thoughts as to what setting/argument I need to update to increase timeout from 1s? I'm unclear if it's a sidekiq worker timeout, or redis connection settings or google PubSub

image

There's some meta-programming here but klass.perform_later is an active job object. Any thoughts on the error above?

mperham commented 4 years ago

Adding network_timeout: 5 to the Redis config in your Sidekiq initializer should work.

Perform_later is ActiveJob and AJ’s APIs are slower and less efficient than Sidekiq’s native APIs 100% of the time. This could be one issue.

On Oct 7, 2019, at 21:50, Brian Vogelgesang notifications@github.com wrote:

 @mperham we provisioned redis through google's memory store via terraform. It is unclear to me at the moment how I can get access to the redis-cli to check slowlog. I'm digging through docs to see where I can SSH in and do this, thank you.

In the meantime I would love to bandaid the timeout issue, it's being thrown when I call perform_later - Do you have any thoughts as to what setting/argument I need to update to increase timeout from 1s? I'm unclear if it's a sidekiq worker timeout, or redis connection settings or google PubSub

There's some meta-programming here but klass.perform_later is an active job object. Any thoughts on the error above?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

fjyaniez commented 4 years ago

@KidA001 did you find a solution? I can also confirm network_timeout doesn't make a difference, same error message: Timeout::Error Waited 1 sec

pazaricha commented 4 years ago

For anyone interested, the "1 second" comes from here: https://github.com/mperham/sidekiq/blob/4338695727d0bf16c9bf90d4170c55232bfc0957/lib/sidekiq/redis_connection.rb#L31

You can change this value by setting pool_timeout as an option inside for your sidekiq_server redis configuration like so:

Sidekiq.configure_server do |config|
  config.redis = { url: Rails.application.secrets.redis_url, network_timeout: 5, pool_timeout: 5, size: 22 }
end
jusmat commented 4 years ago

For anyone interested, the "1 second" comes from here:

https://github.com/mperham/sidekiq/blob/4338695727d0bf16c9bf90d4170c55232bfc0957/lib/sidekiq/redis_connection.rb#L31

You can change this value by setting pool_timeout as an option inside for your sidekick server redis configuration like so:

Sidekiq.configure_server do |config|
  config.redis = { url: Rails.application.secrets.redis_url, network_timeout: 5, pool_timeout: 5, size: 22 }
end

I would say, increasing pool_timeout is just running away from the issue that it takes longer than 1 second to get the connection from the pool. I'd strongly suggest to look on why isn't 1 second enough, rather than just increasing the timeout.

pazaricha commented 4 years ago

For anyone interested, the "1 second" comes from here: https://github.com/mperham/sidekiq/blob/4338695727d0bf16c9bf90d4170c55232bfc0957/lib/sidekiq/redis_connection.rb#L31

You can change this value by setting pool_timeout as an option inside for your sidekick server redis configuration like so:

Sidekiq.configure_server do |config|
  config.redis = { url: Rails.application.secrets.redis_url, network_timeout: 5, pool_timeout: 5, size: 22 }
end

I would say, increasing pool_timeout is just running away from the issue that it takes longer than 1 second to get the connection from the pool. I'd strongly suggest to look on why isn't 1 second enough, rather than just increasing the timeout.

I generally agree. My intent here was mostly to help folks understand what is the sources of the "1-second" value.