@winston opened this issue 8 years ago
@shinnyx @zamakkat @dtthaison @joshteng FYI.
@winston thanks for sharing, it's helpful!
@winston thanks for the update :+1:
Hey @winston!
I thought I'd share our current configuration at Hired. We have pretty much the same stack! Heroku running Sidekiq (Pro) and Puma for the web server. I don't do any of the fancy dynamic sizing calculations that you do; we control everything with environment variables.
Here's our simplified Procfile:
web: puma -C config/puma.rb
worker: bundle exec sidekiq -c ${SIDEKIQ_CONCURRENCY:-5} -i ${DYNO:-1} -q <queue priorities>
clock: bundle exec clockwork Clockfile.rb
Puma config:
min_threads = Integer(ENV['PUMA_MIN_THREADS'] || 0)
max_threads = Integer(ENV['PUMA_MAX_THREADS'] || 3)
threads min_threads, max_threads

port Integer(ENV['PORT'] || 3000)
environment ENV['RACK_ENV']
activate_control_app
state_path 'tmp/puma.state'

if ENV['PUMA_WORKERS'].to_i > 1
  workers ENV['PUMA_WORKERS']
  preload_app!

  on_worker_boot do
    # Valid on Rails 4.1+ using the `config/database.yml` method of setting `pool` size
    # https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#on-worker-boot
    ActiveRecord::Base.establish_connection
    ActiveRecord::Base.connection.execute('set statement_timeout to 10000')
  end
end

on_restart do
  Sidekiq.redis.shutdown { |conn| conn.close }
end
Sidekiq config:
Sidekiq.configure_server do |config|
  config.redis = { url: ENV["REDIS_URL"], namespace: :resque }
  config.reliable_fetch!

  database_url = ENV['DATABASE_URL']
  if database_url
    ENV['DATABASE_URL'] = "#{database_url}?pool=250"
    ActiveRecord::Base.establish_connection
  end

  $elastic = Elasticsearch::Client.new
  Stretchy.client = $elastic
end

Sidekiq.configure_client do |config|
  config.redis = { url: ENV["REDIS_URL"], namespace: :resque }
end

Sidekiq::Client.reliable_push! unless Rails.env.test?
We have found that Heroku's Performance dynos are phenomenally more performant than the Standard ones, and they come with tons of RAM, so we can fit many copies of the app in memory. This allows me to use Puma's cluster mode, and currently we run 12 Puma processes (PUMA_WORKERS) per dyno, each Puma using up to 3 threads (PUMA_MAX_THREADS). I could probably increase the workers significantly and still have enough memory. For Hired production we run 2 Performance-L dynos, plus an additional Performance-L for our internal admins-only app.
For Sidekiq, I use cheaper Standard-2X dynos and have a Hubot script that auto-scales them based on queue depth. Currently I have SIDEKIQ_CONCURRENCY=5, and we always run at least 2 or 3 worker dynos and up to 12 at peak times.
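As an illustration of the sizing rule such a script might apply (the jobs-per-dyno ratio and the bounds here are assumptions, not Hired's actual numbers; a real script would read queue depth via Sidekiq::Stats and change the formation through the Heroku Platform API):

```ruby
# Sketch: map Sidekiq queue depth to a worker dyno count,
# clamped between an always-on floor and a peak-time ceiling.
def desired_worker_dynos(queue_depth, jobs_per_dyno: 500, min: 2, max: 12)
  [[(queue_depth.to_f / jobs_per_dyno).ceil, min].max, max].min
end

desired_worker_dynos(0)      # => 2 (floor)
desired_worker_dynos(3_000)  # => 6
desired_worker_dynos(50_000) # => 12 (ceiling)
```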
We also run a single Clock dyno for scheduled jobs. Mostly these jobs just kick off other jobs or log stuff, so it's on a Standard-1X plan.
I hope this helps!
Hey @heythisisnate, this is awesome! Thanks for sharing! Good to know what Hired is using.
I haven't had a chance to use Performance dynos yet because none of the apps I manage have reached that magnitude, but 2X has been a staple for us for quite a while now, because we have noticed that the memory footprint of Ruby apps has grown quite a bit and it hits the 512MB limit (1X) pretty easily. For 2X (1GB) memory, we only turn up about 1-3 PUMA_WORKERS depending on the app, and sometimes there would be one or two gnarly memory leaks that we have to hunt down. Most of the time, we set PUMA_MAX_THREADS to 5 though. Extrapolating that to 14GB Performance-L servers, your numbers seem about right (our Puma config is very much the same as yours).
RE: Sidekiq, most of what I do is through ENVs as well, except that client_redis_size and server_concurrency_size are derived as formulas.

I noticed that you didn't set size under Sidekiq.configure_client though, and this article seems to suggest that the number can/should be tweaked (otherwise I think it defaults to 25? - might be wasteful? or am I wrong?).

For Sidekiq.configure_server, we set the size via the formula here, instead of through the ENV variable (${SIDEKIQ_CONCURRENCY:-5}), so that I don't have to do the math every time I change certain values. Haha.

For ENV['DATABASE_URL'] = "#{database_url}?pool=250", if you are already using database.yml, I think pool: 250 can go into database.yml too.
Thoughts? Thanks for your reply! :bow:
:+1:
hi @winston, dividing by 2 in puma_workers * (puma_threads / 2) * web_dynos could cause client_redis_size = 0 when puma_threads = 1.

Btw, I am using redis-objects, which needs at least one connection per app instance (web). Is it better to share the connection pool between sidekiq and redis-objects, or should I use another connection pool with size equal to client_redis_size?
@longkt90 Thanks for the feedback!
> could cause client_redis_size = 0

That's true. We probably should modify the puma_threads method to be:

def puma_threads
  [2, Integer(ENV.fetch("WEB_MAX_THREADS", 5))].max
end

So that the minimum is at least 2.
@longkt90 I haven't used redis-objects myself, but I would think this might be better:

> use another connection pool with size equal to client_redis_size
Yeah. That's what we are using. I don't think it's a good idea to share the pool between sidekiq and redis-objects.
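A sketch of that separate pool for redis-objects (a config fragment, assuming the connection_pool, redis, and redis-objects gems; the pool size and timeout here are illustrative):

```ruby
require 'connection_pool'
require 'redis'
require 'redis/objects'

# Give redis-objects its own pool instead of sharing Sidekiq's.
# Per process, a size on the order of client_redis_size should suffice.
Redis::Objects.redis = ConnectionPool.new(size: 3, timeout: 5) do
  Redis.new(url: ENV['REDIS_URL'])
end
```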
Btw, I need to set our db_pool to be redis server concurrency + puma-max-threads. Is that correct?
> Btw I need to set our db_pool to be redis server concurrency + puma-max-threads.
Yes, that's right. I just set it as the max number of connections that my DB allows.
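For what it's worth, since the Active Record pool is per process, a common rule of thumb (general guidance, not from this thread) sizes each process's pool to the busier role rather than the sum. A sketch, with the ENV names from the Procfile and Puma config above:

```ruby
# Per-process AR pool sizing sketch; ENV names follow the configs above.
sidekiq_concurrency = Integer(ENV.fetch("SIDEKIQ_CONCURRENCY", 5))
puma_max_threads    = Integer(ENV.fetch("PUMA_MAX_THREADS", 3))

# Each process has its own pool, so size it for the busier of the two roles:
# a worker process needs one connection per Sidekiq thread, a web process
# one per Puma thread.
db_pool = [sidekiq_concurrency, puma_max_threads].max
puts db_pool # => 5 with the defaults above
```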
:+1: Thank you
It seems to me that this is the relevant code:
config.redis = {
  url: ENV['REDISCLOUD_URL'],
  size: sidekiq_calculations.client_redis_size
}
And this code seems to have an issue:
def client_redis_size
  return DEFAULT_CLIENT_REDIS_SIZE if !Rails.env.production?
  puma_workers * (puma_threads / 2) * web_dynos
end
It should just be:
def client_redis_size
  return DEFAULT_CLIENT_REDIS_SIZE if !Rails.env.production?
  puma_threads / 2
end
puma_workers and web_dynos are not relevant for the connection pool: the pool is shared only within a process, and Puma workers and dynos are separate processes from each other.
The server_concurrency_size is OK, because if you ran below CONCURRENCY + 2, Sidekiq would fail to start, so it will not hurt much.

However, after changing client_redis_size, you do need to modify server_concurrency_size to use puma_workers and web_dynos.

You probably didn't get any error because you just reserved a bigger pool size for the client than you really needed.
@nitzanav The code was transcribed following the explanations detailed in http://bryanrite.com/heroku-puma-redis-sidekiq-and-connection-limits/.
Sidekiq config is often a mystery (to me) and sometimes difficult to know what's exactly "right". The config above has at least worked in all the apps I deployed so far. But do let me know your mileage on the updated config. I am sure it will be a good data point too. Thanks!
@winston This blog indeed shows this formula, but its usage as-is is meant to let you infer the maximum number of connections to be expected and configured on the Redis server side, rather than the Ruby application side. The Ruby application side is a bit different and, AFAIK, should be configured as I described.

What you did will work, but it is not optimal :)
@winston at the time of auto-scaling, how will we update NUMBER_OF_WEB_DYNOS?
@winston I realize this is a couple years old now but I'm only now learning about and using Puma and concurrency so this may still be relevant to anyone else. To help clarify what @nitzanav is saying, I think you're misinterpreting what the blog post is saying in regards to setting the Redis client size. In fact, in the blog Bryan Rite has the following code for setting the size:
Sidekiq.configure_client do |config|
  config.redis = { size: 3, url: ENV["REDIS_URL"], namespace: "your-app" }
end
These Sidekiq settings configure the size per worker process, so you don't need to factor in the puma_workers * or * web_dynos operations. The puma_workers * (puma_threads/2) * web_dynos formula is just telling you what the expected total number of connections will be if your app dynos were to fully utilize each worker and thread.
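For instance, plugging illustrative numbers into that formula (the dyno and worker counts below are assumptions for the example, not anyone's actual setup):

```ruby
# Expected total Redis connections from the web side if every
# Puma worker and thread on every dyno opened its share.
puma_workers = 2   # processes per dyno (WEB_CONCURRENCY)
puma_threads = 5   # threads per process (WEB_MAX_THREADS)
web_dynos    = 3

total = puma_workers * (puma_threads / 2) * web_dynos
# integer division: 5 / 2 == 2, so total == 2 * 2 * 3 == 12
puts total # => 12
```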
Hope this helps!
What's the optimum config for Sidekiq on Heroku with Puma?
There are quite a number of answers on the Internet, but nothing definitive, and most of them come with vague numbers and suggestions or are outdated.
Basically, these are the questions that are often asked:

- What should go into the config/initializers/sidekiq.rb file?
- What should size be?
- What should concurrency be?

The best (and updated) answers I can find are:
With @bryanrite's post as a reference, this is our Sidekiq config:
config/initializers/sidekiq.rb
lib/sidekiq_calculations.rb
The sidekiq_calculations.rb file depends on a number of ENV variables to work, so if you do scale your app (web or workers), do remember to update these ENVs:

- MAX_REDIS_CONNECTION
- NUMBER_OF_WEB_DYNOS
- NUMBER_OF_WORKER_DYNOS

At the same time, WEB_CONCURRENCY and WEB_MAX_THREADS should be the identical ENV variables used to set the number of Puma workers and threads in config/initializers/puma.rb. Our puma.rb looks exactly like what Heroku has proposed.

The only difference to @bryanrite's calculation is that Sidekiq now reserves 5 connections instead of 2, according to this line, and I have also added a paranoid_divisor to bring down the concurrency number and keep it below an 80% threshold.

Let me know how this config works for you. Would love to hear your feedback!
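Since the lib/sidekiq_calculations.rb listing itself did not survive above, here is a hypothetical sketch of what such a calculation class could look like, based only on the formulas discussed in this thread; the class name, ENV defaults, reserved-connection constant, and the exact paranoid_divisor value are all assumptions:

```ruby
# Hypothetical reconstruction for illustration -- not the original file.
class SidekiqCalculations
  SIDEKIQ_RESERVED_CONNECTIONS = 5 # Sidekiq reserves 5 connections per process
  PARANOID_DIVISOR = 1.25          # keeps usage around 80% of the plan limit

  # Redis connections the web (client) side may need in total.
  def client_redis_size
    puma_workers * (puma_threads / 2) * web_dynos
  end

  # Per-process Sidekiq concurrency that stays within the Redis plan limit.
  def server_concurrency_size
    available = max_redis_connection - client_redis_size -
                SIDEKIQ_RESERVED_CONNECTIONS * worker_dynos
    ((available / worker_dynos) / PARANOID_DIVISOR).floor
  end

  private

  def max_redis_connection
    Integer(ENV.fetch("MAX_REDIS_CONNECTION", 20))
  end

  def web_dynos
    Integer(ENV.fetch("NUMBER_OF_WEB_DYNOS", 1))
  end

  def worker_dynos
    Integer(ENV.fetch("NUMBER_OF_WORKER_DYNOS", 1))
  end

  def puma_workers
    Integer(ENV.fetch("WEB_CONCURRENCY", 2))
  end

  def puma_threads
    # minimum of 2 so that the division by 2 never yields 0
    [2, Integer(ENV.fetch("WEB_MAX_THREADS", 5))].max
  end
end
```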
Thank you for reading.
@winston :pencil2: Jolly Good Code