bensheldon / good_job

Multithreaded, Postgres-based, Active Job backend for Ruby on Rails.
https://goodjob-demo.herokuapp.com/
MIT License
2.6k stars 193 forks

Clarification on max_threads option #538

Open hmnhf opened 2 years ago

hmnhf commented 2 years ago

Hi and thanks for this awesome gem!

I've become a bit confused about the max_threads option.

From reading the following two sections, at first I thought max_threads is a global option that defines the maximum number of threads across all queues. (Meaning that if there are two queues named A and B, their combined total threads' count won't exceed the number of max_threads.)

[From command-line options] Maximum number of threads to use for working jobs.

[From configuration options] sets the maximum number of threads to use when execution_mode is set to :async.

But after reading the following two sections, I figured it's probably the default max value for each queue's threads:

[In pool definition with queues] <participating_queues>:<thread_count> ... <thread_count>: a count overriding for this specific pool the global max-threads.

[In configuring database pool size] 1 connection per query pool thread e.g. --queues=mice:2;elephants:1 is 3 threads. Pool thread size defaults to --max-threads.

But, then again in the Database Connections section, there's the following part which gives the impression that max_threads is the max number of threads across all different queues.

pool: <%= ENV.fetch("RAILS_MAX_THREADS", 5).to_i + ENV.fetch("GOOD_JOB_MAX_THREADS", 4).to_i %>

And then there's the description of how to calculate the number of threads GoodJob requires, which is to be set as GOOD_JOB_MAX_THREADS.

Assuming that I've understood this correctly, I think the confusion comes from the fact that the max_threads option and the GOOD_JOB_MAX_THREADS used in the pool size setting can be two different values. In other words, their name could be default_max_threads_per_queue and GOOD_JOB_REQUIRED_THREADS. Have I misunderstood something or is this correct?

bensheldon commented 2 years ago

@hmnhf thanks for opening this Issue! The Readme was recently updated to try to explain how to calculate total threads (#525) and I think it exposed a conceptual problem.

Briefly to answer your question: GOOD_JOB_MAX_THREADS is "default threads per query execution pool"

The name predates the ability to configure multiple query pools within a single process (e.g. the --queues=mice:2;elephants:1 syntax). The value in --queues overrides the max-threads value.
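The override rule described here can be sketched in a few lines of Ruby. This is only an illustration: `pool_thread_counts` is a hypothetical helper, not GoodJob's own parser, and it assumes a plain `names:count` syntax with pools separated by `;`. It ignores any other queue-string features GoodJob may support.

```ruby
# Hypothetical helper (not part of GoodJob): maps each pool in a --queues
# string to its thread count. A pool without an explicit ":<count>" falls
# back to the default, i.e. the max-threads value.
def pool_thread_counts(queues, default_threads)
  queues.split(";").map do |pool|
    names, count = pool.split(":", 2)
    [names, count ? Integer(count) : default_threads]
  end.to_h
end

# Both pools set their own count, so the default of 5 is never used.
p pool_thread_counts("mice:2;elephants:1", 5)

# "elephants" has no count, so it takes the default of 5.
p pool_thread_counts("mice:2;elephants", 5)
```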

The example given for defining database.yml's pool: value is really just flat out wrong.

To address this:

philipqnguyen commented 4 months ago

@bensheldon so if I have the following queue "high_priority:7;default:4;low_priority:2;*" and GOOD_JOB_MAX_THREADS = 5 that means:

7 threads for high priority
4 threads for default
2 threads for low priority
5 threads for * (this '5' comes from GOOD_JOB_MAX_THREADS)

Totaling 18 threads due to the queues.

Additionally, GoodJob needs 1 thread for a notifier, 1 thread for cron, and 1 thread for an executor, totaling 3 threads as overhead for GoodJob.

With GoodJob running separately from the web process, based on the above example the database.yml should have:

pool: 21

Is that right? I spent the last couple hours reading through the readme and various issues, and that's what I have deduced....
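The tally above can be written out as a short Ruby sketch. The constant and method names are illustrative and do not come from GoodJob. The overhead of 3 threads is the figure given in this comment, not a value read from the library.

```ruby
# Values from the example in this thread; all names here are hypothetical.
QUEUES = "high_priority:7;default:4;low_priority:2;*"
DEFAULT_THREADS = 5   # GOOD_JOB_MAX_THREADS
OVERHEAD_THREADS = 3  # notifier + cron + executor, as counted above

# Sums the per-pool thread counts, using the default for any pool that
# gives no count, then adds the fixed overhead.
def minimum_pool_size(queues, default_threads, overhead)
  execution_threads = queues.split(";").sum do |pool|
    _names, count = pool.split(":", 2)
    count ? Integer(count) : default_threads
  end
  execution_threads + overhead
end

puts minimum_pool_size(QUEUES, DEFAULT_THREADS, OVERHEAD_THREADS) # prints 21
```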

bensheldon commented 4 months ago

@philipqnguyen yep, that's the correct number of threads GoodJob will need from the database and the minimum value you should safely have in your database.yml.

I'm also recommending that people don't set the minimum but rather just set something like 50 or 100 and don't worry about it from the perspective of the application. You'll need to have those connections available from the database, but trying to set a minimum in the database.yml isn't necessary and can lead to not having enough database connections available.

For example, if you're using Active Record load_async in any jobs to further parallelize work, you'll run out of database connections in Active Record's connection pool.

philipqnguyen commented 4 months ago

Thank you for confirming @bensheldon