sorentwo / oban

💎 Robust job processing in Elixir, backed by modern PostgreSQL and SQLite3
https://getoban.pro
Apache License 2.0

Oban Met - Configure Reporter Interval #1120

Open thiagopromano opened 1 month ago

thiagopromano commented 1 month ago

Is your feature request related to a problem? Please describe.

Our system executes millions of jobs every day. To support that, we use DynamicPartitioner and ObanWeb.

During moments of high DB stress, one query that consistently takes a large share of the total "Query Time" is the Oban.Met.Reporter query, which counts the number of jobs in several queue states.

Although it has a built-in backoff based on the number of rows in each queue state, the query is still too expensive and runs too often for our use case, especially during high-stress periods.
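To make the request concrete, here is a minimal sketch of what a count-based backoff with configurable parameters could look like. It is purely illustrative and is not Oban.Met's actual implementation: the module name `BackoffSketch`, the function `next_interval/2`, and the options `:base_interval`, `:max_interval`, and `:rows_per_step` are all hypothetical.

```elixir
defmodule BackoffSketch do
  @moduledoc "Hypothetical count-based backoff; not Oban.Met's actual code."

  # All defaults below are assumptions chosen for illustration only.
  @base_interval 1_000
  @max_interval 60_000
  @rows_per_step 100_000

  @doc """
  Returns the delay in milliseconds before the next count, given the row
  count observed by the previous one. Larger tables are counted less often.
  """
  def next_interval(row_count, opts \\ []) when is_integer(row_count) and row_count >= 0 do
    base = Keyword.get(opts, :base_interval, @base_interval)
    ceiling = Keyword.get(opts, :max_interval, @max_interval)
    step = Keyword.get(opts, :rows_per_step, @rows_per_step)

    # One extra base interval per `step` rows, capped at the ceiling.
    min(base * (1 + div(row_count, step)), ceiling)
  end
end
```

With knobs like these, a system under heavy load could raise the base interval or the ceiling so the count runs less often, without disabling it.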

Describe the Solution You'd Like

There are a few possible solutions, one of which is to allow us to configure the interval or backoff parameters.

The other option would be to allow disabling the Reporter entirely, loading the required data "on demand" when the Oban Web requires it (or never at all).

sorentwo commented 1 month ago

> There are a few possible solutions, one of which is to allow us to configure the interval or backoff parameters.

This is the most likely option.

> The other option would be to allow disabling the Reporter entirely, loading the required data "on demand" when the Oban Web requires it (or never at all).

Disabling the Reporter could also work, but it would leave a portion of the dashboard zeroed out all the time. An "on demand" mode wouldn't work very well, as there's no guarantee that the web node hosting the dashboard is the one doing the counting. It would also break the charts, as there wouldn't be any historical data.

It may also be possible to optimize that query for partitioned tables. If you retain a lot of completed/cancelled/discarded jobs, it would perform numerous scans.

Will you send an email to support with your DynamicPartitioner config and more detailed information on the query time from the DB?

thiagopromano commented 1 month ago

> An "on demand" mode wouldn't work very well, as there's no guarantee that the web node hosting the dashboard is the one doing the counting.

Would it be possible to enable the Reporter on the leader node only when the Oban Web is requesting data?

Or even load the information the way it was loaded before Oban Web 2.10.0 was introduced?

The main problem here is that using ObanWeb makes it a hard requirement to have this full table count running 24/7. It should be possible to opt out, because the count affects the whole system even when no one is using ObanWeb.
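As a rough illustration of the leader-only, demand-driven idea, the decision could be reduced to a small predicate like the one below. This is a sketch under stated assumptions, not an existing Oban or Oban.Met API: `DemandGateSketch`, `count?/4`, and the 30-second expiry are hypothetical. It also assumes the leader somehow learns when any dashboard last asked for data, which would require a cross-node signal since the web node may not be the leader.

```elixir
defmodule DemandGateSketch do
  @moduledoc "Hypothetical gate for the expensive count; not part of Oban."

  # Assumed: demand expires 30 seconds after the last dashboard request.
  @demand_ttl_ms 30_000

  @doc """
  Returns true only when this node is the leader and a dashboard requested
  data within `ttl` milliseconds. Timestamps are in milliseconds, and a
  `nil` timestamp means no dashboard has requested data yet.
  """
  def count?(leader?, last_request_at, now, ttl \\ @demand_ttl_ms)
  def count?(false, _last_request_at, _now, _ttl), do: false
  def count?(true, nil, _now, _ttl), do: false
  def count?(true, last_request_at, now, ttl), do: now - last_request_at <= ttl
end
```

A gate like this would stop the count while no one is watching, at the cost of gaps in the historical data used by the charts.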

> Will you send an email to support with your DynamicPartitioner config and more detailed information on the query time from the DB?

Yes, I've just sent an email with more info.