sorentwo / oban

💎 Robust job processing in Elixir, backed by modern PostgreSQL and SQLite3
https://getoban.pro
Apache License 2.0
3.18k stars 297 forks source link

Ensure uniqueness across all args if no keys are specified #1031

Closed sannek closed 5 months ago

sannek commented 5 months ago

Hello!

Today I spent a good chunk of time chasing down some weirdness with certain jobs not getting scheduled, and I'm pretty sure this is a bug 🐛 . In certain cases, the order of job insertion affects whether or not uniqueness is enforced. (I noticed this with Oban.Pro.Engine.Smart , and that seems to use the same function for determining uniqueness, so if this is an acceptable fix it would be great if it can be fixed there too!)

Given a worker that has unique: [fields: [:args]] specified the following happens: inserting A first, and then B results in two separate jobs, as expected. But if you insert B first, and then insert A, it is considered to not be unique compared to B.

a = MiniUniq.new(%{id: 1})
b = MiniUniq.new(%{id: 1, extra: :cool_beans})

Initially I just wanted to open an issue, but I got curious and figured I might as well try to fix it. There is no Postgres operator to determine if two jsonb fields are exactly equal, so the best way seems to be to use both the @> and <@ operators to ensure that they are completely the same. I'm not very familiar with SQLite, so I extrapolated the same principle into that engine, but there might be a better way.

I added a test that initially confirmed the unexpected behaviour, and now confirms that it is consistent.

Lastly - locally mix format --check-formatted failed, but it is only upset by files that have not been changed in this PR, so I'm not sure what the best approach is there.

sorentwo commented 5 months ago

Thank you for taking the time to track this down and not only report it, but submit a PR! I've run the updated query in the web demo's database to check the query plan and timing, and it's just as quick as a single containment check (note the data is fake, that's not a real email):

-- Naive with =, unoptimized

 Gather  (cost=1000.00..76844.94 rows=1 width=8) (actual time=347.650..349.367 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on oban_jobs  (cost=0.00..75844.84 rows=1 width=8) (actual time=274.450..338.798 rows=0 loops=3)
         Filter: (args = '{"email": "shanel1968@schneider.net", "project": "web-enabled"}'::jsonb)
         Rows Removed by Filter: 269338
 Planning Time: 0.106 ms
 Execution Time: 349.399 ms
(8 rows)

-- Original, using a single @>

 Bitmap Heap Scan on oban_jobs  (cost=182.13..271.97 rows=81 width=8) (actual time=2.233..2.234 rows=1 loops=1)
   Recheck Cond: (args @> '{"email": "shanel1968@schneider.net", "project": "web-enabled"}'::jsonb)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on oban_jobs_args_index  (cost=0.00..182.11 rows=81 width=0) (actual time=2.221..2.221 rows=1 loops=1)
         Index Cond: (args @> '{"email": "shanel1968@schneider.net", "project": "web-enabled"}'::jsonb)
 Planning Time: 0.124 ms
 Execution Time: 2.262 ms

-- Accurate, using both @> and <@

 Bitmap Heap Scan on oban_jobs  (cost=163.41..253.45 rows=1 width=8) (actual time=2.000..2.001 rows=1 loops=1)
   Recheck Cond: (args @> '{"email": "shanel1968@schneider.net", "project": "web-enabled"}'::jsonb)
   Filter: (args <@ '{"email": "shanel1968@schneider.net", "project": "web-enabled"}'::jsonb)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on oban_jobs_args_index  (cost=0.00..163.41 rows=81 width=0) (actual time=1.989..1.989 rows=1 loops=1)
         Index Cond: (args @> '{"email": "shanel1968@schneider.net", "project": "web-enabled"}'::jsonb)
 Planning Time: 0.135 ms
 Execution Time: 2.023 ms

Finally, I pushed a commit that addressed a module redefinition warning, made cond consistent with others in the codebase, and only encodes the args once in the lite engine.

sannek commented 5 months ago

Thank you for fixing up my PR and reviewing and merging it so quickly! You can probably imagine it was a bit of a headscratcher 😅

sorentwo commented 5 months ago

Thank you for fixing up my PR and reviewing and merging it so quickly!

No, thank you 🙂

These fixes are also ported to Pro for the next patch release.