Shopify / job-iteration

Makes your background jobs interruptible and resumable by design.
https://www.rubydoc.info/gems/job-iteration
MIT License
1.12k stars, 43 forks

job-iteration can cause infinite loop when using multi-column cursor in different tables with the same attribute #457

Open pedropb opened 8 months ago

pedropb commented 8 months ago

Reproduction steps:

Expected behaviour:

Other notes

pedropb commented 8 months ago

My initial investigation points to cursor_value not considering the table names when updating the cursor value.

These are the queries logged:

Product Load (0.9ms)  SELECT `products`.* FROM `products` INNER JOIN `comments` ON `comments`.`product_id` = `products`.`id` ORDER BY products.id,comments.id LIMIT 2
Product Load (1.0ms)  SELECT `products`.* FROM `products` INNER JOIN `comments` ON `comments`.`product_id` = `products`.`id` WHERE (products.id > '2' OR (products.id = '2' AND (comments.id > '2'))) ORDER BY products.id,comments.id LIMIT 2
Product Load (1.0ms)  SELECT `products`.* FROM `products` INNER JOIN `comments` ON `comments`.`product_id` = `products`.`id` WHERE (products.id > '3' OR (products.id = '3' AND (comments.id > '3'))) ORDER BY products.id,comments.id LIMIT 2

vs the expected positions:

# [ product.id, comment.id ]
[[1, 1], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]]

Once the cursor reaches product.id = 3 it gets stuck: the query keeps returning [3, 4] and [3, 5], because cursor_value does not use the table name (in other words, column.to_s.split(".").first) from @columns when it updates the cursor value:

https://github.com/Shopify/job-iteration/blob/24c30536a2390584db8c7f7a7694641ceb6acc10/lib/job-iteration/active_record_enumerator.rb#L46-L54
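To make the failure easier to follow, here is a minimal, database-free sketch of what I believe is happening. It is not the gem's code: ROWS, fetch_batch, naive_cursor, qualified_cursor and run are hypothetical names for illustration. It assumes the cursor is built by reading only the attribute name (the part after the ".") from the loaded Product record, and that the record only carries products.* columns. The "qualified" variant is just one possible direction for a fix, assuming the comment id were also available on the record.

```ruby
# Joined rows as [product_id, comment_id], same as the expected positions above.
ROWS = [[1, 1], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]].freeze
COLUMNS = ["products.id", "comments.id"].freeze

# Hypothetical stand-in for the joined query:
#   WHERE (products.id > p OR (products.id = p AND comments.id > c))
#   ORDER BY products.id, comments.id LIMIT 2
# Array#<=> compares element by element, which is the same ordering.
def fetch_batch(cursor, limit: 2)
  rows = cursor ? ROWS.select { |row| (row <=> cursor) > 0 } : ROWS
  rows.first(limit)
end

# Assumed current behaviour: the table name is dropped, so both columns
# resolve to the record's own "id" (the product id).
def naive_cursor(row)
  record = { "id" => row[0] } # SELECT products.* only exposes product columns
  COLUMNS.map { |column| record[column.split(".").last] }
end

# Hypothetical table-aware variant: each column is looked up by its
# fully qualified name, assuming the comment id was selected too.
def qualified_cursor(row)
  record = { "products.id" => row[0], "comments.id" => row[1] }
  COLUMNS.map { |column| record[column] }
end

# Runs up to max_batches batches and returns the cursor after each one.
def run(cursor_for, max_batches: 10)
  cursor = nil
  cursors = []
  max_batches.times do
    batch = fetch_batch(cursor)
    break if batch.empty?

    cursor = cursor_for.call(batch.last)
    cursors << cursor
  end
  cursors
end

naive = run(method(:naive_cursor))
fixed = run(method(:qualified_cursor))

puts "naive: #{naive.first(4).inspect} ..." # cursor stops advancing at [3, 3]
puts "fixed: #{fixed.inspect}"              # terminates after [3, 6]
```

In this sketch the naive cursor goes [2, 2], then [3, 3], and then stays at [3, 3], which lines up with the WHERE clauses in the logged queries. The qualified variant advances to [3, 4] and [3, 6] and then stops because the next batch is empty.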