joekhoobyar / shoryuken-later

A scheduling plugin using Dynamo DB for Shoryuken
GNU Lesser General Public License v3.0

Poller not finding messages #6

Open farski opened 9 years ago

farski commented 9 years ago

I may just be doing some math wrong, but I seem to be having a problem.

I have a table, `production_scour-audio_shoryuken-later`, that has a lot of items in it. One item has a `perform_at` value of 1427650294.

As I'm writing this, `(Time.now + Shoryuken::Later::MAX_QUEUE_DELAY).to_i` is 1427675975, which is greater than the `perform_at` value of that item.

Running `client.first_item 'production_scour-audio_shoryuken-later', 'perform_at' => { attribute_value_list: [ (Time.now + Shoryuken::Later::MAX_QUEUE_DELAY).to_i ], comparison_operator: 'LT' }` returns `nil`.

If I do a scan of the table in the AWS Console [perform_at, Number, less than, 1427675975], I get a bunch of results. So maybe this is just an issue with the AWS SDK? As best I could tell, this was all working yesterday.

farski commented 9 years ago

If I run

```ruby
ddb.scan(table_name: 'production_scour-audio_shoryuken-later', limit: 100, scan_filter: {"perform_at" => {attribute_value_list: [2427675975],comparison_operator:"LT"}})
```

(i.e. some date far in the future), I get back the same number of items (82) that I get in the console by scanning with that number. However, if I run

```ruby
ddb.scan(table_name: 'production_scour-audio_shoryuken-later', limit: 100, scan_filter: {"perform_at" => {attribute_value_list: [1427675975],comparison_operator:"LT"}})
```

I get back 0 items, whereas in the console, with that filter value, I get back some (67).

farski commented 9 years ago

Actually, what I just said isn't quite true. With the correct filter value and a limit of 1, I get back no results. If I increase the limit even to 10, I start getting results.

farski commented 9 years ago

From the documentation for the `:limit` option:

> `:limit => Integer`
>
> The maximum number of items to evaluate (not necessarily the number of matching items). If DynamoDB processes the number of items up to the limit while processing the results, it stops the operation and returns the matching values up to that point, and a key in LastEvaluatedKey to apply in a subsequent operation, so that you can pick up where you left off. Also, if the processed data set size exceeds 1 MB before DynamoDB reaches this limit, it stops the operation and returns the matching values up to the limit, and a key in LastEvaluatedKey to apply in a subsequent operation to continue the operation. For more information, see Query and Scan in the Amazon DynamoDB Developer Guide.

Based on that, it seems like setting limit to 1 could cause exactly this kind of problem: the single item that gets evaluated doesn't meet the filter criteria, so the scan returns nothing even though there are items in the table that would match.
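
For illustration, here is a minimal sketch of that behavior, assuming an `Aws::DynamoDB::Client` (aws-sdk v2) named `ddb` with region and credentials already configured, and using the table name and cutoff timestamp from this thread; it is not code from shoryuken-later itself:

```ruby
require 'aws-sdk'

ddb    = Aws::DynamoDB::Client.new   # assumes region/credentials are configured
table  = 'production_scour-audio_shoryuken-later'
cutoff = 1427675975

filter = { 'perform_at' => { attribute_value_list: [cutoff],
                             comparison_operator:  'LT' } }

# With limit: 1, DynamoDB evaluates exactly one item. If that item's
# perform_at is not < cutoff, the response contains zero matches even
# though other items in the table would satisfy the filter.
resp = ddb.scan(table_name: table, limit: 1, scan_filter: filter)

resp.count          # number of items that passed the filter -- can be 0 here
resp.scanned_count  # number of items evaluated -- 1

# last_evaluated_key shows the scan stopped early; continuing from it
# will eventually reach the matching items.
if resp.last_evaluated_key
  resp = ddb.scan(table_name:          table,
                  limit:               1,
                  exclusive_start_key: resp.last_evaluated_key,
                  scan_filter:         filter)
end
```

The key distinction is that `scanned_count` counts items evaluated while `count` counts items that passed the filter, and `last_evaluated_key` is what lets a caller continue past the non-matching items.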

joekhoobyar commented 9 years ago

Sounds like a good catch. I will pick this up at the office tomorrow morning; I think it is clearly a bug.

farski commented 9 years ago

It doesn't look like there's a way to limit the number of items being returned separately from the number being scanned, so it's probably unavoidable that scans could/will end up returning a bunch of items, up to that 1 MB limit. Basically, rather than `next_item` doing a separate scan, it probably makes sense to just work through however many items the un-limited scan returned.

While it's probably fairly safe to avoid dealing with the LastEvaluatedKey to continue a maxed-out scan, there are situations where that could strand some items in the table forever. So when a scan does hit the imposed limit, it might be worth making subsequent out-of-cycle scans until they come back empty.
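
For illustration, a minimal sketch of following LastEvaluatedKey until the scan is exhausted, again assuming an `Aws::DynamoDB::Client` named `ddb`; the `all_due_items` helper is hypothetical and not part of shoryuken-later:

```ruby
# Hypothetical helper: keep scanning until DynamoDB reports no
# last_evaluated_key, so no due item is stranded behind the 1 MB page limit.
def all_due_items(ddb, table, cutoff)
  items  = []
  params = {
    table_name:  table,
    scan_filter: { 'perform_at' => { attribute_value_list: [cutoff],
                                     comparison_operator:  'LT' } }
  }

  loop do
    resp = ddb.scan(params)
    items.concat(resp.items)
    break unless resp.last_evaluated_key            # scan is exhausted
    params[:exclusive_start_key] = resp.last_evaluated_key
  end

  items
end
```

A caller could then pass something like `(Time.now + Shoryuken::Later::MAX_QUEUE_DELAY).to_i` as the cutoff and work through the returned items instead of re-scanning for a single item each time.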

joekhoobyar commented 9 years ago

I think it is best to just deal with the LastEvaluatedKey in the proper way. I don't want to provide a solution that "works most of the time". It should handle edge cases correctly.

joekhoobyar commented 9 years ago

The SDK will handle paging through results for you, so I have committed a quick workaround for this issue so that you aren't held up waiting for a more robust solution. I really need to pull these items off in bulk, not just scan for a single item every time. I will push out a patched gem that you can use for now.
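
For reference, in the v2 Ruby SDK the scan response is pageable, so iterating over pages follows LastEvaluatedKey automatically; a rough sketch reusing the `ddb`, `table`, and `filter` from the earlier examples (not the actual committed workaround):

```ruby
resp = ddb.scan(table_name: table, scan_filter: filter)

due_items = []
resp.each_page do |page|          # each page is one DynamoDB response
  due_items.concat(page.items)    # collect the due items in bulk
end
```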

farski commented 9 years ago

Awesome, thanks.

Startouf commented 5 years ago

Hello, is this issue closed by PR #8, then?