Closed kmg-stripe closed 3 months ago
537 tests 531 :white_check_mark: 7m 48s :stopwatch: 139 suites 6 :zzz: 139 files 0 :x:
Results for commit c491e86f.
:recycle: This comment has been updated with latest results.
@Andyz26 We're hoping to get this one and https://github.com/Netflix/mantis/pull/696 out today in order to meet a prod readiness deadline of tomorrow. We're close!
@crioux-stripe @Andyz26 is out-of-office. I have reviewed the PR and the change makes sense to me. Let me know if you need more help.
@hmitnflx thanks!
One minor comment, @kmg-stripe I found it useful to track total time spent getting all records in paginated results. Also, if there's a way to specify page-size to dynamodb, it'd be great.
The change looks good otherwise. 🍀 for prod readiness
ack! i'll add that in a follow-on PR, since we want to verify this fixes the issue we are seeing.
Also setting explicit pagination limit.
This should fix a pretty big problem:
We have hit a case where the master persistently thinks most jobs are inactive, so every time a new leader is elected most jobs are resubmitted. Over time, new jobs have piled up (100s per cluster), which has made the problem worse, since Dynamo is only returning ~30 partition keys (jobIds from MantisWorkers).
Context
Explain context and other details for this pull request.
Checklist
./gradlew build
compiles code correctly./gradlew test
passes all tests