Fix two badly intermittent tests for periodic job enqueuer

brandur commented 2 weeks ago

We're getting some really bad intermittency from a pair of tests in the periodic job enqueuer [1] [2] that need fixing ASAP since they're failing at least one matrix entry in almost every build.

Looking at the tests, they add a pair of jobs after starting the client, each of which operates on a 500 ms interval, and one with RunOnStart. The test case waits for the initial RunOnStart job to appear, then waits again for the next job that wasn't RunOnStart.

I think what's happening is that since 500ms is fairly long for a computer, it's not crazy long, so it's actually possible for the tests in CI to be operating slowly enough that by the time the enqueuer gets around to its first run, it actually has been 500ms already, and therefore both jobs have been enqueued, even the non-RunOnStart one.

Another thing that I realize looking closer at these test cases is that because of the way they're structured, they're slower than hell. Each one requires a wait of 500 ms because it's waiting for the non-RunOnStart job, so running them at high -count iterations is a total non-starter.

Here, I'm suggesting that we simplify things by:

Increase period by 10x from 500ms to 5s.
Don't bother waiting for the non-RunOnStart job to be enqueued.

The fact that the RunOnStart job has been confirmed is a pretty good chance that both jobs were indeed added properly, so I don't think we have to wait for the other one too. This resolves the intermittency problem, but also makes each test case probably on the order of 10,000 times faster.

Fixes #409.

[1] https://github.com/riverqueue/river/actions/runs/9770889034/job/26972789182?pr=413 [2] https://github.com/riverqueue/river/actions/runs/9770889034/job/26972789334?pr=413

brandur commented 2 weeks ago

@bgentry Ran the test matrix 4x times and although I got a different intermittent error (will try to fix that separately), this pull seems to fix its target.

brandur commented 1 week ago

+1. Thanks!

riverqueue / river

Fix two badly intermittent tests for periodic job enqueuer #416