Closed axelson closed 6 months ago
The test failure doesn't look related to the PR ("test cron jobs are scheduled using the configured timezone (Oban.Plugins.CronTest)")
@axelson Thanks for opening this PR and kicking off this work. The linked commit was inspired by your efforts, and gives some attribution to this PR.
The approach in e7f91f4 differs in a few substantial ways:
Ah, that makes sense. I didn't think about telemetry.
Lightly extend the watchman's shutdown period, just long enough to emit the telemetry event rather than adding a quarter of a second to the total shutdown time for each queue.
Yeah I wasn't sure how long would be needed to reliably emit the event.
Thanks for adding support for this!
If jobs take longer than
:shutdown_grace_period
(by default 15 seconds) then they are brutally killed. Log a message when this occurs. The message looks like:This is important because the majority of the time in a running you usually want your jobs to finish within the
:shutdown_grace_period
and with this new log message it is now easier to detect when that doesn't happen and which jobs were affected which will allow developers to then improve those jobs so that they finish in time (e.g. by splitting a job into multiple units of work).I pass through the
conf.name
toOban.Queue.Watchman
so that it is able to check the appropriate queue status (which is especially important in the tests). I'm not in love withconf_name
but I went with it becausename
was already taken. As an aside I think the typespec forname
is incorrect, it is currentlymodule()
but it is actually receiving a via tuple like{:via, Registry, {Oban.Registry, {Oban, {:producer, "alpha"}}}}
I've added minimal test updates 9enough to verify the new behavior), if this PR looks good then I can make additional updates if needed.