riverqueue / river

Fast and reliable background jobs in Go
https://riverqueue.com
Mozilla Public License 2.0

Consider per-worker timeout overrides when rescuing jobs #350

Closed: brandur closed this 3 months ago

brandur commented 3 months ago

This one came up when I was thinking about the job specific rescue threshold floated in [1].

I was going to suggest a possible workaround: set an aggressive rescue threshold combined with a low job timeout globally, then override the timeout on any specific job workers that need to run longer than the new low global job timeout. But then I realized this wouldn't work because the job rescuer doesn't account for job-specific timeouts. It just rescues or discards everything it finds beyond the run's rescue threshold, including jobs whose workers have legitimately asked for a longer timeout.

Here, add new logic to address that problem. Luckily we were already pulling worker information to check for a possible custom retry schedule, so we just have to piggyback onto that to also examine a possible custom work timeout.

[1] https://github.com/riverqueue/river/issues/347

brandur commented 3 months ago

Cool, done in https://github.com/riverqueue/river-homepage/pull/93.