The retry_count < max_retries check is performed after retry_count is incremented. This means we need to use <= (or equivalently > with the conditional blocks reversed) to check if the max retries was exceeded. Otherwise, we will have an off-by-one error in the check.
For example, if we have max_retries = 1, retry_count will be incremented to equal 1 before the check, the check will fail because retry_count(1) == max_retries(1) and not <, and therefore the job will not retry.
Solution
Flip the conditional blocks and use retry_count > max_retries so the check will fail as expected when the max_retries is exceeded and not before.
I also noticed that no error log is emitted when the job fails and the max_retries is exceeded, so I added a log for this case.
Problem
The
retry_count < max_retries
check is performed afterretry_count
is incremented. This means we need to use<=
(or equivalently>
with the conditional blocks reversed) to check if the max retries was exceeded. Otherwise, we will have an off-by-one error in the check.For example, if we have
max_retries = 1
,retry_count
will be incremented to equal1
before the check, the check will fail becauseretry_count(1) == max_retries(1)
and not<
, and therefore the job will not retry.Solution
Flip the conditional blocks and use
retry_count > max_retries
so the check will fail as expected when themax_retries
is exceeded and not before.I also noticed that no error log is emitted when the job fails and the
max_retries
is exceeded, so I added a log for this case.