The current hung-job detection doesn't report whether any jobs were detected as hung and re-queued; all you see is this:
JQAutoscaler::detectHungJobs() running... done!
This led to a very painful debugging cycle: a poorly configured maxRuntimeSeconds was causing our jobs to be constantly re-queued out-of-band, and because of the output above we wrongly assumed that the hung jobs detector wasn't mutating jobs.
If it had reported X jobs re-queued, it would have pointed us to the problem quickly.
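
To make the request concrete, here is a minimal, language-agnostic sketch in Python of the behavior being asked for. It is not the library's actual code: the `Job` class, `detect_hung_jobs`, `format_report`, and the exact wording of the report line are all hypothetical and only illustrate the idea of counting re-queued jobs and including that count in the output.

```python
import time
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Job:
    """Hypothetical job record; field names are assumptions for this sketch."""
    job_id: int
    started_at: float            # epoch seconds when the job began running
    max_runtime_seconds: float   # allowed runtime before the job counts as hung
    status: str = "running"


def detect_hung_jobs(jobs: Iterable[Job], now: Optional[float] = None) -> int:
    """Re-queue running jobs that exceeded their max runtime.

    Returns the number of jobs that were re-queued so the caller can report it.
    """
    now = time.time() if now is None else now
    requeued = 0
    for job in jobs:
        if job.status == "running" and now - job.started_at > job.max_runtime_seconds:
            job.status = "queued"  # the mutation that was previously invisible
            requeued += 1
    return requeued


def format_report(requeued: int) -> str:
    """Build a status line that includes the re-queue count (hypothetical wording)."""
    return f"detectHungJobs() running... done! {requeued} hung job(s) re-queued."
```

With a report line like this, a count that stays above zero on every run would have immediately suggested that maxRuntimeSeconds was set too low.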