Problem: The job-exec module drains nodes with what it considers "unkillable" processes after max-kill-count attempts have been made to terminate the job shell. However, it is difficult for admins to determine how long that actually took, because the module retries the kill with an exponential backoff, capped at 300s between attempts.
Consider logging the total time waited before draining nodes for reference.
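
To illustrate why the elapsed time is hard to guess from max-kill-count alone, here is a minimal sketch of the arithmetic. Only the 300s cap and the max-kill-count limit come from the description above. The function name, the 5s initial delay, the doubling factor, and the assumption that one wait follows every attempt are hypothetical and may not match the module's actual behavior.

```python
def total_kill_wait(max_kill_count, initial_delay=5.0, factor=2.0, max_delay=300.0):
    """Return the total seconds waited across max_kill_count kill attempts.

    Assumptions (hypothetical, for illustration only): the first wait is
    initial_delay seconds, each later wait is multiplied by factor, and
    every wait is capped at max_delay (the 300s cap mentioned above).
    """
    total = 0.0
    delay = initial_delay
    for _ in range(max_kill_count):
        total += delay                          # wait after this kill attempt
        delay = min(delay * factor, max_delay)  # back off, capped at max_delay
    return total


if __name__ == "__main__":
    # Print the cumulative wait for a few hypothetical max-kill-count values.
    for count in (1, 4, 8):
        print(count, total_kill_wait(count))
```

Because the total depends on the starting delay, the growth factor, and the cap, the same max-kill-count can correspond to very different wall-clock times, which is the reason a logged total would be useful.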