Open grondo opened 1 week ago
It may be useful to add some more stats to `flux module stats job-manager`, such as the number of pending `sched.alloc` and `sched.cancel` requests, where the latter is defined as a `sched.alloc` request that is still pending even after a cancel request has been sent.
I don't see anything that prevents multiple `sched.cancel` requests from being sent for the same pending alloc request, although I'm not sure in what situation that would occur.
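To make the proposed counters concrete, here is a minimal sketch of the bookkeeping they imply. It is only a model, not job-manager code: `PendingAlloc`, `AllocTracker`, and the stat names `alloc_pending`, `cancel_pending`, and `duplicate_cancels` are all hypothetical, and the real module is written in C with its own data structures.

```python
from dataclasses import dataclass, field


@dataclass
class PendingAlloc:
    """Hypothetical record of one outstanding sched.alloc request."""
    jobid: int
    cancels_sent: int = 0  # sched.cancel requests sent while this alloc was pending


@dataclass
class AllocTracker:
    """Hypothetical bookkeeping; not the job-manager's actual implementation."""
    pending: dict = field(default_factory=dict)

    def alloc_sent(self, jobid):
        # A sched.alloc request was sent to the scheduler.
        self.pending[jobid] = PendingAlloc(jobid)

    def cancel_sent(self, jobid):
        # Only count a cancel if the alloc request is still pending.
        req = self.pending.get(jobid)
        if req is not None:
            req.cancels_sent += 1

    def alloc_responded(self, jobid):
        # A final response from the scheduler retires the request.
        self.pending.pop(jobid, None)

    def stats(self):
        reqs = list(self.pending.values())
        return {
            # alloc requests with no final response yet
            "alloc_pending": len(reqs),
            # pending alloc requests for which at least one cancel was sent
            "cancel_pending": sum(1 for r in reqs if r.cancels_sent > 0),
            # cancels beyond the first for the same pending alloc request
            "duplicate_cancels": sum(max(r.cancels_sent - 1, 0) for r in reqs),
        }


tracker = AllocTracker()
for jobid in (1, 2, 3):
    tracker.alloc_sent(jobid)
tracker.cancel_sent(2)
tracker.cancel_sent(2)      # second cancel for the same pending alloc
tracker.alloc_responded(3)
print(tracker.stats())
```

A nonzero `duplicate_cancels` in a model like this would be one way to surface the multiple-cancel case described above, if it ever happens in practice.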
In flux-framework/flux-sched#1222 @trws observed
One theory proposed is that
This issue is open to investigate the situation and ensure the job-manager isn't doing something wrong here.