A question sysadmins and developers get often is "why is job X not running?"
It seems like Fluxion could provide insights to make this question easier to answer, perhaps even in the output of flux jobs.
Some reasons that we have to manually determine now include:
the job is waiting for higher priority jobs to be scheduled
the job's constraints request resources that are currently unavailable
the job is the highest priority job, but is waiting for resources to become available
A simple solution would be for Fluxion to return a reason or similar field in the scheduler annotations when it can provide one. This could be made available in flux jobs.
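To make the idea concrete, here is a minimal sketch of what such an annotation could look like and how a front end might display it. Everything in it is an assumption for illustration: the `PendingReason` enum, the `reason` key under `sched`, and both helper functions are hypothetical and are not part of Fluxion or flux-core today.

```python
from enum import Enum
from typing import Optional


class PendingReason(Enum):
    """Hypothetical reasons, taken from the list in this issue."""
    HIGHER_PRIORITY = "waiting for higher priority jobs to be scheduled"
    CONSTRAINT_UNAVAILABLE = "constraints request currently unavailable resources"
    WAITING_FOR_RESOURCES = "highest priority, waiting for resources"


def build_annotation(reason: Optional[PendingReason]) -> dict:
    """Return a scheduler annotation update (hypothetical layout).

    The field is omitted when the scheduler cannot provide a reason,
    matching the "when it can provide one" wording above.
    """
    if reason is None:
        return {}
    return {"sched": {"reason": reason.value}}


def format_reason(annotations: dict) -> str:
    """Render the reason for display, using "-" when none is present."""
    return annotations.get("sched", {}).get("reason", "-")


if __name__ == "__main__":
    pending = build_annotation(PendingReason.HIGHER_PRIORITY)
    print(format_reason(pending))
    print(format_reason(build_annotation(None)))
```

Keeping the field optional means existing consumers of the annotations are unaffected when no reason is available.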
Another, perhaps longer-term, solution would be to provide an RPC that exposes a snapshot of the current schedule, if one could be made available.