flux-framework / flux-core

core services for the Flux resource management framework
GNU Lesser General Public License v3.0

shell: doom: include hostname of rank that caused early exit if possible #6050

grondo closed this issue 3 months ago

grondo commented 3 months ago

If a parallel job within a batch script is terminated by the doom plugin, the error message prints only the ID of the task that exited early, not the hostname it ran on.

The hostname should be included where possible: once the batch job has terminated, flux job taskmap can no longer be used to map the task ID to a hostname.
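
For illustration only, here is a rough sketch of the task ID to hostname lookup that becomes unavailable after the job ends. It assumes the taskmap is a list of [nodeid, nnodes, ppn, repeat] blocks (my reading of the Flux task map encoding) and that an ordered host list is still at hand. The function names and host names are hypothetical, and this is not the code used by the doom plugin or by flux job taskmap.

```python
def task_to_node(taskmap, taskid):
    """Return the node index that runs `taskid`.

    Assumption: `taskmap` is a list of [nodeid, nnodes, ppn, repeat] blocks,
    where each block places `ppn` consecutive tasks on each of `nnodes` nodes
    starting at `nodeid`, and does that `repeat` times.
    """
    if taskid < 0:
        raise ValueError(f"invalid task id {taskid}")
    first_task = 0  # lowest task id covered by the current block
    for nodeid, nnodes, ppn, repeat in taskmap:
        tasks_per_pass = nnodes * ppn
        block_tasks = tasks_per_pass * repeat
        if taskid < first_task + block_tasks:
            # Position of the task within one pass over the block's nodes
            offset = (taskid - first_task) % tasks_per_pass
            return nodeid + offset // ppn
        first_task += block_tasks
    raise ValueError(f"task id {taskid} is not in the taskmap")


def task_to_host(taskmap, hosts, taskid):
    """Map a task id to a hostname, given hostnames ordered by node index."""
    return hosts[task_to_node(taskmap, taskid)]


if __name__ == "__main__":
    hosts = ["node0", "node1", "node2", "node3"]  # hypothetical host list
    block_map = [[0, 4, 4, 1]]    # 16 tasks, 4 consecutive tasks per node
    cyclic_map = [[0, 4, 1, 4]]   # 16 tasks, dealt round-robin across nodes
    print(task_to_host(block_map, hosts, 5))
    print(task_to_host(cyclic_map, hosts, 5))
```

Since this lookup needs both the taskmap and the host list, neither of which is easy to get at once the batch job is gone, having the shell print the hostname at the time of the early exit avoids the problem entirely.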

grondo commented 3 months ago

Closed by #6056