garlick closed 1 week ago
Hmm, got a failure in t2274-manager-perilog.t in the el8 - test install builder:
2024-11-13T14:53:17.5390080Z expecting success:
2024-11-13T14:53:17.5390383Z printf "#!/bin/sh\nsleep 60" > prolog.d/sleep.sh &&
2024-11-13T14:53:17.5390759Z chmod +x prolog.d/sleep.sh &&
2024-11-13T14:53:17.5391130Z test_when_finished "rm -f prolog.d/sleep.sh" &&
2024-11-13T14:53:17.5391569Z jobid=$(flux submit --job-name=cancel hostname) &&
2024-11-13T14:53:17.5392027Z flux job wait-event -t 15 $jobid prolog-start &&
2024-11-13T14:53:17.5392379Z flux cancel $jobid &&
2024-11-13T14:53:17.5392717Z flux job wait-event -t 15 $jobid prolog-finish &&
2024-11-13T14:53:17.5393154Z flux job wait-event -t 15 $jobid exception &&
2024-11-13T14:53:17.5393557Z test_must_fail flux job attach -vE $jobid
2024-11-13T14:53:17.5393790Z
2024-11-13T14:53:17.5394040Z 1731506489.783088 prolog-start description="job-manager.prolog"
2024-11-13T14:53:17.5394571Z flux-job: wait-event timeout on event 'prolog-finish'
2024-11-13T14:53:17.5395333Z not ok 12 - perilog: job can be canceled while prolog is running
I wouldn't think the kill-timeout would be used here. I'll restart and see if it pops up again.
OK, setting MWP. Thanks!
Problem: on a slow system, the prolog kill timeout may be exceeded due to reasons other than an intransigent prolog process.
Increase the timeout from 10s to 1m.
Fixes #6420
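For readers unfamiliar with the pattern, a kill timeout generally works like this: send SIGTERM, wait up to some limit for the process to exit, then escalate to SIGKILL. The sketch below is a generic illustration in POSIX shell, not Flux's implementation. The function name `terminate_with_timeout` and its arguments are hypothetical. It shows why a short limit is fragile: on a loaded machine a process that does honor SIGTERM may still need longer than the limit to exit.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flux): ask a process to exit with SIGTERM,
# then escalate to SIGKILL if it is still alive after kill_timeout seconds.
# Prints "exited" and returns 0 if SIGTERM was enough,
# prints "escalated" and returns 1 if SIGKILL had to be sent.
terminate_with_timeout() {
    twt_pid=$1
    twt_kill_timeout=$2

    # If the process is already gone there is nothing to do.
    kill -TERM "$twt_pid" 2>/dev/null || { echo "exited"; return 0; }

    twt_waited=0
    # kill -0 sends no signal; it only checks that the process still exists.
    while kill -0 "$twt_pid" 2>/dev/null; do
        if [ "$twt_waited" -ge "$twt_kill_timeout" ]; then
            # Limit reached: the process is either ignoring SIGTERM or the
            # system is simply too slow. The two cases look identical here.
            kill -KILL "$twt_pid" 2>/dev/null
            echo "escalated"
            return 1
        fi
        sleep 1
        twt_waited=$((twt_waited + 1))
    done

    echo "exited"
    return 0
}
```

Calling `terminate_with_timeout "$pid" 10` versus `terminate_with_timeout "$pid" 60` corresponds to the 10s and 1m limits mentioned above. The longer limit only delays escalation for a process that truly ignores SIGTERM, while giving a slow system more room before a well-behaved process is treated as stuck.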