flux-framework / flux-core

core services for the Flux resource management framework
GNU Lesser General Public License v3.0

job validator hang at exit #6434

Closed: grondo closed this issue 2 weeks ago

grondo commented 2 weeks ago

On a production system, we've seen cases of the job validator hanging at exit. At the time of the hang there appear to be multiple threads active, and the exiting main thread is blocked on a lock, possibly waiting for those threads.

Perhaps the validator (and likewise the frobnicator) is doing something unsafe with concurrent.futures.

grondo commented 2 weeks ago

Ah, the validator should probably be calling shutdown(wait=False, cancel_futures=True) on its concurrent.futures executor before exiting.
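
For reference, a minimal sketch of what that call does, assuming the validator uses a `ThreadPoolExecutor` (the issue does not say which executor is involved). `shutdown()` is a method on an executor instance, not a module-level function, and `cancel_futures` requires Python 3.9 or later. The names `slow_check` and `run_and_shutdown` are hypothetical and only stand in for real validation work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def slow_check(delay, started=None):
    """Hypothetical stand-in for one validation task."""
    if started is not None:
        started.set()  # signal that this task is now running
    time.sleep(delay)
    return delay


def run_and_shutdown():
    # One worker, so only the first task runs; the rest stay queued.
    executor = ThreadPoolExecutor(max_workers=1)
    started = threading.Event()
    running = executor.submit(slow_check, 0.2, started)
    queued = [executor.submit(slow_check, 0.2) for _ in range(3)]

    started.wait()  # make sure the first task has actually started

    # Return immediately and cancel everything that has not started yet.
    # The task that is already running is NOT interrupted.
    executor.shutdown(wait=False, cancel_futures=True)
    return running, queued


if __name__ == "__main__":
    running, queued = run_and_shutdown()
    print("queued tasks cancelled:", all(f.cancelled() for f in queued))
    print("running task result:", running.result())
```

One caveat: this only drops work that has not started. With `ThreadPoolExecutor`, the interpreter still joins worker threads at exit, so a task that is already running and never returns could still block the exit.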