OpenFn / kit

The bits & pieces that make OpenFn work. (diagrammer, cli, compiler, runtime, runtime manager, logger, etc.)

Fix worker death #668

Closed josephjclark closed 5 months ago

josephjclark commented 5 months ago

Short Description

This PR fixes an issue where the worker doesn't close down pooled child processes after uncaught exceptions, resulting in the worker refusing to work.

Related issue

Fixes #664

Implementation Details

What's basically happening is:

The engine should eventually time out the runs, but by that point Lightning thinks they're dead and is no longer listening to events. I think the backlog will still clear, just very slowly.

Anyway, as a result of the fix, the error is handled gracefully, the pool re-allocates the worker thread, and everyone is happy.
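
To illustrate the idea (this is not the actual engine code), here is a minimal sketch of a pool that always gives its slot back after a task, replacing the worker if the task killed it. `FakeWorker`, `Pool` and `exec` are hypothetical names made up for this example, and the worker is simulated in-process rather than being a real child process.

```typescript
// Hypothetical stand-in for a pooled child process.
class FakeWorker {
  alive = true;
  id: number;

  constructor(id: number) {
    this.id = id;
  }

  run(task: () => string): string {
    try {
      return task();
    } catch (err) {
      // Simulate an uncaught exception taking the process down.
      this.alive = false;
      throw err;
    }
  }
}

type Result = { ok: boolean; value?: string; error?: string };

// Hypothetical pool: a fixed number of slots, each holding one worker.
class Pool {
  private idle: FakeWorker[] = [];
  private nextId = 1;

  constructor(capacity: number) {
    for (let i = 0; i < capacity; i++) {
      this.idle.push(this.spawn());
    }
  }

  private spawn(): FakeWorker {
    return new FakeWorker(this.nextId++);
  }

  // Number of workers currently free to take a task.
  get available(): number {
    return this.idle.length;
  }

  exec(task: () => string): Result {
    const worker = this.idle.pop();
    if (!worker) {
      return { ok: false, error: "no workers available" };
    }
    try {
      return { ok: true, value: worker.run(task) };
    } catch (err) {
      // Report the failure to the caller instead of letting it escape.
      const message = err instanceof Error ? err.message : String(err);
      return { ok: false, error: message };
    } finally {
      // Always free the slot: reuse a healthy worker, replace a dead one.
      this.idle.push(worker.alive ? worker : this.spawn());
    }
  }
}
```

With a pool of one worker, a task that throws comes back as a failed result, and the next call to `exec` still finds a worker available. Without the `finally` step, the slot would never be returned and the pool would refuse further work, which is roughly the failure mode described above.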

QA Notes

I've added two integration tests, both of which reproduce errors very similar to the one reported. Both fail on main.

Checklist before requesting a review