conductor-sdk / conductor-python

Conductor OSS SDK for Python programming language
Apache License 2.0
59 stars 30 forks source link

Conductor Python workers do not acknowledge SIGTERM #224

Open matteius opened 10 months ago

matteius commented 10 months ago

I spent some time trying to debug this Friday and from what I can gather, issuing SIGTERM to a python process that has spawned from the following pattern ignores SIGTERM and only responds to SIGKILL which is less safe:

        # Start the worker processes and wait for their completion
        with TaskHandler(self.workers, configuration) as task_handler:
            task_handler.start_processes()
            task_handler.join_processes()

We get sentry errors during deploys because of this issue -- I was poking around in the internals of this library and found something else odd too -- the code seem to prefer to kill over term the processes in the shutdown logic, and I believe its related to the fact that SIGTERM is ignored entirely. Ref: https://github.com/conductor-sdk/conductor-python/blob/main/src/conductor/client/automator/task_handler.py#L140-L149

Ask: Update library to cleanly handle SIGTERM which should be checked for in each iteration of the task runner and shutdown cleanly when invoked -- this will give the currently running task an opportunity to complete before an eventual SIGKILL would be issued on the timeout of SIGTERM (which is where today's error stems from).