alexsilva / supervisor

Supervisor process control system for Windows
http://supervisord.org

stopsignal=CTRL_BREAK_EVENT crashes supervisor when run as a service #27

Closed philipstarkey closed 3 years ago

philipstarkey commented 3 years ago

When attempting to adapt the feature in #17 for my own processes, I've run into an issue where supervisor crashes with the following exception:

FAILED: unknown problem killing test (33384):OSError: [WinError 6] The handle is invalid

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\<redacted>\supervisor\supervisor\process.py", line 615, in kill
    options.kill(pid, sig)
  File "c:\<redacted>\supervisor\supervisor\options.py", line 1170, in kill
    output = subprocess.process.kill2(sig, pid < 0, self.logger)
  File "c:\<redacted>\supervisor\supervisor\helpers.py", line 53, in kill2
    return self.send_signal(sig)
  File "C:\Anaconda3\envs\test_env\lib\subprocess.py", line 1432, in send_signal
    os.kill(self.pid, signal.CTRL_BREAK_EVENT)
SystemError: <built-in function kill> returned a result with an error set

Relevant supervisor config

[program:test]
command=C:\\path\\to\\python.exe C:\\path\\to\\simple\\long-lived\\python\\script.py
directory=C:\\path\\to\\simple\\long-lived\\python\\
autostart=true
startsecs=0
autorestart=true
stopwaitsecs=5
killasgroup=false
stopasgroup=false
stopsignal=CTRL_BREAK_EVENT
redirect_stderr=true
stdout_logfile=%(ENV_TMP)s\\test.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=10

Everything works when I launch supervisor from a terminal as the current user: I can stop and restart without error, and my Python script exits gracefully as expected. When supervisor runs as a Windows service, however, the above exception is raised when stopping or restarting test via supervisorctl, and the exception takes down the entire service. My only guess is that, for some reason, the signal cannot be sent to a process running under the SYSTEM user (which is what all of supervisor runs under for me when installed as a service).

Do you have any suggestions? I'd really like to be able to catch the signal in my Python script and exit gracefully when the process is stopped by supervisor, but nothing I try seems to work. I don't understand what I'm doing wrong, given that #17 seemed to work at the time. I'm running the latest version in the repository, by the way.

alexsilva commented 3 years ago

Fixed in commit 41edde5cd. Please update and test.

philipstarkey commented 3 years ago

Doesn't seem to have changed much.

When supervisor is run as a service: [attachment not preserved]

When supervisor is run from a terminal: [attachment not preserved]

If I run my script directly from a terminal, the code does detect Ctrl+C.

philipstarkey commented 3 years ago

I'm attaching a test Python script (to be managed by supervisor) that reproduces the issue for me:

import signal
import time
import sys
import os
import threading

print('child: starting')
old_handlers = {}
e = threading.Event()
def handler(signum, frame):
    print('child: Signal handler called with signal', signum)
    print(f'child: {threading.current_thread().name}')
    sys.stdout.flush()
    # print(old_handlers)

    time.sleep(0.5)
    # first time handler is called, set the event so the process ends nicely
    if not e.is_set():
        e.set()
    # If the handler is called again, call the old handler if it exists and is callable
    elif signum in old_handlers and callable(old_handlers[signum]):
        return old_handlers[signum](signum, frame)
    # If the handler is not callable, restore the old handler (one of the Python defaults) and retrigger the signal
    # so that the default handler is called and the default signal behaviour replicated.
    else:
        signal.signal(signum, old_handlers[signum])
        os.kill(os.getpid(), signum)

for s in [signal.SIGTERM, signal.SIGINT, signal.SIGBREAK]:
    old_handlers[s] = signal.signal(s, handler)
print('child: ', old_handlers)
print('child: handlers set up. Waiting for termination')
sys.stdout.flush()
while not e.is_set():
    # print('a')
    time.sleep(.1)
print('child: exiting')
sys.stdout.flush()

I've also tried running the supervisor service under my local account (by changing the details in the Log On tab of the supervisor service properties in the Windows Services application), which did not help either. I also tried allowing the Local System account to interact with the desktop (same tab of the service properties), and that changed nothing either. I'm not sure whether this helps diagnose the issue.

alexsilva commented 3 years ago

It seems the solution to this problem is described here: Invalid-handle-issue. It works as expected when a console is created.

Subprocesses created with creationflags=subprocess.CREATE_NEW_PROCESS_GROUP lose the ability to receive the CTRL_C_EVENT signal. The solution is for the subprocess to execute the following at startup:

import win32api
win32api.SetConsoleCtrlHandler(None, False)

Update and test.
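
For subprocesses that cannot depend on pywin32, the same Win32 call can be reached through ctypes. The following is a minimal sketch, not code from this repository: the helper name enable_ctrl_c is hypothetical, and it deliberately does nothing on non-Windows platforms.

```python
import sys


def enable_ctrl_c():
    """Re-enable Ctrl+C handling for the current process.

    Hypothetical helper: returns True if the Win32 call succeeded,
    and False on failure or on non-Windows platforms (where it is a no-op).
    """
    if sys.platform != "win32":
        return False

    import ctypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.SetConsoleCtrlHandler.argtypes = [ctypes.c_void_p, ctypes.c_int]
    kernel32.SetConsoleCtrlHandler.restype = ctypes.c_int
    # SetConsoleCtrlHandler(NULL, FALSE) clears the "ignore Ctrl+C" flag that
    # a process started with CREATE_NEW_PROCESS_GROUP begins with.
    return bool(kernel32.SetConsoleCtrlHandler(None, 0))
```

Calling enable_ctrl_c() once at the top of the managed script, before installing signal handlers, mirrors the win32api call shown above.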

philipstarkey commented 3 years ago

Thanks! This seems to solve the issue.

CTRL_BREAK_EVENT works with no changes to my subprocess. CTRL_C_EVENT works once the call to SetConsoleCtrlHandler is added to the subprocess. I believe this is expected, based on what I've read over the last few days and the link you provided.
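
To illustrate the parent side of this behaviour, here is a hedged sketch of how a supervising process might start a child in its own process group and ask it to stop with CTRL_BREAK_EVENT. It is not supervisor's actual implementation: spawn_in_new_group and stop_child_gracefully are hypothetical names, the 5 second default mirrors stopwaitsecs=5 from the config above, and on non-Windows platforms the sketch falls back to SIGTERM. It also assumes the parent and child share a console, which, per the discussion above, seems to be what was missing when running as a service.

```python
import signal
import subprocess
import sys


def spawn_in_new_group(args):
    """Start a child process; on Windows, place it in a new process group."""
    # CREATE_NEW_PROCESS_GROUP only exists (and is only accepted) on Windows.
    flags = subprocess.CREATE_NEW_PROCESS_GROUP if sys.platform == "win32" else 0
    return subprocess.Popen(args, creationflags=flags)


def stop_child_gracefully(proc, timeout=5.0):
    """Ask the child to exit, then force-kill it if it does not stop in time.

    Returns the child's exit code.
    """
    if sys.platform == "win32":
        # Delivered to the child's process group; Python exposes it to the
        # child as signal.SIGBREAK.
        proc.send_signal(signal.CTRL_BREAK_EVENT)
    else:
        proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()
```

For example, proc = spawn_in_new_group([sys.executable, "script.py"]) followed by stop_child_gracefully(proc) gives the child a chance to run its SIGBREAK handler before it is killed.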

I did run into an issue where the service failed to start. I couldn't see exactly what was going wrong because the crash happened before the service logger was created, so there was no information about the crash in any of the log files, including the service log file. It seems that sys.stdout was None, so the call to isatty() failed. I've made a PR with a fix.
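
As an illustration of the kind of guard that avoids this failure (a sketch only, not necessarily what the PR does; stream_is_tty is a hypothetical helper name):

```python
import sys


def stream_is_tty(stream):
    """Return True only if the stream exists and reports being a terminal.

    Under a Windows service sys.stdout can be None, so calling
    sys.stdout.isatty() directly raises AttributeError.
    """
    isatty = getattr(stream, "isatty", None)
    if not callable(isatty):
        return False
    try:
        return bool(isatty())
    except (ValueError, OSError):
        # For example, the stream has already been closed.
        return False
```

Code that previously called sys.stdout.isatty() would call stream_is_tty(sys.stdout) instead.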

alexsilva commented 3 years ago

Thanks for the detailed explanation. It took a lot of work to solve the problem, but I'm glad it worked. I think we can close this, since it has already been resolved.