Open laur89 opened 1 month ago
I don't know much about Seafile, but I tried this
Install WsgiDAV
cd test_wsgidav
pipenv install wsgidav gunicorn
pipenv shell
Create a wsgidav.yaml file with the following content:
server: gunicorn
server_args:
  workers: 5
host: 0.0.0.0
port: 8080
provider_mapping:
  "/": "."
Run WsgiDAV
(test_wsgidav) ➜ test_wsgidav wsgidav --auth anonymous
Using default configuration file: /Users/martin/prj/git/test_wsgidav/wsgidav.yaml
...
21:30:35.543 - INFO : Running WsgiDAV/4.3.3 gunicorn/23.0.0 Python/3.12.0 ...
[2024-10-07 21:30:35 +0200] [70339] [INFO] Starting gunicorn 23.0.0
[2024-10-07 21:30:35 +0200] [70339] [INFO] Listening at: http://0.0.0.0:8080 (70339)
[2024-10-07 21:30:35 +0200] [70339] [INFO] Using worker: gthread
[2024-10-07 21:30:35 +0200] [70342] [INFO] Booting worker with pid: 70342
[2024-10-07 21:30:35 +0200] [70343] [INFO] Booting worker with pid: 70343
[2024-10-07 21:30:35 +0200] [70344] [INFO] Booting worker with pid: 70344
[2024-10-07 21:30:35 +0200] [70345] [INFO] Booting worker with pid: 70345
[2024-10-07 21:30:35 +0200] [70346] [INFO] Booting worker with pid: 70346
We can see that gunicorn starts five other processes, as configured.
Then open a second terminal and find the root process, i.e. not one of the spawned worker processes (the root is 70339 in the listing below):
➜ test_wsgidav ps -ef | grep wsgidav
501 70339 70173 0 9:30pm ttys003 0:00.21 /Library/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python /Users/martin/prj/git/test_wsgidav/.venv/bin/wsgidav --auth anonymous
501 70342 70339 0 9:30pm ttys003 0:00.12 /Library/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python /Users/martin/prj/git/test_wsgidav/.venv/bin/wsgidav --auth anonymous
501 70343 70339 0 9:30pm ttys003 0:00.11 /Library/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python /Users/martin/prj/git/test_wsgidav/.venv/bin/wsgidav --auth anonymous
501 70344 70339 0 9:30pm ttys003 0:00.12 /Library/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python /Users/martin/prj/git/test_wsgidav/.venv/bin/wsgidav --auth anonymous
501 70345 70339 0 9:30pm ttys003 0:00.12 /Library/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python /Users/martin/prj/git/test_wsgidav/.venv/bin/wsgidav --auth anonymous
501 70346 70339 0 9:30pm ttys003 0:00.13 /Library/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python /Users/martin/prj/git/test_wsgidav/.venv/bin/wsgidav --auth anonymous
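Picking the root out of such a listing by eye gets tedious with many workers. A minimal sketch of doing it programmatically, assuming the `ps -ef` column order shown above (UID, PID, PPID, ...); the function name `find_root_pids` is made up for illustration, and the command column in the sample is shortened:

```python
def find_root_pids(ps_lines):
    """Return the PIDs whose parent PID is not itself part of the listing."""
    procs = {}
    for line in ps_lines:
        fields = line.split()
        # Expect: UID PID PPID ...; skip headers and malformed lines.
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        procs[int(fields[1])] = int(fields[2])
    return sorted(pid for pid, ppid in procs.items() if ppid not in procs)


# Shortened copy of the listing above.
sample = [
    "501 70339 70173 0 9:30pm ttys003 0:00.21 wsgidav --auth anonymous",
    "501 70342 70339 0 9:30pm ttys003 0:00.12 wsgidav --auth anonymous",
    "501 70343 70339 0 9:30pm ttys003 0:00.11 wsgidav --auth anonymous",
]
print(find_root_pids(sample))  # prints [70339]
```

If every process in a listing has a parent outside the listing (for example PPID 1), all of them are returned, which hints that the original root is already gone.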
Now stop the root process with SIGINT:
kill -s INT 70339
In the main terminal we see that the spawned processes are also stopped:
...
[2024-10-07 21:36:28 +0200] [70339] [INFO] Handling signal: int
[2024-10-07 21:36:28 +0200] [70342] [INFO] Worker exiting (pid: 70342)
[2024-10-07 21:36:28 +0200] [70343] [INFO] Worker exiting (pid: 70343)
[2024-10-07 21:36:28 +0200] [70344] [INFO] Worker exiting (pid: 70344)
[2024-10-07 21:36:28 +0200] [70345] [INFO] Worker exiting (pid: 70345)
[2024-10-07 21:36:28 +0200] [70346] [INFO] Worker exiting (pid: 70346)
[2024-10-07 21:36:28 +0200] [70339] [INFO] Shutting down: Master
➜ test_wsgidav
So it looks like it is working as expected?
Thanks for getting back so quick.
Looks like SIGINT works even with those hanging gunicorn processes. Note that in the original post I described how SIGTERM does nothing, but replacing it with SIGINT does the trick:
root@1d53611e14f4:/seafile# ps -ef | grep -v grep | grep wsgidav
root 1404 1 0 16:53 ? 00:00:00 /usr/bin/python3 -m wsgidav.server.server_cli --server gunicorn --root / --log-file /seafile/logs/seafdav.log --pid /seafile/pids/seafdav.pid --port 8080 --host 0.0.0.0
root 1405 1 0 16:53 ? 00:00:00 /usr/bin/python3 -m wsgidav.server.server_cli --server gunicorn --root / --log-file /seafile/logs/seafdav.log --pid /seafile/pids/seafdav.pid --port 8080 --host 0.0.0.0
root 1406 1 0 16:53 ? 00:00:00 /usr/bin/python3 -m wsgidav.server.server_cli --server gunicorn --root / --log-file /seafile/logs/seafdav.log --pid /seafile/pids/seafdav.pid --port 8080 --host 0.0.0.0
root 1407 1 0 16:53 ? 00:00:00 /usr/bin/python3 -m wsgidav.server.server_cli --server gunicorn --root / --log-file /seafile/logs/seafdav.log --pid /seafile/pids/seafdav.pid --port 8080 --host 0.0.0.0
root 1408 1 0 16:53 ? 00:00:00 /usr/bin/python3 -m wsgidav.server.server_cli --server gunicorn --root / --log-file /seafile/logs/seafdav.log --pid /seafile/pids/seafdav.pid --port 8080 --host 0.0.0.0
root@1d53611e14f4:/seafile# pkill --signal SIGINT -f 'wsgidav.server.server_cli'
root@1d53611e14f4:/seafile# echo $?
0
root@1d53611e14f4:/seafile# ps -ef | grep -v grep | grep wsgidav
Is it possibly due to gunicorn itself handling INT, but not TERM signals?
At any rate, think I'll propose Seafile team to:
Although INT is a bit weird signal to send in this case, as afaik it's supposed to be keyboard/user interrupt, i.e. implies interactivity, not one system interrupting another.
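One way to avoid depending on a single signal is to escalate: signal the whole process group, wait a little, then move on to a harsher signal. The following is only a sketch under assumptions, not what Seafile or WsgiDAV does: `stop_group` is a made-up helper, it assumes the server was started in its own session (so its PID doubles as the process group ID), and it only waits for the group leader, not for every member.

```python
import os
import signal
import subprocess
import sys


def stop_group(proc, signals=(signal.SIGTERM, signal.SIGKILL), grace=5.0):
    """Signal proc's whole process group, escalating until proc exits.

    Assumes proc was started with start_new_session=True. Returns the last
    signal tried before proc was found to have exited, or None if proc
    outlived every signal.
    """
    for sig in signals:
        try:
            os.killpg(proc.pid, sig)
        except ProcessLookupError:
            pass  # group is already gone; still reap proc below
        try:
            proc.wait(timeout=grace)
            return sig
        except subprocess.TimeoutExpired:
            continue  # still alive: try the next, harsher signal
    return None


if __name__ == "__main__":
    # Stand-in for the real server: a process that just sleeps.
    demo = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        start_new_session=True,
    )
    order = (signal.SIGINT, signal.SIGTERM, signal.SIGKILL)
    print("stopped after", stop_group(demo, signals=order, grace=2.0))
```

The signal order is a parameter on purpose: whether INT or TERM should come first is exactly the open question in this thread.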
Worth noting that, following your example using version 4.3.3, I'm unable to reproduce the conditions where some child processes hang around. Killing via both TERM & KILL signals always results in all processes being reaped. Unsure what's going on under Seafile.
Describe the bug
Note this is quite possibly not an issue with wsgidav itself, but seafdav - seafile project's webdav implementation that relies on wsgidav.
There are cases where, upon shutting down the service, wsgidav child processes still hang around, causing a subsequent restart of seafile to fail. It seems to happen only if the webdav server has actually been used prior to stopping. If the service is merely started and immediately stopped, all processes appear to shut down OK.
Looking at seafile codebase, it appears the wsgidav process is started like this:
...and stopped like this:
Note they're sending SIGKILL, so not quite sure why any process would remain hanging at all. Although unsure why SIGKILL is sent as the default signal in the first place.
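For context, SIGKILL is delivered only to the PID it is aimed at and cannot be caught, so the targeted process gets no chance to stop its children. A minimal POSIX-only sketch of that general behaviour, using plain sleeping processes rather than wsgidav or gunicorn; all names here are made up for illustration, and whether this is what actually happens under Seafile is not established:

```python
import os
import signal
import subprocess
import sys
import time

# Parent script: starts one child, reports the child's PID, then idles.
PARENT_SRC = r"""
import subprocess, sys, time
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
print(child.pid, flush=True)
time.sleep(30)
"""


def pid_alive(pid):
    """Check for existence with signal 0, which delivers nothing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def kill_parent_only():
    """SIGKILL the parent and report whether its child is still running."""
    parent = subprocess.Popen(
        [sys.executable, "-c", PARENT_SRC], stdout=subprocess.PIPE, text=True
    )
    child_pid = int(parent.stdout.readline())
    parent.kill()  # SIGKILL, aimed at the parent PID only
    parent.wait()
    parent.stdout.close()
    time.sleep(0.2)
    survived = pid_alive(child_pid)
    if survived:
        os.kill(child_pid, signal.SIGKILL)  # clean up the orphan
    return survived


if __name__ == "__main__":
    print("child survived parent's SIGKILL:", kill_parent_only())
```

An orphaned child is re-parented, typically to PID 1, so it shows up in `ps -ef` with a PPID that is no longer the original parent.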
To Reproduce
$ seafile.sh stop
Expected behavior
All processes spawned by seafile, including wsgidav ones, should be shut down.
Environment:
Additional context/longer repro example
After starting seafile, this can be seen in seafile-controller (that's spawning wsgidav process) log:
These are the spawned wsgidav processes as seen from the running container (note pid 159 is tracked by seafdav as service pid):
Now the webdav server was used by an Android client; some I/O was performed.
Stopping the seafile server is done via a shell-script. From what is relevant, it performs two steps:
This signal is caught by the signal handler, which in turn sends SIGKILL to the wsgi process (in this case, that'd be to PID 159).
Excerpt from the relevant location of said shell-script (sry, cannot find the seafile repo that contains this script):
After this, the following 4 processes still remain hanging about:
I suppose my question is whether this is expected and is the wsgidav service shutdown performed correctly by seafile? Trying to kill the processes via another SIGTERM (i.e. default signal sent by pkill) does nothing, yet sending SIGKILL or SIGHUP appears to get rid of 'em:
No idea what's up with that or whether it's safe to do so. Grepped wsgidav codebase and cannot find any signal handlers whatsoever, so no idea why SIGHUP works.
My guess would be the issue is that the SIGKILL sent by the controller is targeted at the parent process, so it doesn't have a chance to gracefully shut down the child processes. But that's just a speculation.
Nope that's not it. Sending SIGTERM to just the parent process only causes one of the child (!) processes to be nuked:
Note PID 5476 (the parent process launched by controller) is still running, only 5481 got killed.