Open TempestoGuy opened 1 month ago
Thank you for this report. I am sorry I am late getting to it.
I have had a good look at my shutdown code, and I might have fixed this for v594. I am very ignorant about how Docker works, and how the hydrus Docker package works, but it looks like Docker tries to shut the client process down with SIGTERM, and that is coming back two seconds later with exit status 1 (i.e. something went wrong during the shutdown). I suspect my SIGTERM exit was touching something from the wrong thread. I don't have a Docker instance (or expertise) to test this properly, and my dev machine is Windows, where SIGTERM is a whole different kettle of fish, so we may need to do a couple of rounds of back-and-forth to fully nail this down.
In any case, I have cleaned up how I handle a SIGTERM so that it shuts things down as quickly as possible while still saving and cleaning everything it can and handling the different thread interactions properly, and I cleaned up some of my shutdown code generally. Please let me know how v594 goes for you, and if it does not work, please check out the client log in install_dir?/db_dir/client - date.log and see if there is a nice healthy traceback error in the shutdown section. A nice clean SIGTERM shutdown should just be three lines:
v593, 2024-10-14 15:59:16: doing fast shutdown…
v593, 2024-10-14 15:59:16: shutting down controller…
v593, 2024-10-14 15:59:16: hydrus client shut down
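One common way to avoid a signal handler touching state from the wrong place is to have the handler do nothing but record the request, and let the main loop run the real shutdown. The sketch below shows that pattern only; it is not the hydrus shutdown code, and `ShutdownFlag`, `install`, `main_loop`, and the step names are hypothetical names made up for illustration.

```python
import os
import signal
import time


class ShutdownFlag:
    """Holds a plain boolean that the signal handler flips."""

    def __init__(self):
        self.requested = False

    def handle(self, signum, frame):
        # Keep the handler tiny: no saving, no GUI calls, just record the request.
        self.requested = True


def install(flag):
    # signal.signal must be called from the main thread of the main interpreter.
    signal.signal(signal.SIGTERM, flag.handle)


def main_loop(flag, poll_seconds=0.05, max_polls=100):
    """Poll the flag and run the cleanup steps from the loop, not the handler.

    Returns the list of steps that ran (empty if no signal arrived in time).
    """
    for _ in range(max_polls):
        if flag.requested:
            # Hypothetical step names, echoing the three log lines above.
            return [
                "doing fast shutdown",
                "shutting down controller",
                "client shut down",
            ]
        time.sleep(poll_seconds)
    return []
```

To try it on Linux, call `install(flag)`, send the process a SIGTERM (for example with `os.kill(os.getpid(), signal.SIGTERM)`), and then run `main_loop(flag)`.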
There are also some new 'self-sigterm' calls under help->debug->tests that you might also like to play with, although I guess if you just sigterm the hydrus process, the Docker 'hydrusclient'(?) guy, who I assume is PID 1, will restart it.
I didn't know you couldn't nicely just go file->exit on the program in the Docker package. I guess the whole question of running hydrus in Docker is so hacky that it isn't a huge deal, but do you have any opinion on how this could work better? Or how it works in other Docker programs? Should I be trying to kill PID 1 or whatever? Should I remove the 'exit' menu entirely if we are in the Docker environment?
Thanks for getting back to me. I am sorry for my late response as well.
I tried v594 and v596, and it does shut down correctly when I SIGTERM from the container's shell with kill -15 {hydrusclient's pid} or with the new debug tests. It also doesn't restart hydrusclient anymore.
v596, 2024-10-31 23:51:51: doing fast shutdown…
v596, 2024-10-31 23:51:51: shutting down controller…
v596, 2024-10-31 23:51:51: hydrus client shut down
And it doesn't restart hydrusclient when I use file->exit anymore:
hydrusclient | v596, 2024-11-01 02:37:24: saving and hiding gui…
hydrusclient | v596, 2024-11-01 02:37:25: shutting down gui…
hydrusclient | v596, 2024-11-01 02:37:25: waiting for managers to exit
hydrusclient | v596, 2024-11-01 02:37:26: waiting for workers to exit
hydrusclient | v596, 2024-11-01 02:37:26: waiting for idle shutdown work
hydrusclient | v596, 2024-11-01 02:37:26: waiting for services to exit
hydrusclient | v596, 2024-11-01 02:37:26: stopping services…
hydrusclient | v596, 2024-11-01 02:37:26: shutting down db…
hydrusclient | v596, 2024-11-01 02:37:26: saving and exiting objects
hydrusclient | v596, 2024-11-01 02:37:26: cleaning up…
hydrusclient | v596, 2024-11-01 02:37:26: shutting down controller…
hydrusclient | v596, 2024-11-01 02:37:26: hydrus client shut down
hydrusclient | 2024-11-01 02:37:32,571 INFO exited: hydrus (exit status 0; expected)
But it doesn't work when I shut down from Docker with docker stop hydrusclient, although it says doing fast shutdown…
Did it just need more time? Maybe adding stopwaitsecs=30 to supervisord.conf could fix it.
hydrusclient | 2024-11-01 00:55:19,405 WARN received SIGTERM indicating exit request
hydrusclient | 2024-11-01 00:55:19,406 INFO waiting for fvwm, hydrus, novnc, vnc, xvfb to die
hydrusclient | caught XIO error:
hydrusclient | 01/11/2024 00:55:19 deleted 43 tile_row polling images.
hydrusclient | 2024-11-01 00:55:19,545 INFO stopped: xvfb (exit status 0)
hydrusclient | extra[1] signal: 15
hydrusclient | 2024-11-01 00:55:19,641 WARN exited: fvwm (exit status 1; not expected)
hydrusclient | 2024-11-01 00:55:19,641 INFO reaped unknown pid 39 (exit status 1)
hydrusclient | 2024-11-01 00:55:19,641 INFO reaped unknown pid 41 (exit status 0)
hydrusclient | 2024-11-01 00:55:19,641 INFO reaped unknown pid 43 (exit status 1)
hydrusclient | 2024-11-01 00:55:19,817 WARN stopped: vnc (exit status 3)
hydrusclient |
hydrusclient | Terminating WebSockets proxy (30)
hydrusclient | In exit
hydrusclient | 2024-11-01 00:55:19,819 WARN stopped: novnc (exit status 143)
hydrusclient | v596, 2024-11-01 00:55:20: doing fast shutdown…
hydrusclient | 2024-11-01 00:55:20,314 INFO reaped unknown pid 30 (exit status 0)
hydrusclient | 2024-11-01 00:55:22,879 WARN stopped: hydrus (exit status 1)
hydrusclient exited with code 0
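As a rough check on the "did it just need more time?" question, the two supervisord timestamps in the log above can be subtracted. This is only arithmetic on the quoted log lines. The 10-second figure is an assumption based on supervisord's documented default for stopwaitsecs, not something read from this container's supervisord.conf.

```python
from datetime import datetime

# Timestamp layout used by the supervisord lines in the log above.
SUPERVISOR_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(start, end):
    """Return the elapsed seconds between two supervisord-style timestamps."""
    t0 = datetime.strptime(start, SUPERVISOR_FORMAT)
    t1 = datetime.strptime(end, SUPERVISOR_FORMAT)
    return (t1 - t0).total_seconds()


# "WARN received SIGTERM indicating exit request"
sigterm_received = "2024-11-01 00:55:19,405"
# "WARN stopped: hydrus (exit status 1)"
hydrus_stopped = "2024-11-01 00:55:22,879"

elapsed = seconds_between(sigterm_received, hydrus_stopped)

# Assumption: supervisord's default stopwaitsecs (10) is in effect here.
ASSUMED_STOPWAITSECS = 10
hit_timeout = elapsed >= ASSUMED_STOPWAITSECS
```

Here `elapsed` comes out at roughly 3.5 seconds, so under that assumption the stop wait had not run out when hydrus exited with status 1. That hints the process was not simply cut off for being slow, although it does not show what actually went wrong.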
The Docker docs say "The main process inside the container will receive SIGTERM, and after a grace period, SIGKILL", and the main process, PID 1, seems to be supervisord (checked with top).
And supervisord doesn't pass the signal unless specified in supervisord.conf with stopsignal=Term. Although it said here that the default signal is TERM, I don't think it passes it.
Or maybe supervisord somehow ignores the Docker SIGTERM, or maybe it isn't always PID 1.
I don't know if switching to another minimal init system will fix it, but some of them forward signals to their child processes. Here is an article I found with some pros and cons of different mini-init systems for Docker containers.
the Docker 'hydrusclient'(?) guy
hydrusclient | in my logs is just the container name. It's just how docker compose displays logs, since there could be multiple containers per docker-compose.yml.
do you have any opinion on how better this could work? Or how it works in other Docker programs? Should I be trying to kill PID 1 or whatever?
I think this is a special case, since most containers I checked had only a single process that could accept the SIGTERM directly from Docker because it was PID 1 (although some processes may ignore signals if they are PID 1; I think it's because Linux treats PID 1 differently, but I am not sure what it does differently).
Or they used tini as init. From the quick reading I did, it seems it can only control one process directly (this is probably wrong), but I think it could be configured to run a process group and pass the signal down to the process group as a whole. I haven't looked into it much.
I think the way to go is using tini as PID 1 with supervisord as its child process (supervisord reportedly doesn't behave well when it's PID 1; I'm not sure about this, but I read it somewhere). This would also be the easiest to implement since it doesn't need much changing, but I don't know if it would work.
It's also built into Docker, so you can use it with just docker run --init or init: true in docker compose. I tried it, but it didn't work as expected and I don't know what is wrong. Maybe it's because supervisord wasn't directly under it as PID 2 or something, or maybe it has to be added to the Dockerfile and entrypoint.sh, but I don't know.
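The signal forwarding that mini-init systems provide can be sketched in a few lines. This illustrates the idea only; it is not how tini or the hydrus Docker image is implemented, and `run_and_forward` is a hypothetical helper. A real init such as tini also reaps orphaned processes, which this sketch leaves out.

```python
import signal
import subprocess
import sys


def run_and_forward(cmd):
    """Start cmd, forward any SIGTERM we receive to it, and return its exit code."""
    child = subprocess.Popen(cmd)

    def forward(signum, frame):
        # Pass the signal on to the child instead of exiting and orphaning it.
        child.send_signal(signum)

    # Must run in the main thread; remember the old handler so it can be restored.
    previous = signal.signal(signal.SIGTERM, forward)
    try:
        # Block until the child exits, whether on its own or after the signal.
        return child.wait()
    finally:
        signal.signal(signal.SIGTERM, previous)
```

For example, `run_and_forward([sys.executable, "-c", "import time; time.sleep(20)"])` blocks until the child exits. If the parent gets a SIGTERM in the meantime, the child receives it too, and on Linux the function returns the negative signal number when the child is killed by that signal.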
Hydrus version
v591
Qt major version
Qt 6
Operating system
Linux (specify distro and version in comments)
Install method
Third party (AUR, Docker, Chocolatey, etc. Specify in comments)
Install and OS comments
Host is a Debian 12 headless server, using the Docker image of hydrusclient.
Bug description and reproduction
When stopping Hydrus from Docker, it always exits incorrectly, and on the next Hydrus boot it says "Found and deleted the durable temporary database on boot. The last exit was probably not clean." in the logs. The client also shows the popup asking whether you want to restore the last session.
So to exit correctly I have to exit the client from the GUI and then quickly stop it from Docker before the supervisor restarts Hydrus again.
I also tried increasing the grace time in docker compose with stop_grace_period: 120s, but that didn't work. My server is pretty weak, so at first I thought it just didn't save the db and close the client fast enough, but I don't think the client is even getting a signal to start turning off.
Also, thank you for your amazing work. Hydrus is amazing.
Log output