Fuseki container is not exiting cleanly anymore

danmichaelo commented 3 years ago

Hi. Just updated jena-fuseki from 3.10.0 3.14.0 and noticed that signal handling no longer seems to be handled correctly. When stopping the container, Docker waits 10 seconds until eventually SIGKILL-ing it (exit code 137):

$ docker run --name fuseki-test -d stain/jena-fuseki:3.14.0

$ docker exec fuseki-test ps
  PID TTY          TIME CMD
   80 ?        00:00:00 ps
    1 ?        00:00:00 docker-entrypoi
   10 ?        00:00:06 java

$ time docker stop fuseki-test
fuseki-test

real    0m10.441s
user    0m0.121s
sys 0m0.066s

$ docker inspect fuseki-test --format='{{.State.ExitCode}}'
137

Here's how it used to work, with java running as PID 1:

$ docker run --name fuseki-test-3.10.0 -d stain/jena-fuseki:3.10.0
$ docker exec fuseki-test-3.10.0 ps
PID   USER     TIME   COMMAND
    1 root       0:03 /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx1200M -jar /jena-fuseki/fuseki-server.jar
   39 root       0:00 ps

$ time docker stop fuseki-test-3.10.0
fuseki-test-3.10.0

real    0m0.988s
user    0m0.137s
sys 0m0.079s

$ docker inspect fuseki-test-3.10.0 --format='{{.State.ExitCode}}'
143

The cause seems to be this change. Docker sends the SIGTERM signal to PID 1, but PID 1 is now the docker-entrypoint script.

To me it seems the entrypoint script is doing to much, I don't think it should be the concern of the entrypoint script to wait for the server to be ready. @xgaia , would you be ok with replacing exec "$@" & with exec "$@" and removing everything after line 40 in the entrypoint script?

A possibility could be to add a HEALTCHECK instead. A simple version would be

HEALTHCHECK CMD curl -sS --fail 'http://localhost:3030/$/ping' || exit 1

This can also be used with docker-compose to ensure that dependants do not start before Fuseki is ready:

depends_on:
  fuseki:
    condition: service_healthy

kuzeko commented 3 years ago

HI @danmichaelo , I think the other reason for the exec "$@" & and the curl check is to then allow the call to initialize the dataset (see few lines below)

Isn't it?

namedgraph commented 3 years ago

I had attempted to do similar kinds of initializations within the entrypoint atomgraph/fuseki-docker and ended up with similar problems. Decided it's not worth it and got rid of the entrypoint entirely.

It's now the calling service that does the dataset initialization and it works fine.

kuzeko commented 3 years ago

I would be fine with this approach, but sounds like a major change here. If there is consensus we can proceed, but need to add documentation

stain commented 3 years ago

Apologies for not watching this repo properly.. Too many emails!

The reason for @ is to support creating empty datasets as well as to delay loading of datasets.

Perhaps this change can be done as part of update to Jena 4.0 with #54 as it would be a breaking change to remove this functionality.

Alternatively the entry-point script can branch depending on if it is in "simple" mode (exec "$@") or "create dataset" mode (exec "$@" &). But then it it is probably better to add a trap handler - or better - include something more mature like https://github.com/krallin/tini that can wrap our entry point and be PID 1.

stain commented 3 years ago

Using tini seems to work - Travis shuts down with a kind SIGINT - equivalent to a Ctrl-C. Ctrl-C locally also works

danmichaelo commented 3 years ago

Thanks, that should work! Reminds me there is also the --init flag when starting the container, but I prefer the approach of including tini in the image when needed, since it's really the design of the image that decides whether it's needed or not, and it only adds a few kilobytes.

stain / jena-docker

Fuseki container is not exiting cleanly anymore #51