mcollina / docker-loghose

Collect all the logs from all docker containers
MIT License
63 stars 15 forks source link

Logs from a stopping container are not caught #26

Closed jonathanperret closed 6 years ago

jonathanperret commented 6 years ago

Here's a container that logs some text when requested to stop:

$ docker run --name logs_on_exit  busybox sh -c 'trap "sleep 1; echo bye" TERM; sleep 1; echo hello; sleep 3600& wait'

If I run docker-loghose before starting this container, it correctly prints the hello message. However, if I then stop the container using e.g.:

$ docker stop logs_on_exit

Then docker-loghose does not print the bye message that the container wrote to its output, and which can be seen with docker logs logs_on_exit.

I expected docker-loghose to print all messages written by a container up until the moment it actually exits.

jonathanperret commented 6 years ago

It looks like the issue is in docker-allcontainers which reports a container as "stopped" as soon as it sees a container kill event (since https://github.com/mcollina/docker-allcontainers/pull/12 it appears).

I believe this behavior is incorrect: the container kill event only reports that a signal was sent to a container, which may or may not end up terminating the container.

I hope this helps.

megastef commented 6 years ago

Hi,

current docker versions might have changed events. This is what I get for Docker 17.09.0-c, when using docker rm -f:

docker events
2017-11-13T16:18:39.153808240+01:00 container kill 69d1c24c111ebc9e44b3116d11d9c7d76ea03f6e76ce921fb3c8797b7c715931 (image=redis, name=r1, signal=9)
2017-11-13T16:18:39.253118624+01:00 container die 69d1c24c111ebc9e44b3116d11d9c7d76ea03f6e76ce921fb3c8797b7c715931 (exitCode=137, image=redis, name=r1)
2017-11-13T16:18:39.470022281+01:00 network disconnect eb2343db8458151a1725d7452c994dc6395557012201ece23492ef1d417f4fbc (container=69d1c24c111ebc9e44b3116d11d9c7d76ea03f6e76ce921fb3c8797b7c715931, name=bridge, type=bridge)
2017-11-13T16:18:39.549154306+01:00 volume unmount 049e2ffe7300e5ff1b36390614e2d6557ba8d64f3c211bd53b44884fc814df81 (container=69d1c24c111ebc9e44b3116d11d9c7d76ea03f6e76ce921fb3c8797b7c715931, driver=local)
2017-11-13T16:18:39.566041095+01:00 container destroy 69d1c24c111ebc9e44b3116d11d9c7d76ea03f6e76ce921fb3c8797b7c715931 (image=redis, name=r1)

And for docker stop:

docker events
2017-11-13T16:20:56.516971741+01:00 container kill 713f6672c6fa390f909d16f0535f8228479ad431ae3df2e0f3f48186c81a1393 (image=redis, name=r1, signal=15)
2017-11-13T16:20:56.708373730+01:00 container die 713f6672c6fa390f909d16f0535f8228479ad431ae3df2e0f3f48186c81a1393 (exitCode=0, image=redis, name=r1)
2017-11-13T16:20:57.009236418+01:00 network disconnect eb2343db8458151a1725d7452c994dc6395557012201ece23492ef1d417f4fbc (container=713f6672c6fa390f909d16f0535f8228479ad431ae3df2e0f3f48186c81a1393, name=bridge, type=bridge)
2017-11-13T16:20:57.088038300+01:00 volume unmount 2f50d67775557175170627739382b5993c671a88998f2eecb4c22357efcf5e94 (container=713f6672c6fa390f909d16f0535f8228479ad431ae3df2e0f3f48186c81a1393, driver=local)
2017-11-13T16:20:57.102525097+01:00 container stop 713f6672c6fa390f909d16f0535f8228479ad431ae3df2e0f3f48186c81a1393 (image=redis, name=r1)

And docker kill:

2017-11-13T16:24:11.377068629+01:00 container kill 713f6672c6fa390f909d16f0535f8228479ad431ae3df2e0f3f48186c81a1393 (image=redis, name=r1, signal=9)
2017-11-13T16:24:11.470480303+01:00 container die 713f6672c6fa390f909d16f0535f8228479ad431ae3df2e0f3f48186c81a1393 (exitCode=137, image=redis, name=r1)
2017-11-13T16:24:11.699112387+01:00 network disconnect eb2343db8458151a1725d7452c994dc6395557012201ece23492ef1d417f4fbc (container=713f6672c6fa390f909d16f0535f8228479ad431ae3df2e0f3f48186c81a1393, name=bridge, type=bridge)
2017-11-13T16:24:11.797909805+01:00 volume unmount 2f50d67775557175170627739382b5993c671a88998f2eecb4c22357efcf5e94 (container=713f6672c6fa390f909d16f0535f8228479ad431ae3df2e0f3f48186c81a1393, driver=local)

Conclusion: listening to die event should be sufficient for recent docker versions.

megastef commented 6 years ago

Please see PR: https://github.com/mcollina/docker-allcontainers/pull/15

megastef commented 6 years ago

@jonathanperret is the issue solved for you, using the latest version of docker-loghose & docker-allcontainers?

megastef commented 6 years ago

I made some tests today. It looks we need to delay the detach operation. The following code did fix the issue:

events.on('stop', function (meta) {
    setTimeout(function () {
      detachContainer(meta.id)
    }, Number(process.env.LOGHOSE_WAIT_ON_EXIT) || 5000)
  })

I'm going to submit a PR ...

megastef commented 6 years ago

Fix released on NPM v1.6.3: https://www.npmjs.com/package/docker-loghose