just-containers / s6-overlay

s6 overlay for containers (includes execline, s6-linux-utils & a custom init)
Other
3.78k stars 212 forks source link

s6-pause: segfault #85

Closed bobelev closed 9 years ago

bobelev commented 9 years ago

I often get messages like this:

Sep 15 22:16:00 some-worker-15.09.15t18.47 docker[29106]: node-harmony exited 0
Sep 15 22:16:00 some-worker-15.09.15t18.47 docker[29106]: [cont-finish.d] executing container finish scripts...
Sep 15 22:16:00 some-worker-15.09.15t18.47 docker[29106]: [cont-finish.d] done.
Sep 15 22:16:00 some-worker-15.09.15t18.47 docker[29106]: [s6-finish] syncing disks.
Sep 15 22:16:00 some-worker-15.09.15t18.47 kernel: s6-pause[29159]: segfault at 0 ip 00000000004005d7 sp 00007ffcd581b980 error 4 in s6-pause[400000+2000]
Sep 15 22:16:00 some-worker-15.09.15t18.47 systemd-coredump[29848]: Failed to get EXE.
Sep 15 22:16:00 some-worker-15.09.15t18.47 docker[29106]: [s6-finish] sending all processes the TERM signal.
Sep 15 22:16:00 some-worker-15.09.15t18.47 systemd-coredump[29848]: Process 27 (kblockd) of user 0 dumped core.

I have node.js program that runs in docker container via systemd (coreos/fleet). It does some work and gracefully exits (node-harmony exited 0). In s6 layer there is only one cont-init.d script and nothing else. ExecStop and ExecStopPost in systemd don't run for this units. Unit fails with failed state because.

full log

s6-overlay: v1.14.0.4 docker version 1.7.1, build 2c2c52b-dirty base image: alpine:3.2 CoreOS beta: 766.3.0

glerchundi commented 9 years ago

hi @bobelev, this seems to be a s6-pause bug, can you obtain the strace log for s6-pause, so that we'll have more info to ask in the supervision mailing list?

Do this:

FROM ubuntu

# s6 overlay
ADD https://github.com/just-containers/s6-overlay/releases/download/v1.14.0.4/s6-overlay-amd64.tar.gz /tmp/s6-overlay.tar.gz
RUN tar xvfz /tmp/s6-overlay.tar.gz -C /

# strace
ADD http://landley.net/aboriginal/downloads/binaries/extras/strace-x86_64 /usr/bin/strace
RUN chmod +x /usr/bin/strace

# replace 's6-pause' with 'strace s6-pause'
RUN sed -i 's/s6-pause/redirfd -w 2 \/logs\/s6-pause-strace.log strace -v s6-pause/g' /etc/s6/init/init-stage2

ENTRYPOINT [ "/init" ]

And execute using this:

docker build -t asdf .
docker run -v `pwd`/logs:/logs -ti asdf

And after that send the full strace log again, maybe with that Laurent would be able to solve your problem.

glerchundi commented 9 years ago

@bobelev any update on this?

bobelev commented 9 years ago

@glerchundi, I rebuilt container to pass around this issue so I can't really reproduce it.

glerchundi commented 9 years ago

ok, reopen this issue if needed.