docker-library / python

Docker Official Image packaging for Python
https://www.python.org/
MIT License

Debugging a `-slim` image? #903

Open hterik opened 8 months ago

hterik commented 8 months ago

We are hitting a rare deadlock in production that can't be reproduced using debug images. The only way to debug it is to attach to the production container while the problem is happening. The problem is that our production images are based on the -slim images.
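As a complement to attaching gdb, and assuming the application code can be changed before the next occurrence, the standard-library `faulthandler` module can dump the Python stack of every thread when the process receives a signal. It needs neither gdb nor debug symbols. A minimal sketch follows; the function name `install_stack_dump_handler` and the choice of `SIGUSR1` are illustrative assumptions, not something the image provides.

```python
import faulthandler
import signal
import sys


def install_stack_dump_handler(signum=signal.SIGUSR1, file=sys.stderr):
    """Dump the Python traceback of all threads to `file` when `signum` arrives.

    Hypothetical helper: call it once at application startup.
    faulthandler.register() is only available on Unix-like systems, and
    `file` must stay open for as long as the handler is registered.
    """
    faulthandler.register(signum, file=file, all_threads=True, chain=False)


# Assumed usage at the top of the application's entry point:
#   install_stack_dump_handler()
# Then, when the deadlock happens, send the signal to the container, e.g.:
#   docker kill --signal=USR1 <container-id>
# and read the tracebacks from the container's stderr (docker logs).
```

This only shows Python-level frames, so a hang inside C code would still need gdb, but it can narrow down which threads are waiting on which locks.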

Running `apt install gdb python3-dbg` is not enough, because the Python in the image is built from source and does not match the packages available in apt.

I've tried starting the corresponding non-slim image, but its binaries don't seem to match closely enough to make gdb happy:

Start python inside -slim image

My end goal is to run gdb `py-bt` on this process:

docker run -ti python:3.11-slim-bookworm python3 -c "import time; time.sleep(1000)"

Inside this image there is no chance to debug at all due to missing gdb and debug symbols. This is expected.

Build a debugger image from the corresponding non-slim image.

FROM python:3.11-bookworm

RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt \
    apt update -y && \
    apt install -y gdb

docker build -t mydebugpython .

Use the newly built debug image to attach to the first running container

Find the container ID and use it to join the same PID namespace.

docker run \
    --cap-add SYS_PTRACE \
    --pid container:15aab7ea5f57 \
    --privileged \
    -ti \
    --entrypoint bash docker.io/library/mydebugpython
root@bb90d04f8722:/# gdb -q python --pid 1 -ex "py-bt"

Reading symbols from python...
Attaching to program: /usr/local/bin/python, process 1

warning: Build ID mismatch between current exec-file /usr/local/bin/python
and automatically determined exec-file /usr/local/bin/python3.11
exec-file-mismatch handling is currently "ask"

Load new symbol table from "/usr/local/bin/python3.11"? (y or n) y
Reading symbols from /usr/local/bin/python3.11...
Reading symbols from target:/usr/local/bin/../lib/libpython3.11.so.1.0...
(No debugging symbols found in target:/usr/local/bin/../lib/libpython3.11.so.1.0)
Reading symbols from target:/lib/x86_64-linux-gnu/libc.so.6...
Reading symbols from /usr/lib/debug/.build-id/82/ce4e6e4ef08fa58a3535f7437bd3e592db5ac0.debug...
Reading symbols from target:/lib/x86_64-linux-gnu/libm.so.6...
Reading symbols from /usr/lib/debug/.build-id/ea/87e1b3daf095cd53f1f99ab34a88827eccce80.debug...
Reading symbols from target:/lib64/ld-linux-x86-64.so.2...
Reading symbols from /usr/lib/debug/.build-id/38/e7d4a67acf053c794b3b8094e6900b5163f37d.debug...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
__GI___clock_nanosleep (clock_id=1, flags=1, req=0x7ffd9d9ec458, rem=0x0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:71
71  ../sysdeps/unix/sysv/linux/clock_nanosleep.c: No such file or directory.
Traceback (most recent call first):
  (unable to read python frame information)    # <--------------   :(
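The `Build ID mismatch` warning in the output above means the `python` binary in the debugger image is not the same build as the one running in the target container, and the `.build-id/82/ce4e...` paths show that gdb looks up separate debug files by that ID. A minimal sketch for reading the ID from inside the -slim image, where tools such as `readelf` may not be installed but Python is. The function names are hypothetical, and the byte-pattern scan is a heuristic that assumes a little-endian ELF file; it is not a full ELF parser.

```python
import re
import struct

# A GNU build-id note starts with: namesz=4, descsz=<n>, type=3 (NT_GNU_BUILD_ID),
# followed by the owner name "GNU\0" and then <n> bytes of build ID.
_NOTE_HEADER = re.compile(rb"\x04\x00\x00\x00(.{4})\x03\x00\x00\x00GNU\x00", re.DOTALL)


def find_build_id(data: bytes):
    """Return the GNU build ID as a hex string, or None if no note is found.

    Heuristic: scans raw bytes for the note header instead of walking
    the ELF program headers (little-endian files only).
    """
    for match in _NOTE_HEADER.finditer(data):
        (descsz,) = struct.unpack("<I", match.group(1))
        desc = data[match.end():match.end() + descsz]
        if 0 < descsz <= 64 and len(desc) == descsz:
            return desc.hex()
    return None


def debug_file_path(build_id: str, root: str = "/usr/lib/debug") -> str:
    """Path where gdb looks for a separate debug file for this build ID."""
    return f"{root}/.build-id/{build_id[:2]}/{build_id[2:]}.debug"


# Assumed usage, run once in each image and compare the results:
#   with open("/usr/local/bin/python3.11", "rb") as f:
#       print(find_build_id(f.read()))
```

Running this against `/usr/local/bin/python3.11` and `/usr/local/lib/libpython3.11.so.1.0` (the paths from the log above) in both images would show whether the IDs differ.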

So it seems that the difference between slim and non-slim isn't only whether debug symbols are present; the binary itself also appears to be built differently? Is there some way to solve this? My knowledge of how Python is built and of how to operate gdb is only moderate.
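One way to check how each interpreter was built is to print the build settings that `sysconfig` records and compare them between the two images. This is a sketch under assumptions: the function name `build_fingerprint` and the selection of keys are illustrative, and identical settings would not by themselves mean the binaries are byte-for-byte the same build.

```python
import json
import sys
import sysconfig

# Build-time variables recorded by CPython on POSIX builds.
KEYS = ("CONFIG_ARGS", "CFLAGS", "LDFLAGS", "Py_ENABLE_SHARED")


def build_fingerprint():
    """Collect a few build settings of the running interpreter (hypothetical helper)."""
    info = {"version": sys.version, "executable": sys.executable}
    for key in KEYS:
        # get_config_var returns None when a variable is not recorded.
        info[key] = sysconfig.get_config_var(key)
    return info


def dump_fingerprint():
    """Print the fingerprint as JSON so two runs can be diffed."""
    print(json.dumps(build_fingerprint(), indent=2, sort_keys=True))
```

Calling `dump_fingerprint()` once in `python:3.11-slim-bookworm` and once in `python:3.11-bookworm`, then diffing the output, would show whether the configure arguments or linker flags differ.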

Luiz-Monad commented 7 months ago

This is similar to a problem I am having: the image doesn't have any debug symbols, so I decided to add them.

This is the issue where we were discussing it: https://github.com/docker-library/python/issues/807 I had to make this change to get debug images: https://github.com/Luiz-Monad/docker-python/commit/fe752b6410d1defc6e42400120f2605bf9d6f3e7#diff-b3c0fcf605d397bd3fce8ea61588d2c9da8b24a9fe6886b2e640d97914f1d01f

Here is how you're supposed to use it: https://github.com/docker-library/python/pull/701