benfred / py-spy

Sampling profiler for Python programs
MIT License

Correctly parse libpython path from /proc/pid/maps for chrooted processes #562

Open Jongy opened 1 year ago


Closes: https://github.com/benfred/py-spy/issues/553

See my analysis in https://github.com/benfred/py-spy/issues/553#issuecomment-1455729122. My understanding is that /proc/pid/root points to the root as the process sees it, while /proc/pid/maps gives us paths as we (py-spy) see them. When py-spy runs outside the chroot, it is not chrooted itself, so the chroot prefix has to be stripped from the /proc/pid/maps value. As a generalization, we always strip the process's root path, which is just / when the process is not chrooted.
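
To make the path handling concrete, here is a minimal sketch of the stripping logic in Python. It is an illustration only, not the code in this branch (py-spy is written in Rust). The function name path_via_proc_root and its parameters are hypothetical, and it assumes the stripped path is then resolved through /proc/pid/root and that the root path comes from resolving that link.

```python
from pathlib import PurePosixPath


def path_via_proc_root(pid: int, maps_path: str, root_target: str) -> str:
    """Rewrite a path from /proc/<pid>/maps so it resolves via /proc/<pid>/root.

    root_target is the target's root directory as the profiler sees it:
    "/" for a normal process, or e.g. "/new_root" for a chrooted one.
    (Hypothetical helper for illustration; not py-spy's actual API.)
    """
    path = PurePosixPath(maps_path)
    try:
        # Strip the root prefix; for root_target == "/" this only drops
        # the leading slash.
        relative = path.relative_to(PurePosixPath(root_target))
    except ValueError:
        # The path is not under root_target: keep it, minus the leading slash.
        relative = PurePosixPath(maps_path.lstrip("/"))
    return str(PurePosixPath(f"/proc/{pid}/root") / relative)
```

For a real process, root_target could be obtained with os.readlink(f"/proc/{pid}/root"). With the chrooted example from this PR, a maps entry under /new_root/usr/local/lib maps to /proc/<pid>/root/usr/local/lib, instead of the nonexistent /proc/<pid>/root/new_root/usr/local/lib.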

I tested it with this application:

FROM python:3.8

RUN mkdir /new_root/
RUN cp -r /lib /new_root/lib
RUN cp -r /usr /new_root/usr
RUN cp -r /bin /new_root/bin
RUN cp -r /lib64 /new_root/lib64

CMD chroot /new_root /bin/bash -c 'LD_LIBRARY_PATH=/usr/local/lib exec /usr/local/bin/python3 -c "while 1: pass"'

Try to profile this Python app with py-spy from master, and you'll get Error: No such file or directory (os error 2) as seen in the ticket. Try with this branch and it works fine :)

Note that only the libpython path needs fixing, not the path of the python binary: since #364 we access /proc/pid/exe directly without readlinking it, which works whether or not a chroot is in effect. (I tested this as well: I profiled a chrooted app built without --enable-shared, and it works fine on master.)
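
As a rough illustration of the /proc/pid/exe point, the sketch below opens the link directly and lets the kernel resolve it, so no path string has to be interpreted relative to any root. It is written in Python for brevity and is not py-spy's code; read_exe_header is a hypothetical name, and it requires a Linux /proc filesystem.

```python
import os


def read_exe_header(pid: int, size: int = 4) -> bytes:
    """Read the first bytes of a process's executable via /proc/<pid>/exe.

    The link is opened directly rather than read with os.readlink(), so the
    kernel hands us the executable itself. (Hypothetical helper, Linux only.)
    """
    with open(f"/proc/{pid}/exe", "rb") as exe:
        return exe.read(size)
```

Calling read_exe_header(os.getpid()) on Linux returns the first bytes of the running interpreter's binary.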

I'm opening this as a draft because I still want to test a few more areas (for example, what if py-spy runs in the same chrooted container?), but overall this looks good :)