bloomberg / memray

Memray is a memory profiler for Python
https://bloomberg.github.io/memray/
Apache License 2.0
13.17k stars 392 forks source link

Override dlsym instead of dlopen to correctly honour RPATH/RUNPATHS #525

Closed pablogsal closed 7 months ago

pablogsal commented 8 months ago

We need to override dlopen/dlclose to account for new shared libraries being loaded in the process memory space. This is needed so we can correctly track allocations in those libraries by overriding their PLT entries and also so we can properly map the addresses of the symbols in those libraries when we resolve later native traces. Unfortunately, we can't just override dlopen directly because of the following edge case: when a shared library dlopen's another by name (e.g. dlopen("libfoo.so")), the dlopen call will honor the RPATH/RUNPATH of the calling library if it's set. Some libraries set an RPATH/RUNPATH based on $ORIGIN (the path of the calling library) to load dependencies from a relative directory based on the location of the calling library. This means that if we override dlopen, we'll end up loading the library from the wrong path or more likely, not loading it at all because the dynamic loader will think the memray extenion it's the calling library and the RPATH of the real calling library will not be honoured.

To work around this, we override dlsym instead and override the symbols in the loaded libraries only the first time we have seen a handle passed to dlsym. This works because for a symbol from a given dlopen-ed library to appear in a call stack, something from that library has to be dlsym-ed first. The only exception to this are static initializers, but we cannot track those anyway by overriding dlopen as they run within the dlopen call itself.

Closes: #212

codecov-commenter commented 8 months ago

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (41248ed) 92.55% compared to head (422a0ee) 92.86%. Report is 6 commits behind head on main.

Files Patch % Lines
tests/integration/test_extensions.py 90.47% 2 Missing :warning:
tests/conftest.py 75.00% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #525 +/- ## ========================================== + Coverage 92.55% 92.86% +0.30% ========================================== Files 91 91 Lines 11304 11136 -168 Branches 1581 2026 +445 ========================================== - Hits 10462 10341 -121 + Misses 837 795 -42 + Partials 5 0 -5 ``` | [Flag](https://app.codecov.io/gh/bloomberg/memray/pull/525/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=bloomberg) | Coverage Δ | | |---|---|---| | [cpp](https://app.codecov.io/gh/bloomberg/memray/pull/525/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=bloomberg) | `92.86% <95.16%> (+6.92%)` | :arrow_up: | | [python_and_cython](https://app.codecov.io/gh/bloomberg/memray/pull/525/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=bloomberg) | `92.86% <95.16%> (-2.86%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=bloomberg#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

godlygeek commented 8 months ago

Hm. What I pushed seems to work on glibc Linux and macOS, but I think it's hanging on musl linux. I'll probably need to spin up a container to debug that. 😔