Also, I tried downloading lldb and using that instead. With lldb, it hangs instead of failing immediately like gdb does.
oh also, I think this environment (it is our CI) may use dind (docker in docker).
Hm. It's not a problem that it couldn't find malloc.c. It doesn't need it, and that shouldn't make it fail.
But, there should be more output after that. When I follow your reproduction steps above, sometimes everything works fine, and the output shows:
Thread 29 "worker.io" hit Breakpoint 1, __GI___libc_malloc (bytes=88) at malloc.c:3023
3023 malloc.c: No such file or directory.
$8 = (void *) 0x7f877003ce50
$9 = 0x0
[New Thread 0x7f865bfff700 (LWP 7751)]
$10 = "SUCCESS"
Other times it fails, with output exactly like you show. It looks like the problem is in gdb. The gdb script that we're running executes this command:
commands 1 2 3 4 5 6 7 8
Which the help built in to the gdb in Focal says is supported syntax!
(gdb) help commands
Set commands to be executed when the given breakpoints are hit.
Give a space-separated breakpoint list as argument after "commands".
A list element can be a breakpoint number (e.g. `5') or a range of numbers
(e.g. `5-7').
With no argument, the targeted breakpoint is the last one set.
The commands themselves follow starting on the next line.
Type a line containing "end" to indicate the end of them.
Give "silent" as the first line to make the breakpoint silent;
then no output is printed when it is hit, except what the commands print.
Despite what the help in GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2 says, the space-separated list is not actually working: the commands are only getting associated with the first breakpoint, and not the other 7. If I change that line to use the breakpoint range 1-8 instead, things seem to work properly, though I still see an intermittent issue with reporting stack traces that I need to dig into further...
If you want to try that change out yourself, the relevant file is memray/commands/_attach.gdb
The lldb one is even weirder. The debugger is exiting without resuming the process that it was attached to. If you manually resume the process by sending it a SIGCONT signal, everything works just fine.
That workaround seems horrific, and this feels like it must be an lldb bug (possibly unique to --batch mode?), but I haven't yet managed to find any other reports of it...
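For anyone who needs that stopgap in the meantime, sending the signal can be sketched as below. This is only an illustration of the workaround described above: `resume_target` is a hypothetical helper name, not part of Memray or lldb, and the PID is whichever process `memray attach` was pointed at.

```shell
#!/bin/sh
# Hypothetical helper: resume a process that the debugger left stopped.
# Usage: resume_target <pid>
resume_target() {
    pid="$1"
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process (or no permission): $pid" >&2
        return 1
    fi
    # SIGCONT resumes a stopped process and is harmless if it is already running.
    kill -CONT "$pid"
}
```

With the PID from the original report, that would be `resume_target 3803`, run after the debugger has exited.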
I've entered https://github.com/llvm/llvm-project/issues/60408 for what appears to be an lldb bug explaining the hang you get with lldb.
@godlygeek thanks for looking into the issue! Oddly, the issue never (or at least very rarely) occurs when I test on my Mac (not sure if this helps your debugging though). Maybe GDB commands work differently between Mac and Linux?
If I change that line to breakpoint 1-8, things seem to work properly, though I still see an intermittent issue with reporting stack traces that I need to dig into further...
Sounds good! Let me know when the fix is merged! I'd love to try it out! We are experimenting with introducing memray for runtime memory profiling of ray (https://github.com/ray-project/ray/), and this issue seems to be the biggest blocker, because I currently cannot run the tests in CI at all :(
I've confirmed that the gdb behavior is a bug - and it's one that was fixed upstream a few years back, by da1df1d - but of course, Focal is a few years old now, too, so it doesn't have that fix.
It seems that the workaround of using a range instead of a list of breakpoint numbers happens to work mostly by chance, but it does work, because of the different way in which ranges of breakpoint numbers are processed.
Awesome! Will this fix be included in the next release? When do you guys plan to have one?
I will also try verifying it in my environment. Can you tell me how to download a wheel built from the latest master? (Or should I wait for the next release?)
Hey @rkooo567 - this will be included in the next release, which I hope will happen within the next few weeks. We don't have wheels of master, but since this change is in a text file, you can just patch it into the latest Memray wheel yourself:
$ python3.7 -m pip download memray --no-deps
$ unzip ./memray-1.6.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
$ sed -i 's/1 2 3 4 5 6 7 8/1-8/' memray/commands/_attach.gdb
$ zip -r memray-1.6.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl */
That asks pip to download the wheel for the last release (this is Python version specific, so since you used Python 3.7 in your bug report that's what I used here), then it unzips the wheel (wheels are just zip files in disguise), uses sed to fix up the GDB script, and zips the wheel back up.
If you feel like testing that, I'd love confirmation that it works for you, but if not, I anticipate a release with the fix should be coming pretty soon, once we finish a few other features that we're working on finalizing.
(Or instead of modifying the wheel, you can of course just install the wheel and then use sed to just patch the gdb script after it's been installed!)
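A rough sketch of that second option follows. It assumes the script is installed as commands/_attach.gdb inside the memray package directory (the same layout as in the wheel) and that python3.7 is the interpreter memray was installed into. `patch_attach_script` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the space-separated breakpoint list in a gdb
# script to the equivalent range, mirroring the sed command used on the wheel.
# Usage: patch_attach_script <path-to-_attach.gdb>
patch_attach_script() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi
    # GNU sed syntax, as in the wheel example; BSD/macOS sed needs -i ''.
    sed -i 's/1 2 3 4 5 6 7 8/1-8/' "$script"
}

# Only attempt the patch when memray is importable by this interpreter.
# python3.7 matches the bug report; substitute the interpreter you use.
if pkg_dir=$(python3.7 -c 'import os, memray; print(os.path.dirname(memray.__file__))' 2>/dev/null); then
    patch_attach_script "$pkg_dir/commands/_attach.gdb"
fi
```

Running it a second time does nothing, since the pattern no longer matches once the list has been replaced by the range.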
I will probably try this out next week! Looking forward to the next release
The 1.7.0 release from last week includes this fix.
Let us know if you're still seeing any issues with that version, @rkooo567!
Is there an existing issue for this?
Current Behavior
When I ran
$(which memray) attach -o /tmp/pytest-of-root/pytest-5/test_memory_profiler0/profile/3009_memory_profiling.bin --verbose 3803
inside an Ubuntu container, it fails with the following error.
Expected Behavior
It should attach successfully (It worked on my local Mac).
Steps To Reproduce
python 3.7.9
pip install ray
Run ray start --head
python test.py
ray list actors and get the pid. Note that a ray actor is just a python process.
$(which memray) attach -o /tmp/pytest-of-root/pytest-5/test_memory_profiler0/profile/3009_memory_profiling.bin --verbose
We use the container based off of ubuntu:focal
Some deps we downloaded;
Memray Version
1.6.0. But I also tried 1.5.0
Python Version
3.7
Operative System
Linux
Anything else?
No response