Closed brndd closed 3 months ago
OK, let's start with the small stuff.
gdb looks to work fine. I can't get it to auto-pause at the main symbol like edb does, but both running new processes and attaching to existing ones, and then pausing and resuming them, seem to work fine. I'm not getting any SELinux denials, and I also tried with setenforce 0, which made no difference.
With an already running process, here's what happens when building master in the debug configuration:
edb
This is both odd and interesting. I can't say that I've encountered that kind of issue before. Initially, I was wondering if somehow the setting of the initial breakpoint was the problem, because looking at the debuggee stack trace:
Module echo from rpm coreutils-9.3-5.fc39.x86_64
Stack trace of thread 1146806:
#0 0x0000560bfa4a14e1 main (echo + 0x24e1)
#1 0x00007ff95f99514a __libc_start_call_main (libc.so.6 + 0x2814a)
#2 0x00007ff95f99520b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2820b)
#3 0x0000560bfa4a2035 _start (echo + 0x3035)
ELF object binary architecture: AMD x86-64
It seems like it broke in main itself.
But that wouldn't explain why edb gets detached from already running processes it tries to resume... Perhaps I have it backwards?
I'm currently thinking that for some reason resuming is detaching edb from the process. The process then hits the breakpoint, but there's no debugger attached anymore, so it just crashes as if it had hit an int3 in code naturally.
Here's an experiment we can do.
Expected outcome: edb detaches, kate crashes
Ran that experiment, and yup -- the behaviour matched your expected outcome exactly. Kate crashed, and edb got detached.
Journal output including stacktrace below, in case you want to check it for clues (I don't see anything that looks useful).
edit: also, I forgot to mention the kernel I'm running. It's 6.7.9.
OK, seems like we're learning something, but it is definitely a curious mystery. The main thing I'm not understanding at the moment is that your core dump says: "The crashed process was ptraced - not saving the crash"
So that makes it sound like the OS still thinks that a debugger was attached. Very strange. I'll have to think on it. But also may just install Fedora 39 in a VM to see if I can replicate it.
Anything non-stock about your configuration?
Nothing in particular besides this being a somewhat old installation (dating back to Fedora 33 or something). Oh, and last I used edb a year or so back, it worked fine. But that was version 1.2.0, and I couldn't get it to compile with modern Fedora packages. I use the KDE spin, though I really wouldn't expect that to matter.
I will also try replicating this in a VM. Will report back.
OK, could not reproduce this in a VM. I did notice that Kate was a bit crash-happy in the VM too, presumably because of the many threads it uses, so I used nano (running in a separate terminal) for debugging instead.
On my host machine, nano makes for a consistent repro:
nano
in it
It doesn't crash in the VM using the same build of edb (from this COPR). It also doesn't crash on my laptop, which similarly runs Fedora 39 (it was slightly out of date, but it still works after updating).
To my understanding this would mean that edb managed to set a breakpoint but did not manage to catch it, so the program dies due to the uncaught signal. But curiously in this case the system journal still displays the line about the process being ptraced. And it's also weird that gdb works perfectly fine on my PC, even doing the same actions on the same executable.
Running cat /proc/$(pgrep nano)/status after step 3 above shows that edb's pid is attached to the process (TracerPid), and the process State is t (tracer stop). So edb is getting attached, but for some reason the tracer stop signal goes to the debuggee process...?
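For anyone repeating this check, here is a minimal Python sketch of the same /proc inspection. It only reads the TracerPid and State fields mentioned above; the helper names parse_proc_status and tracer_info are made up for illustration and are not part of edb or any library.

```python
import os


def parse_proc_status(text):
    """Parse the 'Key:<tab>value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def tracer_info(pid):
    """Return (TracerPid, State) for a process; TracerPid is 0 when untraced."""
    with open(f"/proc/{pid}/status") as status_file:
        fields = parse_proc_status(status_file.read())
    return int(fields["TracerPid"]), fields["State"]


if __name__ == "__main__":
    # Inspect this script's own process as a stand-in for the debuggee.
    tracer_pid, state = tracer_info(os.getpid())
    print(f"TracerPid={tracer_pid} State={state}")
```

A non-zero TracerPid means some process is attached via ptrace; comparing it with edb's pid shows whether edb is still the tracer at that moment.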
Very interesting, thanks for doing the experiments. I can say that one thing that I find puzzling is why it's only happening on a particular machine. I am worried that I won't be able to reproduce it properly.
One thing that's worth checking as well. Is it possible that there is some kind of plugin mismatch?
As in, a system installation, or even a previous one in a different location, but the settings are still pointing to the old plugins? I can imagine that would cause havoc.
Can you try doing something like this:
Make sure edb is closed, then
mv ~/.config/codef00.com/edb.conf ~/.config/codef00.com/edb.conf.bak
If you aren't doing a system-wide install, make sure there are no lingering plugins in /usr/local/lib/edb,
then re-run it so it generates a fresh configuration.
I had a look at the config file (didn't think about doing so before), and figured it out. And the cause was really stupid.
Under Preferences > Signals/Exceptions, I had ticked every signal, including SIGTRAP, to be ignored (passed to debuggee). Or presumably it was me; I only vaguely remember possibly having done this, but I don't think these would tick themselves so who else can I blame...
So edb was doing exactly what it was configured to do and passing SIGTRAP to the debugged program, which, being unhandled, would crash it. I unchecked all the boxes there and now debugging works again. :man_facepalming:
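To illustrate why a SIGTRAP that reaches the program is fatal, here is a small Python sketch. It does not involve edb or ptrace at all; it only assumes a POSIX system and shows that a process with the default SIGTRAP disposition is terminated by the signal. The function name run_and_send_sigtrap is hypothetical.

```python
import signal
import subprocess
import sys


def run_and_send_sigtrap():
    """Start an idle child process, send it SIGTRAP, and return its exit status."""
    # The child installs no SIGTRAP handler, so the default action applies.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    child.send_signal(signal.SIGTRAP)
    # subprocess reports death-by-signal as a negative return code.
    return child.wait()


if __name__ == "__main__":
    code = run_and_send_sigtrap()
    if code < 0:
        print(f"child was killed by {signal.Signals(-code).name}")
    else:
        print(f"child exited normally with status {code}")
```

This is the same situation the debuggee ends up in when the debugger is told to pass SIGTRAP through instead of consuming it: the breakpoint trap is delivered to a program that has no handler for it.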
Thanks for the help with the troubleshooting and sorry about the noise.
LOL, no worries at all, now if someone else has a similar issue, hopefully they can find this issue.
Maybe we should add a warning about SIGTRAP if it's checked ;-)
OS: Fedora 39, KDE Wayland, kernel 6.7.9
edb version: tried 1.4.0, 1.5.0 and master
Attempting to open any binary with edb causes the process to immediately segfault when resumed. When attaching to existing processes, edb seems to get detached from the process as soon as the process is resumed.
Not really sure how to troubleshoot this further.
Reproduction steps
echo or nano
When attaching to existing processes, debugging fails but the application doesn't seem to segfault.
Log output
Journal doesn't look to have anything interesting:
And neither does stdout for edb: