GDB hangs stepping with ARAnyM-JIT

freemint / m68k-atari-mint-binutils-gdb

Fork of sourceware's binutils-gdb with support for the m68k-atari-mint target.

https://github.com/freemint/m68k-atari-mint-binutils-gdb/wiki

GNU General Public License v2.0


GDB hangs stepping with ARAnyM-JIT #9

Open vinriviere opened 10 months ago

vinriviere commented 10 months ago

I use GDB to debug a simple program with ARAnyM-JIT on Windows. I start with "b main", then "run", then "n", Enter, Enter, Enter...

After a few steps:

- GDB hangs, keystrokes do nothing
- ARAnyM-JIT takes all the CPU on a single core. I can hear the fan spinning fast.
- the text cursor still blinks

This hang doesn't happen with non-JIT ARAnyM.

@th-otto Do you think this is expected? When a breakpoint is set, or "n" is typed, GDB replaces the next location with an ILLEGAL instruction. Maybe such heavy dynamic code patching causes trouble for the JIT?

th-otto commented 10 months ago

Yes, I would say this is expected. Using the JIT version of ARAnyM when running GDB will not work. Once the code has been compiled, patching the original m68k code with an illegal instruction will not be noticed.

Using all the CPU time is also quite normal. The only time ARAnyM is throttled down is when the program executes a STOP instruction. That happens e.g. with EmuTOS in evnt_multi(), Bconin() etc., but not with other TOS.

vinriviere commented 10 months ago

OK thanks. So GDB+JIT is a no-go. Not a real problem, as long as it is documented. What was puzzling is that "n" worked a few times, then finally failed with a hang. It would be nice if ARAnyM-JIT could detect such a situation and display an error message instead of misbehaving. Something like detecting writes to the TEXT segment after the initial relocation has occurred. Or something like that.

When I speak about 100% CPU, I report what is happening during the faulty "n" command. Normally, "n" is very quick as it executes only one instruction. But in the case of the above JIT bug, "n" seems to cause an internal infinite loop.

So regarding this issue, for now let's just say that GDB and ARAnyM-JIT are incompatible. Use standard non-JIT ARAnyM for debugging. We can live with that.

th-otto commented 10 months ago

> What was puzzling is that "n" worked a few times, then finally failed with a hang.

That is because ARAnyM first executes the code using emulation, before it gets compiled. During those first runs, the code is analyzed for the compiler.
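The "works a few times, then hangs" behaviour can be illustrated with a toy model. This is only a sketch under stated assumptions, not ARAnyM's actual design: the `ToyJit` class, the opcode strings and the `HOT_THRESHOLD` value are all invented for illustration. The point it shows is that the interpreted path reads live memory (so a breakpoint patch is seen), while the compiled path keeps using what it captured at compile time.

```python
# Toy model only: not ARAnyM's real design. Opcodes and the threshold are invented.
ILLEGAL = "illegal"      # stands in for the breakpoint opcode GDB writes
HOT_THRESHOLD = 3        # hypothetical: interpreted runs before compiling


class ToyJit:
    def __init__(self, memory):
        self.memory = memory     # address -> opcode, the guest "m68k code"
        self.run_count = {}      # address -> number of interpreted runs
        self.compiled = {}       # address -> opcode frozen at compile time

    def execute(self, addr):
        """Run the instruction at addr and return what the CPU actually did."""
        if addr in self.compiled:
            # Compiled path: uses the frozen copy and never re-reads memory.
            return self.compiled[addr]
        opcode = self.memory[addr]           # interpreted path reads live memory
        self.run_count[addr] = self.run_count.get(addr, 0) + 1
        if self.run_count[addr] >= HOT_THRESHOLD:
            self.compiled[addr] = opcode     # "compile" by freezing the opcode
        return opcode


def demo_stale_patch():
    jit = ToyJit({0x1000: "nop"})
    seen = []
    jit.memory[0x1000] = ILLEGAL             # debugger inserts a breakpoint
    seen.append(jit.execute(0x1000))         # interpreted: breakpoint is hit
    jit.memory[0x1000] = "nop"               # debugger restores the original
    seen.append(jit.execute(0x1000))         # interpreted run 2
    seen.append(jit.execute(0x1000))         # interpreted run 3, then compiled
    jit.memory[0x1000] = ILLEGAL             # debugger patches again
    seen.append(jit.execute(0x1000))         # compiled: the patch is invisible
    return seen
```

In this model the last call returns the old opcode, so the breakpoint the debugger is waiting for never fires.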

> Something like detecting writes to the TEXT segment after the initial relocation has occurred

Yes, I have to check that. Theoretically, if GDB flushes the instruction cache after setting/removing a breakpoint (which must be done on real hardware anyway), the compiled code should be thrown away by ARAnyM. But it still would not work reliably, I think; the compiled code cannot report the exact instruction PC when reaching the breakpoint, and most likely cannot continue when GDB re-inserts the original instruction.
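The cache-flush idea can be modelled in a few lines. This is a sketch only: `TranslationCache` and `flush_instruction_cache` are hypothetical names, not ARAnyM functions, and the model deliberately ignores the exact-PC problem mentioned above.

```python
# Toy model only; class and method names are hypothetical, not ARAnyM's API.
class TranslationCache:
    def __init__(self, memory):
        self.memory = memory      # address -> opcode
        self.compiled = {}        # address -> opcode frozen at compile time

    def execute(self, addr):
        # Compile on first use, then keep serving the frozen copy.
        if addr not in self.compiled:
            self.compiled[addr] = self.memory[addr]
        return self.compiled[addr]

    def flush_instruction_cache(self):
        # What the emulator could do when the guest flushes its instruction
        # cache: drop every translation so code is re-read from memory.
        self.compiled.clear()


def demo_flush():
    cache = TranslationCache({0x2000: "move"})
    results = [cache.execute(0x2000)]        # compiles "move"
    cache.memory[0x2000] = "illegal"         # debugger inserts a breakpoint
    results.append(cache.execute(0x2000))    # stale: still "move"
    cache.flush_instruction_cache()          # debugger flushes the cache
    results.append(cache.execute(0x2000))    # fresh: now "illegal"
    return results
```

The breakpoint only becomes visible after the flush, which is why this approach depends on the debugger (or kernel) actually issuing one.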

It would also be interesting to check whether GDB works with the new QEMU-based emulator.

th-otto commented 6 months ago

A few more thoughts about that:

- There is already an existing NF_CONFIG feature that allows disabling/enabling JIT mode. There are several ways to use it:
  - we could add a patch to GDB to use this, prior to loading or starting the program
  - we could add a patch to the MiNT kernel that invokes this when a program is going to be traced
  - you could call an external tool that makes use of that feature (such a tool already exists)
- The GUI of ARAnyM could be changed so you can disable JIT without having to restart ARAnyM
- The JIT compiler of ARAnyM could be changed to check the TRACE flag in the status register, and fall back to CPU emulation

I would actually prefer the last, but it might need some work. Big advantage: JIT could be disabled only while tracing the program, but stay enabled for the rest of the system. I recently tried to debug scummvm, which needs to load and process ~190MB of debug info, and that is a real pain even on ARAnyM without JIT.
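As a rough illustration of the TRACE-flag option, the dispatch decision might look like the sketch below. The trace bits follow the 68020+ status register layout (T1 = bit 15, T0 = bit 14); the function names and the string results are hypothetical and do not come from the ARAnyM sources.

```python
# Sketch only. Trace bits follow the 68020+ status register layout
# (T1 = bit 15, T0 = bit 14); the function names are hypothetical.
SR_TRACE_MASK = 0xC000


def tracing(sr):
    """Return True if either trace bit is set in the status register."""
    return (sr & SR_TRACE_MASK) != 0


def choose_engine(sr, has_compiled_block):
    """Pick how to run the next block of guest code."""
    if tracing(sr):
        return "interpreter"      # single-stepping needs per-instruction state
    if has_compiled_block:
        return "jit"
    return "interpreter"          # block has not been compiled yet
```

Because the check is per block and per status register value, only the traced program would lose JIT speed; code running with the trace bits clear would still use compiled blocks.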

mikrosk commented 6 months ago

I'm also in favour of the last option; it sounds the most bulletproof to me.

th-otto commented 6 months ago

The problem with that is that it might not be sufficient. If you just set a breakpoint, then the program will run without the trace flag being set. ARAnyM would also have to catch the case where the breakpoint instruction is written into the code, and I currently have no idea how to achieve that cleanly.
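For illustration only, catching such writes could mean dropping any compiled block that a guest write overlaps, along the lines of the earlier "detect writes to the TEXT segment" suggestion. The `WatchedCodeCache` class below is a hypothetical sketch, not an existing ARAnyM mechanism, and it ignores the cost of checking every write, which is presumably the hard part in a real JIT.

```python
# Hypothetical sketch: invalidate translations when a guest write hits them.
class WatchedCodeCache:
    def __init__(self):
        self.blocks = {}          # start address -> (end address, translation)

    def add_block(self, start, end, translation):
        # Register a compiled block covering the half-open range [start, end).
        self.blocks[start] = (end, translation)

    def on_guest_write(self, addr, size):
        """Drop every compiled block overlapping [addr, addr + size)."""
        hit = [start for start, (end, _) in self.blocks.items()
               if start < addr + size and addr < end]
        for start in hit:
            del self.blocks[start]
        return hit                # start addresses of the dropped blocks
```

A debugger writing a two-byte ILLEGAL opcode into a compiled range would then force that range back to interpretation, while writes to data leave the cache untouched.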