daniel5151 / gdbstub

An ergonomic, featureful, and easy-to-integrate implementation of the GDB Remote Serial Protocol in Rust (with no-compromises #![no_std] support)

NoActiveThreads error when there are no active threads #122

Closed xobs closed 1 year ago

xobs commented 1 year ago

When no threads are reported as part of list_active_threads(), gdbstub throws an error: "gdbstub error during idle operation: NoActiveThreads". A comment notes that this shouldn't happen, but it appears that it does.

Background

I've recently picked up gdbstub again and am working to integrate it into Xous more fully. It really is a fantastic project.

I have things mostly working now, and I'm filing off some of the rough edges. One of the rough edges is process selection. I have a monitor command to allow switching between processes, but the question arises of what process to connect to initially.

I support having no process selected (None), which is what I'd like as the initial state. This coincides with a system that's fully running and does not halt when a debugger is attached.

However, returning early in list_active_threads() before adding any threads appears to result in this error.
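
The situation described above can be sketched with a small stand-in. This is not the gdbstub API: `Target`, its `selected` field, and `count_threads` are hypothetical names, and `Tid` is assumed to be a non-zero integer. The callback style only loosely mirrors a thread-enumeration hook.

```rust
use core::num::NonZeroUsize;

/// Stand-in for a thread ID (assumption: a non-zero integer).
type Tid = NonZeroUsize;

/// Hypothetical debug target: either no process is selected (`None`),
/// or one process is selected and its thread IDs are known.
struct Target {
    selected: Option<Vec<Tid>>,
}

impl Target {
    /// Reports every active thread through `register_thread`.
    /// With no process selected, it returns without reporting any thread.
    fn list_active_threads(&self, register_thread: &mut dyn FnMut(Tid)) {
        let Some(threads) = &self.selected else {
            return; // early return: zero threads are reported
        };
        for &tid in threads {
            register_thread(tid);
        }
    }
}

/// Counts how many threads the target reports.
fn count_threads(target: &Target) -> usize {
    let mut n = 0;
    target.list_active_threads(&mut |_| n += 1);
    n
}
```

A caller that assumes `count_threads` is always at least one has no valid thread to fall back on when `selected` is `None`, which is the case that triggers the error reported here.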

daniel5151 commented 1 year ago

Hey, long time no see!

Glad to hear you're circling back to gdbstub again - I'm always excited to see the work you come up with running things in no_std.


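A few immediate thoughts:

  • Can you share the trace output from gdbstub that led to this error?
  • If you're interested in properly supporting multi-process debugging, this might be a good opportunity to take a stab at #124 (something I've had as a TODO in the README for years now, but never opened a tracking issue for)
  • In the meantime... can you share some more context on how your custom monitor command works, and what your "processes" look like (i.e: do they support multiple threads each? or is each thread a process? etc...)
  • Have you explored the ExtendedMode extensions? From the GDB docs:
    • With target extended-remote mode: When the debugged program exits or you detach from it, GDB remains connected to the target, even though no program is running

  • Presumably, it's possible to connect to a stock gdbserver from gdb with no processes attached. I'd be interested to see the GDB RSP logs that occur during that connection. Could you help out with that investigation?
    • i.e: open gdb, run set debug remote 1 (to dump RSP packets gdb sends/recvs), and then connect to gdbserver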

xobs commented 1 year ago

I'm traveling, so unfortunately I won't be able to do any extra work on this for one week. Unfortunately the debugger also exploded the night before I departed so there isn't any code posted online.

I'll have more concrete information after March 8th.

The sequence of events was that I connected to the target and then had my thread enumeration function return immediately - since there was no process selected and therefore no threads. I may have tried to continue execution, but I can't recall.

#124 sounds suspiciously useful and applicable.

Processes in this system are composed of multiple threads. Currently I have a monitor command that prints out the process table, and if you specify a pid it will switch to debugging that process. Actually, I did get that working before I left so I can just link the code: https://github.com/betrusted-io/xous-core/blob/18a40111cb5a20be55b775959b2b4d49e1f3363d/kernel/src/debug/gdb.rs#L464

One thing about this approach is that you have to do "info thr" and select a thread after switching processes. It'd be nice if that wasn't required.

I haven't looked at extended mode since that seemed to be too closely related to POSIX and semihosting and all that, but I can take another look.

On 1 March 2023 1:51:44 am SGT, Daniel Prilik wrote:

Hey, long time no see!

Glad to hear you're circling back to gdbstub again - I'm always excited to see the work you come up with running things in no_std.


A few immediate thoughts:

  • Can you share the trace output from gdbstub that led to this error?
  • If you're interested in properly supporting multi-process debugging, this might be a good opportunity to take a stab at #124 (something I've had as a TODO in the README for years now, but never opened a tracking issue for)
  • In the meantime... can you share some more context on how your custom monitor command works, and what your "processes" look like (i.e: do they support multiple threads each? or is each thread a process? etc...)
  • Have you explored the ExtendedMode extensions? From the GDB docs:
    • With target extended-remote mode: When the debugged program exits or you detach from it, GDB remains connected to the target, even though no program is running

  • Presumably, it's possible to connect to a stock gdbserver from gdb with no processes attached. I'd be interested to see the GDB RSP logs that occur during that connection. Could you help out with that investigation?
    • i.e: open gdb, run set debug remote 1 (to dump RSP packets gdb sends/recvs), and then connect to gdbserver


daniel5151 commented 1 year ago

All good, no rush or anything! Let me know once you're back and we can start tackling this thing.

I do strongly suspect that standing up proper multiprocess debugging is the real solution to your problems here. Aside from the ideological purity of it, it'd also fix your "info thr" problem, since GDB would natively understand when you're switching between processes.

That said - if that ends up becoming too daunting of a task, we can also dig into trying to mitigate the problem at hand (i.e: the "NoActiveThreads" error). Aside from seeing if ExtendedMode helps, there is also a (currently unimplemented) thread stop reason N that signals to GDB that there are no resumed threads left in the target.

It might be possible to mitigate this immediate issue by adding some logic to the initial ? response to query the current active thread count, and return N if no threads are live. That's just a theory of course - hence why I'm interested to see both your current GDB RSP logs, and those from a stock gdbserver exhibiting the desired behavior.
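
A minimal sketch of that theory, assuming a hypothetical helper `initial_stop_reply` that is not part of gdbstub. It returns only the reply body, without the `$`/`#xx` framing. The `N` reply is chosen only when the client advertised `no-resumed+` in qSupported. The `W00` fallback is borrowed from what stock gdbserver answers in the log later in this thread, and `S05` (stopped with signal 5) is just a placeholder for the ordinary case.

```rust
/// Picks a reply body for the initial `?` query.
///
/// `active_threads` is the number of threads the target currently reports.
/// `no_resumed_supported` says whether the client sent `no-resumed+`.
/// Both parameter names are illustrative assumptions.
fn initial_stop_reply(active_threads: usize, no_resumed_supported: bool) -> &'static str {
    match (active_threads, no_resumed_supported) {
        // No threads, and the client understands the `N` stop reply.
        (0, true) => "N",
        // No threads, older client: report "process exited with status 0".
        (0, false) => "W00",
        // At least one thread: report a stop with signal 5 (SIGTRAP).
        _ => "S05",
    }
}
```

Whether `N` or `W00` is the better answer to the initial `?` is exactly what the requested RSP logs are meant to settle, so the branch order here is a guess rather than a recommendation.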

xobs commented 1 year ago

Here is the output from gdb when I don't create a fake thread:

(gdb) tar ext :3456
Remote debugging using :3456
Sending packet: $qSupported:multiprocess+;swbreak+;hwbreak+;qRelocInsn+;fork-events+;vfork-events+;exec-events+;vContSupported+;QThreadEvents+;no-resumed+#df...Ack
Packet received: PacketSize=1000;vContSupported+;multiprocess+;QStartNoAckMode+;swbreak+;qXfer:features:read+
Packet qSupported (supported-packets) is supported
Sending packet: $vMustReplyEmpty#3a...Ack
Packet received:
Sending packet: $QStartNoAckMode#b0...Ack
Packet received: OK
Sending packet: $!#21...Packet received:
Sending packet: $Hgp0.0#ad...Timed out.
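
For readers unfamiliar with the packet framing in this log: each packet is `$`, a payload, `#`, and a two-digit hex checksum equal to the sum of the payload bytes modulo 256. The sketch below reproduces the checksums seen above. The function names are illustrative, and escaping of special payload bytes is deliberately left out.

```rust
/// Computes the RSP checksum: the sum of all payload bytes, modulo 256.
fn rsp_checksum(payload: &str) -> u8 {
    payload.bytes().fold(0u8, |acc, b| acc.wrapping_add(b))
}

/// Frames a payload as `$<payload>#<two lowercase hex digits>`.
/// Assumes the payload needs no escaping.
fn frame_packet(payload: &str) -> String {
    format!("${}#{:02x}", payload, rsp_checksum(payload))
}
```

Framing `Hgp0.0` this way yields `$Hgp0.0#ad`, matching the last packet gdb sent before the timeout.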

xobs commented 1 year ago

In contrast, gdbserver returns E01 when gdb sends $Hgp0.0#ad:

  [remote] Sending packet: $Hgp0.0#ad
  [remote] Packet received: E01
  [remote] Sending packet: $qXfer:features:read:target.xml:0,1000#0c
  [remote] Packet received: E01
  [remote] Sending packet: $qXfer:auxv:read::0,1000#6b
  [remote] Packet received: E01
  [remote] Sending packet: $QNonStop:0#8c
  [remote] Packet received: OK
  [remote] Sending packet: $qTStatus#49
  [remote] Packet received: T0;tnotrun:0;tframes:0;tcreated:0;tfree:500000;tsize:500000;circular:0;disconn:0;starttime:0;stoptime:0;username:;notes::
  [remote] packet_ok: Packet qTStatus (trace-status) is supported
  [remote] Sending packet: $qTfV#81
  [remote] Packet received: 1:0:1:74726163655f74696d657374616d70
  [remote] Sending packet: $qTsV#8e
  [remote] Packet received: l
  [remote] Sending packet: $?#3f
  [remote] Packet received: W00
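
To make the `Hgp0.0` exchange concrete, here is a sketch of how a stub might parse that packet and mimic the E01 reply shown above. With multiprocess enabled, the thread-id has the form `p<pid>.<tid>`, where each part is hex, `0` means "any", and `-1` means "all". All names below are hypothetical, and replying E01 whenever there are no threads is an assumption drawn from this log, not documented gdbserver behavior.

```rust
/// One component of a thread-id.
#[derive(Debug, PartialEq)]
enum IdKind {
    All,     // "-1"
    Any,     // "0"
    Id(u64), // any other hex value
}

/// Parses a single pid or tid component.
fn parse_id(s: &str) -> Option<IdKind> {
    if s == "-1" {
        return Some(IdKind::All);
    }
    match u64::from_str_radix(s, 16).ok()? {
        0 => Some(IdKind::Any),
        n => Some(IdKind::Id(n)),
    }
}

/// Parses `H<op>p<pid>.<tid>` into (op, pid, tid).
fn parse_h_packet(body: &str) -> Option<(char, IdKind, IdKind)> {
    let rest = body.strip_prefix('H')?;
    let mut chars = rest.chars();
    let op = chars.next()?;
    let (pid, tid) = chars.as_str().strip_prefix('p')?.split_once('.')?;
    Some((op, parse_id(pid)?, parse_id(tid)?))
}

/// Chooses a reply body: E01 when no thread exists to select,
/// OK otherwise, and an empty reply for anything unparseable.
fn reply_to_h(body: &str, active_threads: usize) -> &'static str {
    match parse_h_packet(body) {
        Some(_) if active_threads == 0 => "E01",
        Some(_) => "OK",
        None => "",
    }
}
```

Under these assumptions, `Hgp0.0` parses as op `g` with "any" for both pid and tid, and a target with zero threads answers E01 instead of leaving gdb to time out.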

daniel5151 commented 1 year ago

Fascinating... I gather two important observations from these logs:

Looking through the code, I believe this should be a relatively straightforward thing to fix:

...and I think that's it really.

gdbserver also returns E01 for the qXfer queries... but I don't think that's something to worry about just yet*. I could be wrong though - we'll see...

Can you make those tweaks, and see if that fixes things? I'd do it myself, but as it happens, I'm actually on vacation for the next ~1.5 weeks, so idk how much hands-on coding time I'll have 😅


* I suspect it'll only start to matter if gdbstub ever supports multi-process, multi arch debugging. i.e: IIRC, gdbserver on x64 platforms can multi-process debug x64 and 32-bit x86 processes at the same time. But that's something that's not really on gdbstub's radar in the near future, so it's fine to not worry about it for now...

xobs commented 1 year ago

No hurry on this, either. I'm not yet familiar enough with the inner workings to know where to patch it. And if we wait until you're back, then you don't have to think about this on your holiday.

daniel5151 commented 1 year ago

Oh, don't worry about that - gdbstub is my personal side project, so I'm more than happy to reply to issues and review code! In fact, there's a non-zero chance I'll find some downtime in the coming week or so to swing by a cafe and write some code myself...

That said, if you would like to lend a hand and take a stab at this fix yourself: the Rust compiler will totally enumerate all "downstream" modification points once get_any_sane_tid has been tweaked to return an Option<Tid>. i.e: after executing on that first bullet point, the compiler will point you at the exact code to change in order to close out any remaining bullet points.
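
The compiler-driven approach described above can be illustrated in isolation. This is a sketch, not the gdbstub source: only the name get_any_sane_tid and the Option<Tid> return type come from this thread. The slice parameter, the `Tid` alias, and `select_default_thread` are assumptions made so the example stands alone.

```rust
use core::num::NonZeroUsize;

/// Stand-in for a thread ID (assumption: a non-zero integer).
type Tid = NonZeroUsize;

/// Returns some reported thread, or `None` when the target reports none.
/// Before the change, a function like this would have had to return a
/// `Tid` unconditionally or fail with an error.
fn get_any_sane_tid(active: &[Tid]) -> Option<Tid> {
    active.first().copied()
}

/// Hypothetical caller. Because the return type is now `Option<Tid>`,
/// this function does not compile until the `None` case is handled,
/// here by mapping it to an `E01` error reply.
fn select_default_thread(active: &[Tid]) -> Result<Tid, &'static str> {
    get_any_sane_tid(active).ok_or("E01")
}
```

Every call site that previously used the returned `Tid` directly becomes a type error after the signature change, which is how the compiler ends up listing the remaining places to update.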

xobs commented 1 year ago

I went ahead and made the change, and it does seem to fix the case where there are no threads. The ergonomics are better, as gdb now won't let you do anything until you select a process and list threads:

(gdb) tar ext :3456
Remote debugging using :3456
(gdb) info thr
No threads.
(gdb) c
The program is not being run.
(gdb) mon pr
Available processes:
   1   kernel
   2   xous-ticktimer
   3   xous-log
   4   xous-names
   5   xous-susres
   6   graphics-server
   7   keyboard
   8   spinor
   9   llio
  10   com
  11   net
  12   dns
  13   gam
  14   ime-frontend
  15   ime-plugin-shell
  16   codec
  17   modals
  18   root-keys
  19   trng
  20   sha2
  21   engine-25519
  22   jtag
  23   status
  24   shellchat
  25   pddb
  26   usb-device-xous
(gdb) mon pr 2
Now debugging PID 2
(gdb) c
The program is not being run.
(gdb) info thr
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
[New Thread 1.2]
[New Thread 1.3]
  Id   Target Id         Frame
* 1    Thread 1.2        0x000331c8 in ?? ()
  2    Thread 1.3        0x000345b2 in ?? ()
(gdb) thr 3
Unknown thread 3.
(gdb) thr 2
[Switching to thread 2 (Thread 1.3)]
#0  0x000345b2 in ?? ()
(gdb) c
Continuing.

(There's still a bit of an oddity in that the thread IDs don't match up with the target IDs, but that's definitely a different issue.)