llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.11k stars 12.01k forks source link

[lldb] LLDB sends a QStartNoAckMode packet to remote GDB servers even if they don't support it #64872

Closed PoroCYon closed 1 year ago

PoroCYon commented 1 year ago

This issue has come up when testing the GDB-protocol server implementation for melonDS. LLDB sends a QStartNoAckMode packet as the very first command sent (as shown in the log in the linked issue). This shouldn't happen: the client first needs to send a qSupported packet to check whether the server supports no-ack mode. Only if both do, the client may send the QStartNoAckMode packet to switch to no-ack mode (c.f. GDB documentation):

If the stub supports ‘QStartNoAckMode’ and prefers to operate in no-acknowledgment mode, it should report that to GDB by including ‘QStartNoAckMode+’ in its response to ‘qSupported’; see qSupported. If GDB also supports ‘QStartNoAckMode’ and it has not been disabled via the set remote noack-packet off command (see Remote Configuration), GDB may then send a ‘QStartNoAckMode’ packet to the stub. Only then may the stub actually turn off packet acknowledgments. GDB sends a final ‘+’ acknowledgment of the stub’s ‘OK’ response, which can be safely ignored by the stub.

llvmbot commented 1 year ago

@llvm/issue-subscribers-lldb

jasonmolenda commented 1 year ago

Unfortunately it's not that simple. lldb added QStartNoAckMode first, without a qSupported entry, and gdb added the packet but also added the requirement that it be listed in qSupported: https://sourceware.org/legacy-ml/gdb-patches/2008-07/msg00152.html

Honestly I don't know why you would require a qSupported feature advertised for this. The gdb remote serial protocol has the long established behavior that the remote stub responds with empty string to any packet it doesn't recognize. You should be able to send qStartNoAckMode to any stub, get back "", and understand that this mode is not supported. Does melonDS not behave this way? You're going to find lots of lldb-created packets that it sends to a remote stub to see if they are supported. I know lldb interops with many JTAG/random stubs that don't support noack mode, I'm sure we handle an empty response to this packet correctly, don't we.

We could improve our interop here by having debugserver and lldb-server include QStartNoAckMode+ in their list of supported features, but lldb needs to interop correctly with debugservers going back 4-5 years which do not advertise this feature; not enabling noack mode would be a huge performance regression when communicating with those servers.

PoroCYon commented 1 year ago

I see, that's a rather annoying situation.

melonDS does respond with an empty packet on an unknown command (a, b), though a disconnect does seem to happen anyway. On this commit of melonDS (which doesn't implement QStartNoAckMode), lldb seems to fail:

$ lldb 
(lldb) gdb-remote localhost:3333
error: failed to get reply to handshake packet within timeout of 0,0 seconds

melonDS log:

[GDB] initializing GDB stub for core 9 on port 3333
--- dataoff=0
[GDB] recv() 20 bytes: '+$QStartNoAckMode#b0' (2b)
[GDB] recv() after skipping: n=19, recv_total=19
[GDB] got pkt, checksum: b0 vs b0
[GDB] command in: 'QStartNoAckMode'
[GDB] unknown command 'Q'!
[GDB] send resp: '$#00'
--- dataoff=0

i.e. no other packet is received other than the QStartNoAckMode one on the melonDS side.

EDIT: if you want to test melonDS, you can go to its src/debug/gdb_test folder and run a mock version of the GDB protocol implementation, which doesn't have the rest of the emulator attached. (It's how I generated the above log.)

jasonmolenda commented 1 year ago

I see the problem. The gdb remote serial protocol communications begin with a + character from lldb/gdb and the remote stub replies with + - this is the initial handshake. lldb is sending the initial handshake + and then immediately sending the QStartNoAckMode packet before it gets the response to the handshake packet. The remote stub sees the handshake and the packet received at the same time, and seemingly drops the +. Any packet lldb sent here would probably be mishandled.

jasonmolenda commented 1 year ago

e.g.

(lldb) log enable gdb-remote packets

(lldb) r
 <   1> send packet: +
 history[1] tid=0x0103 <   1> send packet: +
 <  19> send packet: $QStartNoAckMode#b0
 <   1> read packet: +
 <   6> read packet: $OK#9a
 <   1> send packet: +
 <  86> send packet: $qSupported:xmlRegisters=i386,arm,mips,arc;multiprocess+;fork-events+;vfork-events+#2e
 <  48> read packet: $qXfer:features:read+;PacketSize=20000;qEcho+#00
PoroCYon commented 1 year ago

Ah, I see, that does indeed seem to be a mistake on my end. Sorry.

jasonmolenda commented 1 year ago

well to be fair, I don't know if lldb should be sending its first packet before it sees the handshake packet reply. I've never seen it be a problem with other stubs, but it doesn't mean it's correct.

jasonmolenda commented 1 year ago

I think my memory of how the gdb remote serial protocol works before ack mode is disabled might not be so great. If I connect to a debugserver with telnet and send just a + ACK character, it doesn't reply. Looking at the code in lldb,

bool GDBRemoteCommunicationClient::HandshakeWithServer(Status *error_ptr) {
  ResetDiscoverableSettings(false);

  // Start the read thread after we send the handshake ack since if we fail to
  // send the handshake ack, there is no reason to continue...
  std::chrono::steady_clock::time_point start_of_handshake =
      std::chrono::steady_clock::now();
  if (SendAck()) {
    // The return value from QueryNoAckModeSupported() is true if the packet
    // was sent and _any_ response (including UNIMPLEMENTED) was received), or
    // false if no response was received. This quickly tells us if we have a
    // live connection to a remote GDB server...
    if (QueryNoAckModeSupported()) {
      return true;
[...]

SendAck sends the + and then we send the $QStartNoAckMode#b0 packet. The remote stub replies with +$OK#9a -- the ACK response to the initial + character, and the OK response to QStartNoAckMode.

PoroCYon commented 1 year ago

The remote stub replies with +$OK#9a -- the ACK response to the initial + character, and the OK response to QStartNoAckMode.

Isn't the + in the response from the server supposed to be an acknowledgement that it has received the command packet correctly?

By default, when either the host or the target machine receives a packet, the first response expected is an acknowledgment: either ‘+’ (to indicate the package was received correctly) or ‘-’ (to request retransmission).

jasonmolenda commented 1 year ago

tbh I haven't worked with an ack-mode remote stub in fifteen years, I'm definitely not an expert. I tried building debugserver without noack mode support, and the initial communication looks like

 <   1> send packet: +
 <  19> send packet: $QStartNoAckMode#b0

 <   1> read packet: +
 <   4> read packet: $#00
 <   1> send packet: +

 <  86> send packet: $qSupported:xmlRegisters=i386,arm,mips,arc;multiprocess+;fork-events+;vfork-events+#2e
 <   1> read packet: +

 <  64> read packet: $qXfer:features:read+;PacketSize=20000;qEcho+;native-signals+#5e
 <   1> send packet: +

There is a one-off "lldb sends +, remote stub replies +" expectation. I don't know what to think about lldb sending its first real packet as soon as it's sent the initial handshake, before it get a reply to that. Normally gdb remote serial protocol is "send one packet, wait for a response", you don't send a packet before the response except the control-c interrupt packet (or stdout/stderr text from the inferior while it is running). But the very beginning of the communication is definitely a special case and I don't know if this behavior is considered correct. It may be that all the stubs lldb has interoperated with are more accepting of "initial handshake AND first packet received at once", or it may be that melonDS is not handling this behavior and it should be. I don't know which is correct.