Add support for responding with library relocation offsets

daniel5151 / gdbstub

An ergonomic, featureful, and easy-to-integrate implementation of the GDB Remote Serial Protocol in Rust (with no-compromises #![no_std] support)

Add support for responding with library relocation offsets #20

Open mchesser opened 4 years ago

mchesser commented 4 years ago

For some targets, sections may be relocated from their base address. As a result, the stub may need to tell GDB the final section addresses to ensure that debug symbols are resolved correctly after relocation.

Depending on the target this can be done using several mechanisms:

For targets where library offsets are maintained externally (e.g. Windows) this can be done by responding to qXfer:libraries:read.
For System-V architectures, GDB is capable of extracting library offsets from memory if it knows the base address of the dynamic linker. The base address can be specified either by implementing the qOffsets command or by including an AT_BASE entry in the response to the more modern qXfer:auxv:read command. Alternatively, a target can implement qXfer:libraries-svr4:read, however this may involve digging into the internals of the dynamic linker.

Currently, only the qOffsets command has been implemented (see: #30).
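
For the AT_BASE route, the payload returned for qXfer:auxv:read is just the target's raw auxiliary vector: a sequence of (type, value) pairs terminated by an AT_NULL entry. Below is a minimal sketch of building such a blob, assuming a 64-bit little-endian target. The function name `build_auxv` is hypothetical and not part of gdbstub; the tag values are the ones found in Linux's `<elf.h>`.

```rust
/// Auxiliary-vector tags (values as defined in Linux's <elf.h>).
const AT_NULL: u64 = 0; // terminates the vector
const AT_BASE: u64 = 7; // base address of the dynamic linker

/// Hypothetical helper: serialize (tag, value) pairs as a 64-bit
/// little-endian auxv blob, followed by the AT_NULL terminator.
/// Word size and endianness are assumptions; a real stub must match its target.
fn build_auxv(entries: &[(u64, u64)]) -> Vec<u8> {
    let mut out = Vec::with_capacity((entries.len() + 1) * 16);
    for &(tag, value) in entries {
        out.extend_from_slice(&tag.to_le_bytes());
        out.extend_from_slice(&value.to_le_bytes());
    }
    // Terminating entry: AT_NULL with a zero value.
    out.extend_from_slice(&AT_NULL.to_le_bytes());
    out.extend_from_slice(&0u64.to_le_bytes());
    out
}
```

A stub that loads ld.so at, say, 0x7f00_0000_0000 would pass `&[(AT_BASE, 0x7f00_0000_0000)]` and serve the resulting bytes in response to the auxv read.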

Original issue:

For targets that relocate an image before execution, the debug symbols in GDB will have the incorrect offsets. The qOffsets query allows GDB to query the stub for the text segment offset after relocation.

I implemented the functionality I needed here: https://github.com/mchesser/gdbstub/commit/db347e0d9e1056549aaa7517938c608f27a32305 however there appear to be two different methods of reporting the offset (I think one is relative, and one is absolute), and I'm not entirely sure if I missed anything in the implementation.

daniel5151 commented 4 years ago

Hmm, googling qOffsets brought up an issue on the rust-gdb-remote-protocol project which had some interesting discussion: https://github.com/luser/rust-gdb-remote-protocol/issues/11

It looks like the cleaner (and more modern?) way to address the underlying issue would be to implement qXfer:libraries:read (and/or qXfer:memory-map:read). Moreover, unless I misread something in the linked discussion, it looks like gdbserver doesn't even implement qOffsets!

If it's not too much trouble, would you be willing to experiment with implementing one/both of these packets, and seeing if they work as intended?

Working with XML might be a bit clunky, but it shouldn't be too tricky to get things working. Oh, and you may need to add some logic to the qSupported response (i.e. sending ;qXfer:libraries:read+ and/or ;qXfer:memory-map:read+).
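
To make the qSupported part concrete, here is a rough sketch of how a stub could assemble that reply. The helper `qsupported_reply` is a hypothetical name and is not gdbstub's actual implementation; it only shows the wire format, where each supported feature is appended as `;name+` and PacketSize is written in hex.

```rust
/// Hypothetical helper: build the body of a qSupported reply.
/// `packet_size` is the largest packet the stub accepts (sent as hex);
/// each entry in `features` is advertised as supported with a trailing '+'.
fn qsupported_reply(packet_size: usize, features: &[&str]) -> String {
    let mut reply = format!("PacketSize={:x}", packet_size);
    for feature in features {
        reply.push(';');
        reply.push_str(feature);
        reply.push('+');
    }
    reply
}
```

Calling it with `0x1000` and `["qXfer:libraries:read", "qXfer:memory-map:read"]` yields `PacketSize=1000;qXfer:libraries:read+;qXfer:memory-map:read+`.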

P.S: What sort of project are you using gdbstub with? Just curious 😄

mchesser commented 4 years ago

qXfer:libraries:read should work fine for my use, I'll have a go at implementing it.

I know the gdb-stub in qemu does respond to qOffsets https://github.com/qemu/qemu/blob/25f6dc28a3a8dd231c2c092a0e65bd796353c769/gdbstub.c#L2076-L2089 -- However, looking through the git history, it looks like that code was originally from about 14 years ago! So I'm happy to move to the more modern way of doing things.

(My project is an emulator that exposes a simple (also emulated) Linux userland -- I previously had my own gdb stub implementation (the protocol is a bit of a mess isn't it?), but it was much more limited than your gdbstub so I'm trying to switch over).

daniel5151 commented 4 years ago

Awesome, excited to see what you come up with!

And yeah, the protocol's a real mess, eh? ~Heck, it even requires all packets to be 7-bit ASCII~, which really goes to show you just how long it's been around for!

EDIT: ignore that 7-bit ASCII comment. Newer versions of the protocol assume an 8-bit clean connection. This means that logging data sent to/from the guest requires escaping characters, which can be a bit annoying...
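
As a small illustration of the logging point, here is one way to render raw packet bytes safely in a log line. This is only a sketch for log output (it is not the protocol's own on-the-wire escaping), and `escape_for_log` is a made-up helper name; it relies on `std::ascii::escape_default` from the standard library.

```rust
use std::ascii;

/// Hypothetical helper: turn raw packet bytes into a printable string.
/// Printable ASCII passes through unchanged; everything else becomes
/// an escape such as `\x00` or `\n`.
fn escape_for_log(raw: &[u8]) -> String {
    raw.iter()
        .flat_map(|&b| ascii::escape_default(b))
        .map(char::from)
        .collect()
}
```

For example, a binary write packet containing the bytes 0x00 and 0xff would be logged with those two bytes shown as `\x00` and `\xff`, while the surrounding `$`, `#`, and hex digits stay readable.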

Sounds like a cool project, thanks for sharing! And by the way, if you happened to implement a custom arch trait for your project, I'm always eager to merge a PR to upstream it 😉

daniel5151 commented 4 years ago

I snuck a peek at your fork and saw that you had a rough implementation of qOffsets working on the latest master. If you want to open a PR for it, I'd be more than happy to review/merge it!

While the qXfer:libraries:read approach certainly seems more "modern", there's no reason why gdbstub couldn't include qOffsets support as well. If/when qXfer:libraries:read is implemented, I could just add a doc comment and/or put a #[deprecated] attribute on the qOffsets API to direct people to consider using the new qXfer:libraries:read-backed API instead.

mchesser commented 4 years ago

Yeah, I think supporting regular qOffsets might make sense anyway. qXfer:libraries:read requires quite a bit more work on behalf of the target author to support.

For dynamically linked glibc binaries, generally what I do is load the dynamic linker ld.so at some offset (i.e. the offset I provide in response to the qOffsets command), and pass control over to ld.so so that it can load the program and its dependencies and handle any relocation fixups.

Inside the address space of ld.so there is a link_map object that stores the metadata for all the libraries that ld.so loaded. If the stub just responds to qOffsets, (I believe) gdb takes care of extracting the link_map object from memory by issuing a sequence of read_mem commands.

For the stub to properly support qXfer:libraries:read it would need to be aware of the internals of ld.so. You can see gdbserver extracting the information here: https://github.com/bminor/binutils-gdb/blob/febd44f94d944c9058b387a784124dc8e0de58ee/gdbserver/linux-low.cc#L6739

Still, it would be nice to properly support this (allowing targets that use a different dynamic linker to work) – but it is not something I am likely to get to any time soon.

For qOffsets support, there appear to be two possible responses. The first uses a relative offset (i.e. the Text=xxx variant) and requires both the text and data offsets to be specified. The second uses an absolute offset (i.e. TextSeg=xxx) and the data offset is optional. I chose to use the second variant because my data is always relocated by the same amount as text (so I can avoid needing to specify the data segment).

Should we support both responses here?

daniel5151 commented 4 years ago

Oh, wow. I didn't realize just how much work a "proper" qXfer:libraries:read implementation would entail 😳

I totally understand why you might not want to spend the time to implement it 😄

> For the stub to properly support qXfer:libraries:read it would need to be aware of the internals of ld.so.

I'm not entirely sure that's the case. Reading the docs, I came across the following tidbit of info:

> On some platforms, a dynamic loader (e.g. ld.so) runs in the same process as your application to manage libraries. In this case, GDB can use the loader’s symbol table and normal memory operations to maintain a list of shared libraries. On other platforms, the operating system manages loaded libraries. GDB can not retrieve the list of currently loaded libraries through memory operations, so it uses the ‘qXfer:libraries:read’ packet (see qXfer library list read) instead. The remote stub queries the target’s operating system and reports which libraries are loaded. - https://sourceware.org/gdb/current/onlinedocs/gdb/Library-List-Format.html#Library-List-Format

Also, the code you linked is for the slightly different qXfer:libraries-svr4:read packet, which as far as I can tell, is purely an optimization packet to save GDB the trouble of manually reading ld.so's memory.

I could be totally wrong here (after all, I've never actually worked with this particular feature myself), but I just thought I'd mention this in case it ends up being useful for whoever ends up implementing qXfer:libraries:read support in the future.
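
For whoever picks this up, here is a rough idea of what the qXfer:libraries:read payload looks like, based on the Library List Format page linked above: a `<library-list>` element containing one `<library>` per loaded library, each with a `<segment address="..."/>` child. The struct and function names below are hypothetical and not an existing gdbstub API, and the sketch does no XML escaping of library names.

```rust
/// Hypothetical description of one loaded library.
struct LoadedLibrary {
    name: String,
    /// Address at which the library's first segment was loaded.
    segment_address: u64,
}

/// Hypothetical helper: render the library list XML document.
/// Note: names are inserted verbatim, so names containing characters
/// such as '"' or '&' would need escaping in a real implementation.
fn library_list_xml(libs: &[LoadedLibrary]) -> String {
    let mut xml = String::from("<library-list>");
    for lib in libs {
        xml.push_str(&format!(
            "<library name=\"{}\"><segment address=\"{:#x}\"/></library>",
            lib.name, lib.segment_address
        ));
    }
    xml.push_str("</library-list>");
    xml
}
```

The stub would then hand out slices of this string according to the offset and length in each qXfer request.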

As for qOffsets, yep, let's support both kinds of replies. A simple enum with two struct-like variants ought to work fine.

It looks like both replies follow a similar structure, so the handler should be pretty straightforward as well.
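
Something along these lines, perhaps. This is only a sketch of the idea, not the API that ended up in gdbstub: the type and function names are placeholders, and the reply strings follow the two forms in the GDB docs, `Text=xxx;Data=yyy[;Bss=zzz]` and `TextSeg=xxx[;DataSeg=yyy]`, with all values in hex.

```rust
/// Placeholder type: the two kinds of qOffsets replies.
enum Offsets {
    /// Relative section offsets; text and data are both required.
    Sections { text: u64, data: u64, bss: Option<u64> },
    /// Absolute segment addresses; the data segment is optional.
    Segments { text_seg: u64, data_seg: Option<u64> },
}

/// Placeholder handler: format the reply body for a qOffsets query.
fn qoffsets_reply(offsets: &Offsets) -> String {
    match *offsets {
        Offsets::Sections { text, data, bss } => {
            let mut reply = format!("Text={:x};Data={:x}", text, data);
            if let Some(bss) = bss {
                reply.push_str(&format!(";Bss={:x}", bss));
            }
            reply
        }
        Offsets::Segments { text_seg, data_seg } => {
            let mut reply = format!("TextSeg={:x}", text_seg);
            if let Some(data_seg) = data_seg {
                reply.push_str(&format!(";DataSeg={:x}", data_seg));
            }
            reply
        }
    }
}
```

With this shape, a target whose data always moves together with text can return `Offsets::Segments { text_seg, data_seg: None }` and never think about the data segment.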

mchesser commented 4 years ago

I think qXfer:libraries:read and qXfer:libraries-svr4:read are mutually exclusive – the idea is you implement qXfer:libraries-svr4:read for System-V targets, and you implement qXfer:libraries:read for non-System-V targets. (e.g. see here: https://github.com/bminor/binutils-gdb/blob/dac736f6a1c1dbd7f8a30fafac52081886a90122/gdbserver/server.cc#L2401-L2409)

What I think the docs are trying to say is that if GDB can extract the offsets from the dynamic loader then you do not need to implement either of the two commands (except for optimization purposes).

So, what qOffsets provides is a way to just specify the initial loader offset without going through the process of implementing qXfer:libraries:read or qXfer:libraries-svr4:read, since we rely on GDB to extract the library locations.

daniel5151 commented 4 years ago

Hmm, I think that sounds about right? As someone who's mainly been using gdbstub to debug bare-metal emulated code, I can't say I'm too familiar with this particular part of the protocol 😅 Let's shelve this discussion for now, and re-open it if/when someone decides to implement qXfer:libraries:read support.

Oh, and let me know if you hit any roadbumps while working on the qOffsets PR.

I know there's been quite a bit of "churn" on the master branch lately -- sorry about that! I've been trying to settle on a solid API that can carry the project through to 1.0.0 without requiring too many more breaking changes, so release 0.4.0 is looking to be a pretty big change from 0.2.0/0.3.0. I'm pretty sure this current approach should scale well moving forwards, but hey, only time will tell. Maybe I'll stumble upon some obscure-but-useful bit of the protocol that doesn't play nice with the current architecture, who knows ¯\_(ツ)_/¯