Open atrosinenko opened 10 months ago
I have no clear solution for this right now, below are some observations.
As far as I understand the DWARF 5 specification, section 1.3.7, it should be preferable to describe signed pointer values explicitly:
> 1.3.7 Explicit Rather Than Implicit Description
>
> DWARF describes the source to object translation explicitly rather than using
> common practice or convention as an implicit understanding between producer
> and consumer. For example, where other debugging formats assume that a
> debugger knows how to virtually unwind the stack, moving from one stack
> frame to the next using implicit knowledge about the architecture or operating
> system, DWARF makes this explicit in the Call Frame Information description.
According to the aadwarf64 document, there are several AArch64 platform-specific DWARF extensions to support PAuth. While the `DW_CFA_AARCH64_negate_ra_state` call frame instruction is already used by LLVM, I did not find any mention of `DW_SUB_OP_AARCH64_sign` in current LLVM (and anyway, I seem to have the reverse problem: explaining that the value fetched from the target process is already signed and should be XPAC-ed before use in some contexts).
To the extent I currently understand DWARF, it looks workable to introduce something like a `DW_SUB_OP_AARCH64_xpac` operation and make LLVM generate debug information like "ptr := (xpaci (reg_value X1))". On the other hand, in the original example, LLDB prints `ptr` as `0x003caaaaaaab04f0 (actual=0x0000aaaaaaab04f0 ...)`. It definitely seems useful to see the value both as a signed pointer and as a plain VMA. Thus, maybe this should be encoded not as an expression ("how to obtain the value") but as a type ("what we have obtained").
My initial thought was to look at what is generated for TBI on AArch64, as it is another case of placing something loosely related into the higher bits of an address value. This can be achieved by compiling an example program with HWASan enabled. Unfortunately, I did not find anything interesting in the DWARF description generated for code that uses TBI. That seems reasonable, because TBI by definition makes all 256 pointer values (differing only in the 8-bit tag) valid w.r.t. address translation.
I tried patching the existing debug information by adding a `DW_AT_bit_size (48)` attribute to the base `DW_TAG_pointer_type` abbreviation, but I have not yet managed to affect the debugger in any observable way.
Just in case: it is possible to adjust the debug information manually after it has been produced by the code generator. The `*.s` file produced by clang contains lots of `.byte` directives (with meaningful DWARF names in comments). It may be easier to change an existing abbreviation (or a copy of it), as adding/removing anything in the `.debug_info` section may require adjusting byte offsets.
On the other hand, maybe trying to convey the fact that the pointer is signed via DWARF sections is overengineering. If so, should lldb-server on AArch64 just unconditionally clear the top-most 16 bits before dereferencing a pointer (by data load/store or control flow) in JITted code, or should such behavior be enabled somehow for the particular target process as a whole?
Tagging @kbeyls and @smithp35
Also tagging @DavidSpickett
> If so, should lldb-server on AArch64 just unconditionally clear top-most 16 bits before dereferencing the pointer (by data load/store or control flow)
Well, this might have some interesting consequences. For example, if for some reason the signature is wrong, we will be unable to debug such an issue: the debugger will just strip the signature and everything will work, while at normal runtime we would end up dereferencing an invalid pointer.
I think we should be explicit that things are signed, and IMO it is a property of the type (after all, `void*` and `__ptrauth void*` are different types from the compiler's perspective, and conversion between them is not a no-op). We should also be able to describe the particular signing scheme used for a particular pointer (the discriminator used, the key, whether address discrimination is used, etc.).
Also, things should be organized on a fine-grained basis; e.g. we can easily have a mixture of signed and unsigned pointers in some stub code.
An orthogonal question is expression JIT: there we need to generate properly signed pointers, as far as I can see, since they could escape.
> Well, this might have some interesting consequences. For example, if for some reason the signature will be wrong, then we will be unable to debug such issue. Debugger will just strip the signature and everything will work, while in the normal runtime we'll end dereferencing invalid pointer.
This is what happens already (mostly in lldb, though a little bit in lldb-server); it was deemed a decent compromise until we had other signals to go on to tell whether a pointer was signed.
See https://www.linaro.org/blog/lldb-15-and-the-mystery-of-the-non-address-bits/ "Corrupted Pointer or Non-Address Bits?"
If we know that we're in this ABI and that there is this annotation, we can use that to be more strict.
There is also the issue of whether you want to be able to give signed pointers to commands like `memory read`. It may be useful to pass it a pointer that just faulted, to see what it would have read had it been valid.
Another command example: do you expect `memory region` to remove non-address bits for you, or to require the user to do so if it is a signed pointer? I'd say the user shouldn't have to wait for the program to authenticate it just to find out what memory region it points to. So it's a command-by-command thing, I think, and needs some real usage to decide what's best.
My vague thought here is that you would make the AArch64 ABI plugin aware of whether the PAuth ABI is being used, and it would change its `Fix..Address` methods accordingly. Special cases like the exception printer will need some way to always clear the signature bits for the `actual:...` part. If there is a program-file-level attribute for the PAuth ABI, this could be passed to the plugin to achieve this.
So:
A couple of thoughts on how the signing schema might be encoded in DWARF.
The most straightforward way that comes to mind is adding a new attribute, say `DW_AT_signing_schema`, which would store a combination of the key, the discriminator, and the address-diversity flag inside an integer. If a `DW_TAG_*` entity is meant to be signed, it should have `DW_AT_signing_schema` set correspondingly. For example, for a signed vtable pointer of a polymorphic class, the `DW_TAG_member` with `DW_AT_name = ("_vptr$classname")` would have `DW_AT_signing_schema` set.
The place where we obtain the context for a user expression is `lldb_private::plugin::dwarf::SymbolFileDWARF::ParseDeclsForContext` (see the full stack under the spoiler below). We can, for example, change `DWARFASTParser::GetTypeForDIE` so that it takes the signing scheme into account when obtaining the type. As a result, the JIT compiler would see a type with explicit ptrauth attributes.
As far as I can see now, such a `DW_AT_signing_schema` attribute should be enough. Please let me know whether this looks reasonable, and whether there are cases where we might need DWARF expressions with operations like the existing `DW_SUB_OP_AARCH64_sign`, but for stripping/authenticating.
Update. It turns out that we already have `DW_TAG_LLVM_ptrauth_type` with the following attributes:
- `DW_AT_LLVM_ptrauth_key`
- `DW_AT_LLVM_ptrauth_address_discriminated`
- `DW_AT_LLVM_ptrauth_extra_discriminator`
- `DW_AT_LLVM_ptrauth_isa_pointer`
- `DW_AT_LLVM_ptrauth_authenticates_null_values`

It could be used instead of the `DW_AT_signing_schema` attribute proposed above. The idea remains the same: attach a signing schema to the `DW_TAG_*` entities meant to be signed. We even have some related code already - see https://github.com/access-softek/llvm-project/blob/elf-pauth/lldb/source/Plugins/SymbolFile/DWARF/DWARFDIE.cpp#L310.
Make it possible to transparently use signed pointers fetched from the target process in LLDB expressions. Presently, LLDB does not take into account that pointers may be signed when dereferencing them (resulting in SIGSEGV or some other sort of "invalid pointer" error).
Let's consider the following code:
indirect-call-for-lldb.c:
Compile it using our toolchain (commit 62ce88f0f2a77665529947f20d276d640c37f76f) with the following command:
Inside QEMU, a segmentation fault is observed when trying to perform the indirect call.
_Note: I use the `-cpu max,pauth=on,pauth-impdef=on` QEMU options to get reasonable simulation performance, so PACs usually don't look very random (`0x003c` in the example below)._