Open ayrtonm opened 9 months ago
Running the binaries built from PR #323 showed that libia2.a tries to call pkey_alloc during initialization so it seems I'm missing a few things in the list for porting libia2 code to ARM.
main to ARM properly might make the most sense.
Note: PROT_MTE is only supported on MAP_ANONYMOUS and RAM-based file mappings (tmpfs, memfd). Passing it to other types of mapping will result in -EINVAL returned by these system calls.
https://docs.kernel.org/arch/arm64/memory-tagging-extension.html
I tried checking that accessing static data w/o a tagged ptr fails after enabling MTE with prctl, but I was unable to get a segfault. I also tweaked the prototype @sim-immunant made. It seems that mmap'ing with PROT_MTE works, and mprotecting memory previously mmap'ed w/o PROT_MTE (passing the flag on the second syscall) also works, but I don't think mprotecting the loaded ELF segments works. mprotect actually returns 0 instead of -EINVAL, but that's probably because Linux is ignoring errors similar to this.
It might be possible to protect static data using some memfd tricks (or I guess mmaping over the ELF with MAP_FIXED...), but I'll have to think about what the simplest option is for getting a test that triggers a fault running.
TODO: add estimates. We should also consider
Since we're already rebuilding glibc, it might be easier to modify this tunable instead of using PartitionAlloc again.
Can we put the tag in the top byte of x18 to stay compatible with shadow call stack?
I implemented some x18 switching in #333 and decided to just keep things simple for now and store the tag in bits 56-59 instead of the upper 4. We can revisit this after getting the LLVM fork
So enabling MTE and testing a lot of functionality is blocked on the LLVM fork since we expect all accesses to fail w/o it. However, with permissive mode we should be able to see where programs fault w/o instrumenting memory accesses. sp-relative accesses won't fault, the heap isn't protected and file-backed mappings (i.e. .data) won't be protected, so I'm not sure what accesses will fault first other than the one in the PR enabling MTE and TLS probably. Regardless, this means permissive mode is a bit more important than I realized since it'll let us make better use of the test suite (even if we need to change test_fault_handler.h a bit).
Running the minimal test in QEMU w/ MTE violations patched out pointed out that the main wrapper asm doesn't actually tag registers before memory accesses. This also applies to the exit wrapper and ALLOCATE_STACK.
Since we're using fixed tags in this case we can use an addg (note that MTE must be enabled). In other cases where we actually do need to use the x18 tag (i.e. the LLVM instrumentation) we can use
extr R, R, x18, #56
extr R, R, R, #8
where R is the register used for the memory access and x18 has the tag in the lower 4 bits of the highest byte.
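To sanity-check what the two extr instructions do, here is a minimal C model of the sequence. It's only a sketch: extr() models the architectural behavior of the instruction for 0 < lsb < 64, and tag_with_x18() is a hypothetical helper name that doesn't exist in the repo.

```c
#include <assert.h>
#include <stdint.h>

/* Models AArch64 `extr Xd, Xn, Xm, #lsb` for 0 < lsb < 64:
 * the result is 64 bits taken from the 128-bit value Xn:Xm, starting at bit lsb. */
static uint64_t extr(uint64_t xn, uint64_t xm, unsigned lsb) {
    return (xn << (64 - lsb)) | (xm >> lsb);
}

/* Hypothetical helper (name is an assumption): applies the two-instruction
 * sequence to a pointer value held in register R. */
static uint64_t tag_with_x18(uint64_t r, uint64_t x18) {
    /* extr R, R, x18, #56: low 56 bits of R move up, top byte of x18 lands in the low byte */
    r = extr(r, x18, 56);
    /* extr R, R, R, #8: rotate right by 8 so the x18 byte becomes the top byte */
    r = extr(r, r, 8);
    return r;
}
```

With x18 = 0x3 << 56 and R = 0x0000ffffdeadbee0, the model gives 0x0300ffffdeadbee0. The low 56 bits of R are preserved and the whole top byte is replaced by the top byte of x18, so this relies on bits 60-63 of x18 being zero.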
Here I implemented tagging for str, ldr, stp and ldp which covers the most common instructions that access memory.
To see the remaining instructions run that repo's build.sh and grep for "MayStore|MayLoad" in build/lib/Target/AArch64/AArch64GenInstrInfo.inc. Each instruction has "def" operands followed by "use" operands and for each instruction we need to find the "use" operand that corresponds to the address being used. We should not need to tag "def" operands representing addresses because those registers should be tagged before their use. To find the operand number look for the GPR64sp (64-bit general-purpose register or stack pointer) operand in the instruction's entry in OpcodeOperandTypes in that .inc file. If it has two GPR64sp's it's probably a post/pre-index increment so we tag the latter. If it has more look at the .td files that generate that code (there are no generic patterns here).
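The operand-selection rule above can be written down as a small C helper. This is a sketch only: address_operand_index() is a hypothetical function, not part of LLVM, and it assumes the operand type names for one instruction were copied by hand from OpcodeOperandTypes as plain strings (defs first, then uses).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an LLVM API: returns the index of the operand to tag.
 * One GPR64sp operand: that operand is the address.
 * Two GPR64sp operands: assumed to be a pre/post-index form, so take the latter.
 * Zero or more than two: returns -1, meaning the .td files need a manual look. */
static int address_operand_index(const char *const *types, size_t n) {
    int count = 0;
    int last = -1;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(types[i], "GPR64sp") == 0) {
            count++;
            last = (int)i;
        }
    }
    if (count == 1 || count == 2) {
        return last;
    }
    return -1;
}
```

For example, an operand list like {"GPR64", "GPR64sp", "uimm12"} gives index 1, and a writeback form like {"GPR64sp", "GPR64", "GPR64sp", "simm9"} gives index 2. The type names other than GPR64sp are illustrative, not copied from the .inc file.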
Also we need to
I tried building glibc with clang and shadow call stack but I don't think it's worth investing time into. scs requires -fno-exceptions which a lot of glibc explicitly requires. Even though gcc does support -fsanitize=shadow-call-stack it'll be an uphill battle. Since CFI isn't needed for MTE tagging to function (only to make sure it's secure), I think we should keep using the glibc prebuilt w/o CFI on QEMU and try to test on android w/ bionic as soon as possible to see what problems crop up when scs is enabled. AFAICT the way we use x18 should be compatible with scs but it's hard to be sure.
We also need to recompile libgcc where e.g. __do_global_dtors_aux accesses statics (e.g. completed).
To add support for ARM64 we can use MTE and the x18 register in place of the PKRU on x64. Here's what we need to do
Testing on QEMU
add_test
tests are mostly expected to fail w/o LLVM fork so there's not much benefit to being able to run them all at once
Port libia2 to ARM
Rewriter and call gate generation
b target_function
(#323)
Modify LLVM to instrument memory accesses
We also need to modify LLVM to combine the tag in x18 with pointers before they're dereferenced. LLVM has two pass managers (legacy, new) and I'm not sure how well supported modifying the codegen pipeline with the new pass manager is, so it may be best to just make in-tree changes to LLVM.
Also LLVM has some support for MTE-based hardening, though tagging stack variables is the main thing that seems to be implemented and it's not what we need. Looking through AArch64StackTagging.cpp may be a good place to start though for seeing how to insert the instrumentation we'll need.
It seems both FastISel and GlobalISel can emit loads and stores so we'll probably need to modify both. I think masking the address is conceptually similar to zero/sign-extending registers so we can grep for ZExt for an idea of how to fix up addresses.
For testing we can start by looking at the resulting assembly for.
For building llvm run the cmake step with -DLLVM_TARGETS_TO_BUILD=AArch64 and I think we just need to build the llc target, though the opt target may also come in handy if we need to implement our changes as an optimization pass (grep INITIALIZE_PASS_BEGIN for examples). As for the fork, our llvm-project repo is out of date since it was forked from UC Irvine's repo for mcad so make sure to build from upstream's tip of tree otherwise clang's llvm IR may not be compatible with the llc fork.
Open questions