ykjit / yk

yk packages
https://ykjit.github.io/yk/
Other
31 stars 7 forks source link

Re-enabling PIE #923

Open vext01 opened 10 months ago

vext01 commented 10 months ago

Problem Description

PIE has been disabled by default in the JIT for some time.

The reason for this is if you enable PIE you will get an error like:

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in lto.tmp
>>> referenced by ld-temp.o
>>>               lto.tmp:(.llvm_bb_addr_map+0x2)

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC

In other words, the basic block map section doesn't link.

You would not get this with upstream LLVM if you enabled the basic block section. The reason we see it is that in ykllvm, we have added the SHF_ALLOC flag to avoid reading the ELF file at runtime: when you add the SHF_ALLOC flag the loader will load the section as the process image is bootstrapped.

But there's another effect of SHF_ALLOC that we haven't considered before: symbol values emitted with LLVM's MCStreamer::emitSymbolValue() will get relocated!

The error above is the loader saying (in it's own way) that it cannot relocate a symbol in the basic block section because to do so would require the section to be writable. It seems relocating with PIE requires the section to be patched at load time.

This can be verified by enabling PIE by default and applying the following patch to ykllvm:

diff --git a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
index 1185d770a7dc..74630ef8f24a 100644
--- a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
@@ -814,7 +814,7 @@ static MCSection *selectExplicitSectionGlobal(
   // The Yk JIT expects to load the IR from its address space. This tells the
   // loader to load the section.
   if (YkAllocLLVMBCSection && (SectionName == ".llvmbc"))
-    Flags |= llvm::ELF::SHF_ALLOC;
+    Flags |= llvm::ELF::SHF_ALLOC |llvm::ELF::SHF_WRITE;

   MCSectionELF *Section = Ctx.getELFSection(
       SectionName, getELFSectionType(SectionName, Kind), Flags, EntrySize,
diff --git a/llvm/lib/MC/MCObjectFileInfo.cpp b/llvm/lib/MC/MCObjectFileInfo.cpp
index e577abcfdee3..1642982b24f0 100644
--- a/llvm/lib/MC/MCObjectFileInfo.cpp
+++ b/llvm/lib/MC/MCObjectFileInfo.cpp
@@ -1147,7 +1147,7 @@ MCObjectFileInfo::getBBAddrMapSection(const MCSection &TextSec) const {
   }

   if (YkAllocLLVMBBAddrMapSection)
-    Flags |= ELF::SHF_ALLOC;
+    Flags |= ELF::SHF_ALLOC | ELF::SHF_WRITE;

   // Use the text section's begin symbol and unique ID to create a separate
   // .llvm_bb_addr_map section associated with every unique text section.

(Note that the same problem seems to exist for the embedded bitcode section)

This will mean that tests link once again, but something breaks in the PT decoder and the mapper which expect to be reading ELF file offsets out of the basic block sections, but which in fact is being handed virtual addresses due to SHF_ALLOC (Those parts of the system use conversion utilities in ykaddr to convert from/to offsets and virtual addresses, but if you SHF_ALLOC the section, everything is a virtual address, so the conversions become unsound).

It works now without PIE because without PIE a file offset for the main object and a virtual address for the main object are the same. But as soon as you enable PIE, the main object gets relocated.

How to fix?

The truth of the matter is we've added SHF_ALLOC to sections that were never designed to be loaded.

There are two ways to move forward:

The former is arguably more correct and easier to fix, but maybe a performance hit, because if you don't load the section, you have to open the ELF binary to find the section. But isn't that all the loader is doing anyway?

The latter diverges from upstream's intent of the sections and is more "uncharted territory".

Reading list

vext01 commented 10 months ago

Just to add: I believe the system is not quite right even with PIE off. By adding SHF_ALLOC to the special sections we are opting to have them relocated meaning that the symbol addresses inside become virtual addresses, but we assume they are file offsets. It works because without PIE the relocation step happens to be a NOP.

vext01 commented 10 months ago

Some more on the PIE saga.

After removing SHF_ALLOC from the bitcode and basic block map sections, and having that data loaded from disk instead, the mapper fails when PIE is enabled. This is because some symbols become undefined when:

To demonstrate, take a hello world C program in a.c:

No PIE, with stackmaps

$ clang -fno-PIE -nopie -flto -fuse-ld=lld -Wl,--mllvm=--yk-insert-stackmaps a.c && nm  a.out | grep main                                    
00000000002016a0 T main

With PIE, with stackmaps

$ clang -fPIE -pie -flto -fuse-ld=lld -Wl,--mllvm=--yk-insert-stackmaps a.c && nm a.out | grep main
                 U main

With PIE, no stackmaps

$ clang -fPIE -pie -flto -fuse-ld=lld a.c && nm  a.out | grep main
0000000000001700 T main

Having undefined symbols causes dladdr() (and probably dlsym()) to fail to find symbols, meaning that the mapper erroneously reports some blocks as unmappable.

After some fiddling, I've narrowed the issue to the .llvm_stackmaps section being SHF_ALLOC (this time this isn't our addition: it's marked SHF_ALLOC in upstream llvm).

An obvious next step to remove SHF_ALLOC from the stackmap section:

diff --git a/llvm/lib/MC/MCObjectFileInfo.cpp b/llvm/lib/MC/MCObjectFileInfo.cpp
index f9a67105820f..1155b62aea9c 100644
--- a/llvm/lib/MC/MCObjectFileInfo.cpp
+++ b/llvm/lib/MC/MCObjectFileInfo.cpp
@@ -522,7 +522,7 @@ void MCObjectFileInfo::initELFMCObjectFileInfo(const Triple &T, bool Large) {
       Ctx->getELFSection(".debug_tu_index", DebugSecType, 0);

   StackMapSection =
-      Ctx->getELFSection(".llvm_stackmaps", ELF::SHT_PROGBITS, ELF::SHF_ALLOC);
+      Ctx->getELFSection(".llvm_stackmaps", ELF::SHT_PROGBITS, 0);

   FaultMapSection =
       Ctx->getELFSection(".llvm_faultmaps", ELF::SHT_PROGBITS, ELF::SHF_ALLOC);

If you do this then all symbols are accounted for, and dladdr() can find them:

$ clang -fPIE -pie -flto -fuse-ld=lld -Wl,--mllvm=--yk-insert-stackmaps a.c && nm  a.out | grep main
0000000000001700 T main

But this causes issues further down the line. By marking the stackmap section as unloadable, when LLVM generates code at runtime and loads the resulting shared objects, they are devoid of stackmap sections. Of course they are: we asked for them not to be loaded...

We need those for deopt...

vext01 commented 10 months ago

Maybe we can only add SHF_ALLOC to the JITted modules. The thing is, I can't see any way to determine if we are JITting in MCObjectFileInfo::initELFMCObjectFileInfo().

vext01 commented 10 months ago

lhames in llvm discord:

My gut reaction is that this is a compiler bug -- I can't see why having SHF_ALLOC on the stackmap section should cause other symbols to be left out of the dynamic symbol table.

To move this forward we should try to reproduce this with upstream LLVM. Perhaps we can compile a .ll file containing stackmap calls to avoid requiring our custom stackmap pass.

vext01 commented 10 months ago

Raised a bug upstream for this: https://github.com/llvm/llvm-project/issues/75074