llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[lld elf] Segfault within ld-linux in -nostdlib executable #107749

Open yaram opened 2 months ago

yaram commented 2 months ago

When linking an executable with -nostdlib, no PIE, and no linked dynamic libraries, the executable will crash with a segmentation fault within audit_list_add_dynamic_tag in the ld-linux interpreter. Switching to PIE or linking any dynamic library fixes the crash.

Steps to reproduce:

1. Create some simple `main.c`:

```c
void _start() {
    while(1);
}
```

2. Run the following commands to generate a minimal object and link it into an executable:

```sh
clang -nostdlib -nostdinc -c -o main.o main.c
clang -fuse-ld=lld -nostdlib -o test main.o
```

3. Run the generated executable, producing the segmentation fault:

```sh
./test
```

I suspect ld-linux is trying to access dynamic segments that are missing from the executable. Perhaps lld or the clang frontend is adding the .interp section when it should not be.

DimitryAndric commented 2 months ago

Which version of clang and lld are you using? It works fine for me here, on Ubuntu 24.04.1 LTS:

$ clang-18 -nostdlib -nostdinc -c -o main.o main.c
$ clang-18 -fuse-ld=lld -nostdlib -o test main.o -v
$ ./test
<no crash, just a hang, as expected>

As far as I can see, the interp section is fine:

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x000230 0x000230 R   0x8
  INTERP         0x000270 0x0000000000000270 0x0000000000000270 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x000334 0x000334 R   0x1000
  LOAD           0x000340 0x0000000000001340 0x0000000000001340 0x00000e 0x00000e R E 0x1000
  LOAD           0x000350 0x0000000000002350 0x0000000000002350 0x000080 0x000cb0 RW  0x1000
  DYNAMIC        0x000350 0x0000000000002350 0x0000000000002350 0x000080 0x000080 RW  0x8
  GNU_RELRO      0x000350 0x0000000000002350 0x0000000000002350 0x000080 0x000cb0 R   0x1
  GNU_EH_FRAME   0x0002e0 0x00000000000002e0 0x00000000000002e0 0x000014 0x000014 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
  NOTE           0x00028c 0x000000000000028c 0x000000000000028c 0x000018 0x000018 R   0x4
llvmbot commented 2 months ago

@llvm/issue-subscribers-lld-elf

Author: Rebecca Reitsma (yaram)

yaram commented 2 months ago

I'm using clang and lld version 18.6.1, on Fedora 40.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000200040 0x0000000000200040
                 0x0000000000000188 0x0000000000000188  R      0x8
  INTERP         0x00000000000001c8 0x00000000002001c8 0x00000000002001c8
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000200000 0x0000000000200000
                 0x000000000000024c 0x000000000000024c  R      0x1000
  LOAD           0x0000000000000250 0x0000000000201250 0x0000000000201250
                 0x000000000000000e 0x000000000000000e  R E    0x1000
  GNU_EH_FRAME   0x00000000000001fc 0x00000000002001fc 0x00000000002001fc
                 0x0000000000000014 0x0000000000000014  R      0x4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x0
  NOTE           0x00000000000001e4 0x00000000002001e4 0x00000000002001e4
                 0x0000000000000018 0x0000000000000018  R      0x4

These are the program headers in the ELF on my system. Interestingly, mine doesn't have DYNAMIC. Maybe this is due to some distro-specific build configuration of clang and lld.
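One way to check an executable for the combination seen in this listing (INTERP present, DYNAMIC absent) is to scan the program header output. The sketch below uses a hypothetical helper, `interp_without_dynamic`, which is not part of lld, binutils, or glibc. It assumes the listing comes from `readelf -l` or `llvm-readelf -l`, where each header line starts with the type name. The sample input is abbreviated from the Fedora listing.

```sh
# Hypothetical helper: reads a program header listing on stdin and succeeds
# (exit status 0) only if it shows INTERP but no DYNAMIC.
interp_without_dynamic() {
    awk '$1 == "INTERP"  { interp = 1 }
         $1 == "DYNAMIC" { dynamic = 1 }
         END { exit !(interp && !dynamic) }'
}

# Header lines abbreviated from the Fedora listing in this thread.
if interp_without_dynamic <<'EOF'
  PHDR           0x0000000000000040 0x0000000000200040 0x0000000000200040
  INTERP         0x00000000000001c8 0x00000000002001c8 0x00000000002001c8
  LOAD           0x0000000000000000 0x0000000000200000 0x0000000000200000
EOF
then
    echo "PT_INTERP without PT_DYNAMIC"
fi

# Against a real file (assumes readelf is installed):
#   readelf -l ./test | interp_without_dynamic && echo "suspicious"
```

Reading from stdin keeps the helper independent of which tool produced the listing, so the same check could be applied to the Ubuntu output earlier in the thread, where it would not fire because DYNAMIC is present.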

Here's a stack trace of the crash from GDB btw

#0  audit_list_add_dynamic_tag (list=0x7fffffffd640, main_map=0x7ffff7ffe2e0, tag=1879047932) at rtld.c:225
#1  dl_main (phdr=<optimized out>, phnum=<optimized out>, user_entry=<optimized out>, auxv=<optimized out>) at rtld.c:1783
#2  0x00007ffff7fe49f6 in _dl_sysdep_start (start_argptr=start_argptr@entry=0x7fffffffd950, dl_main=dl_main@entry=0x7ffff7fe6660 <dl_main>) at ../sysdeps/unix/sysv/linux/dl-sysdep.c:141
#3  0x00007ffff7fe635e in _dl_start_final (arg=0x7fffffffd950) at rtld.c:494
#4  _dl_start (arg=0x7fffffffd950) at rtld.c:581
#5  0x00007ffff7fe5048 in _start () from /lib64/ld-linux-x86-64.so.2
#6  0x0000000000000001 in ?? ()
#7  0x00007fffffffdd81 in ?? ()
#8  0x0000000000000000 in ?? ()
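
The `tag` argument in frame #0 is printed in decimal, which hides what it is. Converting it to hex gives 0x6ffffefc, the value glibc's `elf.h` defines as `DT_AUDIT`. That would be consistent with the loader looking up an audit entry in the executable's own dynamic section rather than reacting to `LD_AUDIT`, though that reading is an inference from the trace and not something confirmed in this thread.

```sh
# Value of `tag` in frame #0 of the backtrace, as GDB printed it.
tag=1879047932

# Convert to hex for comparison with the DT_* constants in <elf.h>.
printf '0x%x\n' "$tag"   # prints 0x6ffffefc
```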

When I run the clang link step with --verbose, it shows it's using the following ld.lld options

"/usr/bin/ld.lld" --hash-style=gnu --build-id --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o test -L/usr/bin/../lib/gcc/x86_64-redhat-linux/14 -L/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/lib -L/usr/lib main.o

Would be interesting to see how it differs on others' systems

smithp35 commented 1 month ago

It hangs on my Ubuntu 22.04 machine. I think auditing is not enabled by default (see LD_AUDIT in https://man7.org/linux/man-pages/man8/ld.so.8.html). Could it be enabled on your distro?

I suspect that adding --static to the clang driver will resolve this too.

Can you reproduce this problem when using clang and -fuse-ld=bfd? That should rule out whether the clang driver is passing lld the wrong arguments.

If it works with ld.bfd then it will be interesting to see the llvm-readobj --dynamic --sections --segments output for both lld and bfd, to see whether lld is omitting something that bfd emits.

yaram commented 1 month ago

I tried finding my way around the Fedora glibc source package. There are a lot of audit-related things, but I couldn't make sense of them.

Adding --static to clang does in fact fix the crash.

Using -fuse-ld=bfd also resolves the crash, so that rules out clang passing lld the wrong arguments.

Here's the llvm-readobj --dynamic --sections --segments output for lld, then bfd: readelf-lld.log, readelf-bfd.log

I will have a look at the differences and note them in another comment.

yaram commented 1 month ago

So, LLD adds the .interp section and the PT_PHDR and PT_INTERP program headers, and BFD adds another PT_LOAD program header. Those seem to be the only pieces that are present in one and not the other. They are in a different order and at different offsets, but that is to be expected.

smithp35 commented 1 month ago

It is likely that it is just the .interp section. With GNU ld, without that section the dynamic loader won't be called (the equivalent of --static). The clang driver is passing --dynamic-linker=... to lld, but perhaps LLD should ignore that if it isn't needed.

In any case, I'd recommend --static as a workaround; it is the closest equivalent to what GNU ld does.

yaram commented 1 month ago

I ended up using -pie as a workaround in my use case. It adds the .dynsym, .gnu.hash, .dynstr, .dynamic, and .relro_padding sections and the PT_DYNAMIC and PT_GNU_RELRO program headers, which fixes the crash.
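
For reference, the two workarounds mentioned in this thread differ from the original link command only by one flag. The sketch below uses a hypothetical helper, `link_cmd`, that prints the command for each variant instead of running it; the file names `main.o` and `test` are the ones from the reproduction steps.

```sh
# Hypothetical helper: print the link command for one of the two workarounds.
# "static" follows smithp35's suggestion, "pie" is what the reporter used.
link_cmd() {
    case "$1" in
        static) flag=--static ;;
        pie)    flag=-pie ;;
        *)      echo "usage: link_cmd static|pie" >&2; return 2 ;;
    esac
    echo "clang -fuse-ld=lld -nostdlib $flag -o test main.o"
}

link_cmd static
link_cmd pie
```

To actually run one of them, the printed line can be pasted into a shell or passed to `eval "$(link_cmd static)"`, assuming clang and lld are installed.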