davidlattimore / wild

Apache License 2.0

Various TLS tests failures (from Mold test-suite) #27

Open marxin opened 1 month ago

marxin commented 1 month ago

It seems to me the Mold linker provides a nice way to test various TLS-related cases:

$ cd /home/marxin/Programming/mold/build
$ cp ~/Programming/wild/target/release/wild mold 
$ ctest
...
$ grep 'Failed.*tls' /home/marxin/Programming/mold/build/Testing/Temporary/LastTest.log
Error: Failed to activate out/test/elf/x86_64/x86_64_tls-gd-mcmodel-large/b.o (768 (3/0))
Error: Failed to activate out/test/elf/x86_64/x86_64_tls-gd-to-ie/a.o (768 (3/0))
Error: Failed to load symbol `main` (78 local=10) in file #1024 (4/0) (out/test/elf/x86_64/x86_64_tls-ld-mcmodel-large/a.o) (ADDRESS | CAN_BYPASS_GOT) from out/test/elf/x86_64/x86_64_tls-ld-mcmodel-large/a.o (1024 (4/0))
Error: Failed to load symbol `get_foo` (70 local=2) in file #1024 (4/0) (out/test/elf/x86_64/x86_64_tls-module-base/a.o) (ADDRESS | CAN_BYPASS_GOT) from out/test/elf/x86_64/x86_64_tls-module-base/a.o (1024 (4/0))
Error: Failed to load symbol `main` (77 local=5) in file #1280 (5/0) (out/test/elf/x86_64/x86_64_tlsdesc/b.o) (ADDRESS | CAN_BYPASS_GOT) from out/test/elf/x86_64/x86_64_tlsdesc/b.o (1280 (5/0))
Error: Failed copying from out/test/elf/x86_64/tls-common/a.o (1024 (4/0)) to output file
Error: Failed to activate out/test/elf/x86_64/tlsdesc-dlopen/a.o (768 (3/0))
Error: Failed to load symbol `main` (72 local=4) in file #1024 (4/0) (out/test/elf/x86_64/tlsdesc-import/a.o) (ADDRESS | CAN_BYPASS_GOT) from out/test/elf/x86_64/tlsdesc-import/a.o (1024 (4/0))
Error: Failed to load symbol `get_foo1` (71 local=3) in file #1024 (4/0) (out/test/elf/x86_64/tlsdesc-initial-exec/c.o) (ADDRESS | CAN_BYPASS_GOT) from out/test/elf/x86_64/tlsdesc-initial-exec/c.o (1024 (4/0))
Error: Failed to load symbol `main` (82 local=5) in file #1280 (5/0) (out/test/elf/x86_64/tlsdesc-local-dynamic/b.o) (ADDRESS | CAN_BYPASS_GOT) from out/test/elf/x86_64/tlsdesc-local-dynamic/b.o (1280 (5/0))
Error: Failed to load symbol `main` (79 local=4) in file #257 (1/1) (out/test/elf/x86_64/tlsdesc-static/a.o) (ADDRESS | CAN_BYPASS_GOT) from out/test/elf/x86_64/tlsdesc-static/a.o (257 (1/1))
Error: Failed to load symbol `main` (82 local=5) in file #1280 (5/0) (out/test/elf/x86_64/tlsdesc/b.o) (ADDRESS | CAN_BYPASS_GOT) from out/test/elf/x86_64/tlsdesc/b.o (1280 (5/0))

Note that one can run a single test by simply running its bash script, e.g.:

❯ /home/marxin/Programming/mold/test/elf/x86_64_tls-ld-mcmodel-large.sh
Testing x86_64_tls-ld-mcmodel-large ... + cat
+ gcc -ftls-model=local-dynamic -fPIC -c -o out/test/elf/x86_64/x86_64_tls-ld-mcmodel-large/a.o -xc - -mcmodel=large
+ cat
+ gcc -ftls-model=local-dynamic -fPIC -c -o out/test/elf/x86_64/x86_64_tls-ld-mcmodel-large/b.o -xc - -mcmodel=large
+ cc -B. -o out/test/elf/x86_64/x86_64_tls-ld-mcmodel-large/exe out/test/elf/x86_64/x86_64_tls-ld-mcmodel-large/a.o out/test/elf/x86_64/x86_64_tls-ld-mcmodel-large/b.o -mcmodel=large
Error: Failed to load symbol `main` (78 local=10) in file #1024 (4/0) (out/test/elf/x86_64/x86_64_tls-ld-mcmodel-large/a.o) (ADDRESS | CAN_BYPASS_GOT) from out/test/elf/x86_64/x86_64_tls-ld-mcmodel-large/a.o (1024 (4/0))

Caused by:
    Unsupported relocation type 29
collect2: error: ld returned 1 exit status
++ on_error 25
++ code=1
++ echo 'command failed: 25: $CC -B. -o $t/exe $t/a.o $t/b.o -mcmodel=large'
command failed: 25: $CC -B. -o $t/exe $t/a.o $t/b.o -mcmodel=large
++ trap - EXIT
++ exit 1
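
The number in `Unsupported relocation type 29` is a raw ELF relocation type. As a reading aid, here is a minimal Rust sketch that maps the TLS and large-code-model relocation numbers from the x86-64 System V psABI to their names. `x86_64_reloc_name` is a hypothetical helper written for this note, not a function from wild, and the table is deliberately partial.

```rust
/// Name of an x86-64 ELF relocation type, using the psABI numbering.
/// Only TLS and large-code-model entries are listed; this is a lookup
/// sketch, not wild's actual relocation handling.
fn x86_64_reloc_name(r_type: u32) -> Option<&'static str> {
    let name = match r_type {
        16 => "R_X86_64_DTPMOD64",
        17 => "R_X86_64_DTPOFF64",
        18 => "R_X86_64_TPOFF64",
        19 => "R_X86_64_TLSGD",
        20 => "R_X86_64_TLSLD",
        21 => "R_X86_64_DTPOFF32",
        22 => "R_X86_64_GOTTPOFF",
        23 => "R_X86_64_TPOFF32",
        25 => "R_X86_64_GOTOFF64",
        26 => "R_X86_64_GOTPC32",
        27 => "R_X86_64_GOT64",
        28 => "R_X86_64_GOTPCREL64",
        29 => "R_X86_64_GOTPC64",
        30 => "R_X86_64_GOTPLT64",
        31 => "R_X86_64_PLTOFF64",
        34 => "R_X86_64_GOTPC32_TLSDESC",
        35 => "R_X86_64_TLSDESC_CALL",
        36 => "R_X86_64_TLSDESC",
        // Anything else is outside this partial table.
        _ => return None,
    };
    Some(name)
}

fn main() {
    // The type number reported by the failing test above.
    let r_type = 29;
    match x86_64_reloc_name(r_type) {
        Some(name) => println!("relocation type {} is {}", r_type, name),
        None => println!("relocation type {} is not in this table", r_type),
    }
}
```

Type 29 is `R_X86_64_GOTPC64`, a relocation that compilers emit under `-mcmodel=large`, which fits the failing test name.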

davidlattimore commented 1 month ago

Thanks! I've added basic support for -mcmodel=large, but I added the flag to the cpp integration test, which doesn't use any TLS. I'm now looking into adding it to the libc-integration test, which does use TLS, and there seems to be more to do...

davidlattimore commented 3 weeks ago

I've done some more work on -mcmodel=large. A lot of the work was actually changing linker-diff to be able to handle relocations in ways that it couldn't previously. For example, there were relocations that were PC-relative, but where the instructions weren't PC-relative. That was something I hadn't expected. The assembly that the compiler emits is valid though, because it has a PC-relative instruction before the one with the relocation. Anyway, linker-diff now handles this better, although there's still lots more to be done with both linker-diff and the linker itself.
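
To illustrate the PC-relative relocation on a non-PC-relative instruction, here is a minimal Rust sketch. It assumes the usual large-model PIC prologue (a 7-byte RIP-relative `lea` that loads its own address, then `movabs` with an `R_X86_64_GOTPC64` relocation on its immediate at offset 2, then `add`) and the psABI formula `GOT + A - P`. The addresses in `main` are made up, and both function names are hypothetical.

```rust
/// R_X86_64_GOTPC64 per the psABI: GOT + A - P, as a wrapping 64-bit value.
fn gotpc64(got: u64, addend: i64, place: u64) -> u64 {
    got.wrapping_add(addend as u64).wrapping_sub(place)
}

/// Emulates `lea` / `movabs` / `add` and returns the final register value.
/// Assumption: `lea` is 7 bytes long and loads its own address.
fn got_register_after_prologue(lea_addr: u64, got: u64) -> u64 {
    const LEA_LEN: u64 = 7; // assumed encoding length of the lea
    const IMM_OFFSET: u64 = 2; // the movabs immediate starts 2 bytes in

    // P: the address of the 8 relocated bytes inside the movabs.
    let place = lea_addr + LEA_LEN + IMM_OFFSET;
    // The addend cancels the distance from the lea to P (7 + 2 = 9).
    let addend = (LEA_LEN + IMM_OFFSET) as i64;
    // What the linker writes into the movabs immediate.
    let imm = gotpc64(got, addend, place);

    // The lea put lea_addr in the register; the add then applies imm.
    lea_addr.wrapping_add(imm)
}

fn main() {
    // Hypothetical addresses, chosen only for illustration.
    let lea_addr = 0x401000;
    let got = 0x404000;
    let reg = got_register_after_prologue(lea_addr, got);
    assert_eq!(reg, got);
    println!("register after prologue: {:#x}", reg);
}
```

The relocated `movabs` has no RIP-relative operand, yet its immediate depends on where it sits; the preceding `lea` supplies the program counter that makes the sum come out at the GOT.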

marxin commented 3 weeks ago

Thanks for the improvement. Please check integration_test after your changes; it fails on openSUSE Tumbleweed:

---- integration_test stdout ----
wild: /home/marxin/Programming/wild/wild/tests/build/libc-integration.c-gcc-dynamic-pie-large.wild
ld: /home/marxin/Programming/wild/wild/tests/build/libc-integration.c-gcc-dynamic-pie-large.ld
asm.main
  ORIG            No layout information in range 402b4e..402ed9 (has Some(402980..4042d5))
                  push %rbp
                  mov %rsp,%rbp
                  push %r15
                  push %rbx
                  sub $0x30,%rsp
                  lea 0xB,%rbx

  wild 0x00402b60 49 bb af 13 00 00 00 00 00 00 movabs $0x13AF,%r11  // Mov_r64_imm64(0x13af) R_X86_64_GOTPC64 at 0x2 for `_GLOBAL_OFFSET_TABLE_` +9
  ld   0x004012e6 49 bb 09 2d 00 00 00 00 00 00 movabs $0x2D09,%r11  // Mov_r64_imm64(0x2d09)
  ORIG            49 bb 00 00 00 00 00 00 00 00 movabs $0,%r11  // R_X86_64_GOTPC64 -> `_GLOBAL_OFFSET_TABLE_`+0x9
  TRACE           value_flags=ADDRESS | CAN_BYPASS_GOT resolution_flags=DIRECT

                  add %r11,%rbx
                  mov %fs:0xFFFFFFFFFFFFFFFC,%eax
                  test %eax,%eax
                  je 0x0000000000000035
                  mov $0x65,%eax
                  jmp 0x0000000000000382
                  mov %fs:0xFFFFFFFFFFFFFFF4,%eax
                  cmp $0x46,%eax
                  je 0x000000000000004C
                  mov $0x66,%eax
                  jmp 0x0000000000000382
                  movl $0x14,%fs:0xFFFFFFFFFFFFFFFC
                  lea -0x34(%rbp),%rdx
                  lea -0x30(%rbp),%rax
                  mov %rdx,%rcx

...  
  wild 0x00402b18 48 b8 68 ea ff ff ff ff ff ff movabs $0xFFFFFFFFFFFFEA68,%rax  // Mov_r64_imm64(0xffffffffffffea68) PLT(DYNAMIC(memset@GLIBC_2.2.5))
  ld   0x0040129e 48 b8 88 d0 ff ff ff ff ff ff movabs $0xFFFFFFFFFFFFD088,%rax  // Mov_r64_imm64(0xffffffffffffd088) PLT(DYNAMIC(compute_value10@*global*))
  ORIG            48 b8 00 00 00 00 00 00 00 00 movabs $0,%rax  // R_X86_64_PLTOFF64 -> `memset`+0x0
  TRACE           value_flags=DYNAMIC resolution_flags=GOT | PLT

                  add %rbx,%rax
                  call *%rax
                  movl $0xA,%fs:0xFFFFFFFFFFFFFFFC
                  mov -0x28(%rbp),%rax
                  mov %rax,-0x20(%rbp)
                  mov -0x20(%rbp),%rax
                  movl $0x1E,(%rax)
                  add $0x20,%rsp
                  pop %rbx
                  pop %r15
                  pop %rbp
                  ret

Error: Validation failed.
Binary `/home/marxin/Programming/wild/wild/tests/build/libc-integration.c-gcc-dynamic-pie-large.wild`. Relink with:
WILD_WRITE_LAYOUT=1 WILD_WRITE_TRACE=1 OUT=/home/marxin/Programming/wild/wild/tests/build/libc-integration.c-gcc-dynamic-pie-large.wild /home/marxin/Programming/wild/wild/tests/build/libc-integration.c-gcc-dynamic-pie-large.save/run-with cargo run --bin wild --
To revalidate:
cargo run --bin linker-diff -- --wild-defaults --ignore '.got.plt,.dynamic.DT_PLTGOT,.dynamic.DT_JMPREL,.dynamic.DT_NEEDED,.dynamic.DT_PLTREL,.dynamic.DT_FLAGS,.dynamic.DT_FLAGS_1,section.plt.entsize,section.rodata.cst32.entsize,section.rela.plt.link' --ref /home/marxin/Programming/wild/wild/tests/build/libc-integration.c-gcc-dynamic-pie-large.ld /home/marxin/Programming/wild/wild/tests/build/libc-integration.c-gcc-dynamic-pie-large.wild

davidlattimore commented 3 weeks ago

Oops. I'll have to try to make sure I run the tests on openSUSE before I push. Should be fixed now. It seemed to be mostly due to differences in the way the global offset table is set up.
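
As a rough cross-check of the `movabs` lines in the linker-diff output above, the following Rust sketch inverts the psABI formulas (`GOT + A - P` for `R_X86_64_GOTPC64`, `L + A - GOT` for `R_X86_64_PLTOFF64`) to recover the addresses each linker appears to have used. It assumes the relocations were applied exactly per the psABI, that the immediate sits at offset 2 with addend 9 (as the `wild` line reports), and that both snippets share one GOT base. The helper names are hypothetical.

```rust
/// Inverts GOT + A - P to recover the GOT symbol address.
fn got_from_gotpc64(imm: u64, addend: i64, place: u64) -> u64 {
    imm.wrapping_add(place).wrapping_sub(addend as u64)
}

/// Inverts L + A - GOT to recover the PLT entry address L.
fn target_from_pltoff64(imm: u64, addend: i64, got: u64) -> u64 {
    imm.wrapping_add(got).wrapping_sub(addend as u64)
}

fn main() {
    // (label, movabs address, GOTPC64 immediate, PLTOFF64 immediate),
    // copied from the `wild` and `ld` lines in the output above.
    let cases: [(&str, u64, u64, u64); 2] = [
        ("wild", 0x0040_2b60, 0x13af, 0xffff_ffff_ffff_ea68),
        ("ld", 0x0040_12e6, 0x2d09, 0xffff_ffff_ffff_d088),
    ];
    for &(name, insn, gotpc, pltoff) in cases.iter() {
        // Relocation at offset 2 in the instruction, addend +9.
        let got = got_from_gotpc64(gotpc, 9, insn + 2);
        // PLTOFF64 addend is +0 in the ORIG line.
        let plt = target_from_pltoff64(pltoff, 0, got);
        println!("{}: GOT symbol {:#x}, PLT entry {:#x}", name, got, plt);
    }
}
```

Under those assumptions the two binaries place `_GLOBAL_OFFSET_TABLE_` at different addresses, so their raw immediates differ even where both links may be valid. A comparison tool has to compare what the values resolve to rather than the bytes themselves.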

marxin commented 3 weeks ago

The tests work for me now on openSUSE, thanks.

marxin commented 3 weeks ago

Just a note: it seems that when the initial-exec TLS model is used, the linker should set the DF_STATIC_TLS flag in DT_FLAGS for an executable or shared library: https://refspecs.linuxbase.org/elf/gabi4+/ch5.dynamic.html.
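
For reference, here is a minimal Rust sketch of decoding a `DT_FLAGS` value, using the tag and flag constants from the gABI page linked above. `df_flag_names` and `has_static_tls` are hypothetical helpers for illustration, and the value in `main` is made up rather than read from a real binary.

```rust
// Constants from the generic ELF ABI (dynamic section chapter).
const DT_FLAGS: u64 = 30;
const DF_ORIGIN: u64 = 0x1;
const DF_SYMBOLIC: u64 = 0x2;
const DF_TEXTREL: u64 = 0x4;
const DF_BIND_NOW: u64 = 0x8;
const DF_STATIC_TLS: u64 = 0x10;

/// Names of the DF_* bits set in a DT_FLAGS value.
fn df_flag_names(flags: u64) -> Vec<&'static str> {
    const NAMES: [(u64, &str); 5] = [
        (DF_ORIGIN, "DF_ORIGIN"),
        (DF_SYMBOLIC, "DF_SYMBOLIC"),
        (DF_TEXTREL, "DF_TEXTREL"),
        (DF_BIND_NOW, "DF_BIND_NOW"),
        (DF_STATIC_TLS, "DF_STATIC_TLS"),
    ];
    let mut names = Vec::new();
    for &(bit, name) in NAMES.iter() {
        if flags & bit != 0 {
            names.push(name);
        }
    }
    names
}

/// True if the object declares that it uses the static TLS model.
fn has_static_tls(flags: u64) -> bool {
    flags & DF_STATIC_TLS != 0
}

fn main() {
    // Hypothetical DT_FLAGS value for illustration.
    let flags = DF_BIND_NOW | DF_STATIC_TLS;
    println!("tag {}: {:?}", DT_FLAGS, df_flag_names(flags));
    println!("static TLS: {}", has_static_tls(flags));
}
```

A check along these lines could be run against the `DT_FLAGS` entry of a linked output to confirm whether the flag is present.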