GrammaTech / ddisasm

A fast and accurate disassembler
https://grammatech.github.io/ddisasm/
GNU Affero General Public License v3.0

ls binary fails reassembly on Ubuntu 22 #57

Closed avncharlie closed 1 year ago

avncharlie commented 1 year ago

I am unable to reassemble "ls" using ddisasm and gtirb-pprinter on Ubuntu 22. I am using the grammatech/ddisasm docker image. ddisasm is version 1.6.0 and gtirb-pprinter is version 1.9.0.

When I try to (re)assemble the assembly generated by ddisasm for the "ls" utility, the rewritten binary segfaults.

Commands to reproduce:

Generate GTIRB file

$ docker run --rm -v $(pwd):/workspace grammatech/ddisasm sh -c "ddisasm /workspace/ls --ir /workspace/out.gtirb"

Generate assembly

$ docker run --rm -v $(pwd):/workspace grammatech/ddisasm sh -c "gtirb-pprinter /workspace/out.gtirb --asm /workspace/out.s"

Assemble

$ gcc -nostartfiles out.s -o out -lselinux

The resulting binary segfaults:

$ ./ls
commands.txt  core.406367  core.412298  core.412544  ls  out  out.gtirb  out.s
$ ./out
[1]    7294 segmentation fault (core dumped)  ./out

Looking at the stack trace from the core dump in GDB shows:

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007f724ca29ebb in call_init (env=<optimised out>, argv=0x7ffd75a6c418, argc=1) at ../csu/libc-start.c:145
#2  __libc_start_main_impl (main=0x5579fe24fd20 <main>, argc=1, argv=0x7ffd75a6c418, init=<optimised out>, fini=<optimised out>,
    rtld_fini=<optimised out>, stack_end=0x7ffd75a6c408) at ../csu/libc-start.c:379
#3  0x00005579fe251ad5 in _start ()

So the rewritten binary is crashing before main is reached.

adamjseitz commented 1 year ago

Hi @avncharlie, thanks for the report.

Unfortunately, I was unable to reproduce the issue. I tested with the latest ddisasm unstable build (c8c7996), as well as the latest stable (1.5.7).

I got the ls binary to test from the ubuntu:22.04 Docker image on Docker Hub:

$ sha256sum ./ls.ubuntu2204
1e39354a6e481dac48375bfebb126fd96aed4e23bab3c53ed6ecf1c5e4d5736d  ./ls.ubuntu2204

As a note, the ddisasm version 1.6.0 (and the corresponding image on Docker Hub) is still an unstable version, and has new builds pushed to it periodically. Because of this, I am not sure the ddisasm version I'm testing is the same build you're using. (We're working on changing our process, so that in the future, only stable builds will be published to specific version number tags, and the unstable build will be available as an unstable tag.) The latest stable build is 1.5.7.

If you're still able to reproduce this on the latest unstable, or on 1.5.7, can you attach your ls binary to the issue? It may have some subtle difference from what I am testing.

avncharlie commented 1 year ago

Thanks for the reply! It looks like the sha256sum you provided is actually from the Ubuntu 20.04 ls binary? This is the sha256sum output I got from the ls binary on both my 22.04 system and the ubuntu:22.04 Docker image:

$ sha256sum ./ls.ubuntu2204
8696974df4fc39af88ee23e307139afc533064f976da82172de823c3ad66f444  ./ls.ubuntu2204

And this is the output I got from my Ubuntu 20.04 installation:

$ sha256sum /usr/bin/ls
1e39354a6e481dac48375bfebb126fd96aed4e23bab3c53ed6ecf1c5e4d5736d  /usr/bin/ls

I tried using ddisasm version 1.5.7 but this still produced a segfault in the rewritten binary. Here is the ls binary I am working with.
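
For anyone comparing binaries across systems, here is a small sketch of the same checksum comparison in Python, which may help avoid this kind of mix-up. The two digests are the ones quoted in this thread; the helper names (sha256_of, identify_ls) are made up for illustration.

```python
import hashlib

# Digests quoted in this thread, mapped to the release they came from.
KNOWN_LS_DIGESTS = {
    "1e39354a6e481dac48375bfebb126fd96aed4e23bab3c53ed6ecf1c5e4d5736d": "Ubuntu 20.04",
    "8696974df4fc39af88ee23e307139afc533064f976da82172de823c3ad66f444": "Ubuntu 22.04",
}


def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify_ls(path):
    """Name the release a given ls binary matches, or 'unknown'."""
    return KNOWN_LS_DIGESTS.get(sha256_of(path), "unknown")
```

Calling identify_ls("./ls.ubuntu2204") before disassembling would show which of the two binaries is actually under test.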

adamjseitz commented 1 year ago

You're right, somehow I mixed it up and used the Ubuntu 20.04 ls. Thank you for catching that.

Using the correct binary, I can now replicate the problem.

What is happening is that this binary, for some reason, has .ctors and .dtors sections rather than .init_array and .fini_array. This is an older convention; .init_array and .fini_array are the modern alternatives, to the point that some linkers rewrite .ctors and .dtors sections as .init_array and .fini_array. This rewriting by the linker results in the crash, although I am not yet sure precisely why.

The assembly generated by ddisasm/gtirb-pprinter is correct (it retains the .ctors and .dtors sections of the original binary); the problem is how those sections are handled when re-linking. I think the only solution will be to tweak the options supplied to the linker.

One such workaround is to use the gold linker, which allows us to disable this conversion:

gcc -o ls ls.s -nostartfiles -lselinux -Wl,--no-ctors-in-init-array -fuse-ld=gold

On my end, this workaround results in a functional binary.

I think we should also consider making this behavior automatic in the gtirb-pprinter --binary option when .ctors and .dtors sections are detected. That would not really help you here, since you're invoking gcc directly.
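
In the meantime, a script that invokes gcc directly could do a similar check itself. The following is only a rough sketch under stated assumptions, not how gtirb-pprinter does or would implement it: it reads the section names of a 64-bit little-endian ELF file with the standard library and returns the gold flags from the command above when .ctors or .dtors is present. The function names are hypothetical, and extended section numbering (more than about 65,000 sections) is ignored.

```python
import struct

# Layouts of Elf64_Ehdr and Elf64_Shdr (little-endian), 64 bytes each.
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
SECTION_HEADER = struct.Struct("<IIQQQQIIQQ")


def elf64_section_names(data):
    """Return the section names of a 64-bit little-endian ELF image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF file")
    fields = ELF_HEADER.unpack_from(data, 0)
    e_shoff, e_shentsize, e_shnum, e_shstrndx = (
        fields[6], fields[11], fields[12], fields[13])
    if e_shnum == 0:
        return []
    headers = [SECTION_HEADER.unpack_from(data, e_shoff + i * e_shentsize)
               for i in range(e_shnum)]
    # The section-name string table is itself a section (index e_shstrndx).
    strtab_offset, strtab_size = headers[e_shstrndx][4], headers[e_shstrndx][5]
    strtab = data[strtab_offset:strtab_offset + strtab_size]
    names = []
    for header in headers:
        start = header[0]                      # sh_name: offset into strtab
        end = strtab.index(b"\0", start)
        names.append(strtab[start:end].decode("ascii"))
    return names


def extra_link_flags(section_names):
    """Flags to keep .ctors/.dtors as-is, using the gold workaround above."""
    if ".ctors" in section_names or ".dtors" in section_names:
        return ["-fuse-ld=gold", "-Wl,--no-ctors-in-init-array"]
    return []
```

A wrapper would read the original binary with open(path, "rb").read(), pass the result to elf64_section_names, and append extra_link_flags(...) to its gcc command line.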

avncharlie commented 1 year ago

If gtirb-pprinter is producing correct disassembly, could this be a bug in the gcc linker? That is, maybe the .ctors / .dtors rewriting functionality isn't working correctly.

I found this discussion on adding the feature of rewriting .ctors and .dtors sections in the gcc linker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46770 if it helps.

Using the gold linker produced a working binary, so marking this issue as closed. Thanks

adamjseitz commented 1 year ago

I did a bit more testing, and you can also work around the problem with ld by specifying a custom linker script, although it's a bit more involved than gold's option. When building with gcc, you can obtain the linker script in use by building with -Wl,--verbose. Copy that to a file and remove the .init_array and .fini_array sections:

  .init_array    :
  {
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*)))
    KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .ctors))
    PROVIDE_HIDDEN (__init_array_end = .);
  }
  .fini_array    :
  {
    PROVIDE_HIDDEN (__fini_array_start = .);
    KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*)))
    KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .dtors))
    PROVIDE_HIDDEN (__fini_array_end = .);
  }

Notice how these capture the .ctors and .dtors sections with, e.g., SORT_BY_INIT_PRIORITY(.ctors.*).

Then, you can build specifying your new linker script:

gcc -o ls ls.s -nostartfiles -lselinux -T ./my_script.ld
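
The manual editing step can also be scripted. Below is a minimal sketch, assuming the linker script text has already been copied out of the -Wl,--verbose output and that each output section opens with a line such as ".init_array    :" followed by a brace-delimited body, as in the excerpt above. The function name drop_init_fini_array is made up for illustration.

```python
import re

# Matches the first line of the .init_array / .fini_array output sections.
# A leading dot is required, so .preinit_array is left alone.
SECTION_START = re.compile(r"^\s*\.(init_array|fini_array)\s*:")


def drop_init_fini_array(script):
    """Return the linker script text without the two output sections."""
    kept = []
    skipping = False
    seen_open = False
    depth = 0
    for line in script.splitlines(keepends=True):
        if not skipping and SECTION_START.match(line):
            skipping, seen_open, depth = True, False, 0
        if skipping:
            # Track braces so we stop at the "}" that closes this section.
            depth += line.count("{") - line.count("}")
            if "{" in line:
                seen_open = True
            if seen_open and depth == 0:
                skipping = False
            continue
        kept.append(line)
    return "".join(kept)
```

Writing the result of drop_init_fini_array(...) to ./my_script.ld gives the file passed to -T in the command above.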

avncharlie commented 1 year ago

I found this issue on the gtirb-pprinter repository which seems to be about the same ctors / dtors problem: https://github.com/GrammaTech/gtirb-pprinter/issues/3#issuecomment-757014697 The solution here was to skip the ctors and dtors sections while pretty printing. This produced assembly I could assemble with gcc with no extra options.

$ gtirb-pprinter /workspace/out.gtirb --asm /workspace/out.s --skip-section .ctors --skip-section .dtors
$ gcc out.s -o out -nostartfiles -lselinux

adamjseitz commented 1 year ago

> The solution here was to skip the ctors and dtors sections while pretty printing.

Of course, it is worth mentioning for future readers that doing it this way results in the rewritten binary having .init_array and .fini_array sections instead of .ctors and .dtors. That may be fine, depending on the application.

However, I do think there may be some danger in it. There is a reference to .L_21000, which is in the .ctors section:

#-----------------------------------
.type FUN_170f0, @function
#-----------------------------------
FUN_170f0:
            # ...
            movq .L_21000(%rip),%rax
            cmpq $-1,%rax
            je .L_17130

            pushq %rbp
            movq %rsp,%rbp
            pushq %rbx
            leaq .L_21000(%rip),%rbx
            subq $8,%rsp
            # ...
.L_17118:
            callq *%rax

            movq -8(%rbx),%rax
            subq $8,%rbx
            cmpq $-1,%rax
            jne .L_17118

            movq -8(%rbp),%rbx
            leave
            retq
          .byte 0x66
          .byte 0x90
.L_17130:

            retq

This function is called in the original binary, specified by a DT_INIT tag in the .dynamic section. This code is what results in execution of the .ctors section.
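
To make the control flow easier to follow, here is a rough Python model of the visible part of that loop. It is a sketch under assumptions: the table stands in for the words around .L_21000, entries are Python callables rather than addresses, the parts elided with "# ..." are ignored, and the bounds check exists only so the model fails cleanly (the real code has none).

```python
SENTINEL = -1  # the value tested by `cmpq $-1,%rax`


def run_ctors(table, start):
    """Call entries from table[start] downward until the -1 sentinel.

    Returns the number of entries that were called.
    """
    calls = 0
    index = start
    entry = table[index]            # movq .L_21000(%rip),%rax
    while entry != SENTINEL:        # cmpq $-1,%rax ; je/jne
        entry()                     # callq *%rax
        calls += 1
        index -= 1                  # subq $8,%rbx
        if index < 0:
            raise IndexError("no -1 sentinel found before the table start")
        entry = table[index]        # movq -8(%rbx),%rax
    return calls
```

If the word at the starting position is already -1, the model returns immediately without calling anything, matching the early `je .L_17130` in the listing.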

If printed with --skip-section=.ctors, these references are printed as zero:

            movq 0(%rip),%rax # WARNING:0: no symbol for address 0x21000

This code does not seem to be executed in the rewritten binary. Something in the process of rewriting and re-linking loses the DT_INIT tag that specifies it should be executed. Thus, what you are doing works, but it might be safer to add --skip-section=.init and --skip-function=FUN_170f0 to ensure this code is elided.