lifting-bits / remill

Library for lifting machine code to LLVM bitcode
Apache License 2.0

Get Remill building with LLVM 15 #631

Closed: tetsuo-cpp closed this 1 year ago

tetsuo-cpp commented 2 years ago

Relevant release notes here: https://releases.llvm.org/15.0.0/docs/ReleaseNotes.html#changes-to-the-llvm-ir

ekilmer commented 2 years ago

I tried this branch on Fedora 36 with LLVM 15.0.1 and 15.0.2 from cxx-common. The main build with cmake --build build-dbg succeeds, but building the test_dependencies target then fails with these errors:

$ cmake --build build-dbg --target test_dependencies
[15/38] Generating tests_x86.bc
FAILED: tests/X86/tests_x86.bc /tmp/work/cxx-common/remill/build-dbg/tests/X86/tests_x86.bc
cd /tmp/work/cxx-common/remill/build-dbg/tests/X86 && /tmp/work/cxx-common/remill/build-dbg/tests/X86/lift-x86-tests --arch x86 --bc_out tests_x86.bc
lift-x86-tests: /home/ekilmer/src/cxx-common/vcpkg/installed/x64-linux/include/llvm/IR/DataLayout.h:674: llvm::TypeSize llvm::DataLayout::getTypeSizeInBits(llvm::Type *) const: Assertion `Ty->isSized() && "Cannot getTypeInfo() on a type that is unsized!"' failed.
[16/38] Generating tests_amd64.bc
FAILED: tests/X86/tests_amd64.bc /tmp/work/cxx-common/remill/build-dbg/tests/X86/tests_amd64.bc
cd /tmp/work/cxx-common/remill/build-dbg/tests/X86 && /tmp/work/cxx-common/remill/build-dbg/tests/X86/lift-amd64-tests --arch amd64 --bc_out tests_amd64.bc
lift-amd64-tests: /home/ekilmer/src/cxx-common/vcpkg/installed/x64-linux/include/llvm/IR/DataLayout.h:674: llvm::TypeSize llvm::DataLayout::getTypeSizeInBits(llvm::Type *) const: Assertion `Ty->isSized() && "Cannot getTypeInfo() on a type that is unsized!"' failed.
[17/38] Generating tests_x86_avx.bc
FAILED: tests/X86/tests_x86_avx.bc /tmp/work/cxx-common/remill/build-dbg/tests/X86/tests_x86_avx.bc
cd /tmp/work/cxx-common/remill/build-dbg/tests/X86 && /tmp/work/cxx-common/remill/build-dbg/tests/X86/lift-x86_avx-tests --arch x86_avx --bc_out tests_x86_avx.bc
lift-x86_avx-tests: /home/ekilmer/src/cxx-common/vcpkg/installed/x64-linux/include/llvm/IR/DataLayout.h:674: llvm::TypeSize llvm::DataLayout::getTypeSizeInBits(llvm::Type *) const: Assertion `Ty->isSized() && "Cannot getTypeInfo() on a type that is unsized!"' failed.
[18/38] Generating tests_amd64_avx.bc
FAILED: tests/X86/tests_amd64_avx.bc /tmp/work/cxx-common/remill/build-dbg/tests/X86/tests_amd64_avx.bc
cd /tmp/work/cxx-common/remill/build-dbg/tests/X86 && /tmp/work/cxx-common/remill/build-dbg/tests/X86/lift-amd64_avx-tests --arch amd64_avx --bc_out tests_amd64_avx.bc
lift-amd64_avx-tests: /home/ekilmer/src/cxx-common/vcpkg/installed/x64-linux/include/llvm/IR/DataLayout.h:674: llvm::TypeSize llvm::DataLayout::getTypeSizeInBits(llvm::Type *) const: Assertion `Ty->isSized() && "Cannot getTypeInfo() on a type that is unsized!"' failed.
ninja: build stopped: subcommand failed.

tetsuo-cpp commented 1 year ago

With LLVM 15, State is an opaque struct for some reason (LLVM 14 gives the output that I'd expect). The IR looks like this:

%struct.State = type opaque

And the spots where the struct gets accessed look like this:

  %n.i = getelementptr inbounds %struct.AArch64State, ptr %state, i64 0, i32 9, i32 5

It's semantically equivalent, since State is defined as struct State : public AArch64State {};, but I'm not sure why it's happening.
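To illustrate why a GEP against the base struct is equivalent here, a minimal sketch follows. The field list is a hypothetical stand-in (the real AArch64State is much larger, and the position of n is made up), and ReadN is an illustrative helper, not a Remill function.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the real AArch64State; fields are invented.
struct AArch64State {
  std::uint64_t gpr[4];
  std::uint8_t n;
};

// Same shape as in the comment above: State adds no members of its own.
struct State : public AArch64State {};

// Member access through a State pointer resolves to the base-class field,
// which is why the emitted GEP can be typed against the base struct.
inline std::uint8_t ReadN(const State *state) {
  return state->n;
}

// State has no data members of its own, so it stays standard-layout.
static_assert(std::is_standard_layout<State>::value,
              "State should be standard-layout");
// Expected on mainstream ABIs: the derived struct adds no storage.
static_assert(sizeof(State) == sizeof(AArch64State),
              "State should be the same size as its base");
```

Because State is standard-layout and only wraps its base, a pointer to a State object and a pointer to its AArch64State subobject refer to the same address.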

Still trying to find a minimal repro.

My initial hunch is that it's an llvm-link regression because the State definition is definitely in the C++ code that produces the runtime.

tetsuo-cpp commented 1 year ago

Ok, the problem is that the new Clang's behaviour around -emit-llvm has changed a bit. It seems that it's more aggressive about removing unused types.

Essentially, it's not enough to have the struct definition included in Instructions.cpp, as it gets stripped out of the output bitcode. By the time we get around to calling llvm-link, none of the bitcode modules has a definition for State.
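As a loose analogy in plain C++ (not the LLVM API), a type that is declared but never defined has no size, much like an opaque struct in IR. The sketch below assumes hypothetical names throughout (IsComplete, OpaqueState, DefinedState) and only shows why asking for the size of such a type cannot work.

```cpp
#include <cstddef>
#include <type_traits>

// Chosen only when sizeof(T) is well-formed, i.e. T is a complete type.
template <typename T, std::size_t = sizeof(T)>
std::true_type IsCompleteImpl(T *);

// Fallback for incomplete types, where the overload above drops out (SFINAE).
std::false_type IsCompleteImpl(...);

template <typename T>
using IsComplete = decltype(IsCompleteImpl(static_cast<T *>(nullptr)));

struct OpaqueState;              // hypothetical: declared but never defined
struct DefinedState { int x; };  // hypothetical: has a body, so it has a size

static_assert(!IsComplete<OpaqueState>::value, "no definition, no size");
static_assert(IsComplete<DefinedState>::value, "defined, so it has a size");
```

Pointers to OpaqueState can still be passed around, which mirrors how the lifted code keeps working with ptr %state until something asks for the size of the type.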

I think we need to have a definition for __remill_state (previously we just declared it with extern but didn't define it in any of our modules). The IR relating to __remill_state has changed slightly but I don't think it really matters as it's just a way for us to get a handle on the llvm::StructType for State.

Before

@__remill_state = external global %struct.State, align 1

After

@__remill_state = global %struct.State zeroinitializer, align 16
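In C++ terms, the difference between the two IR lines is roughly the difference between a declaration and a definition. This is a sketch under assumptions: ArchState and its fields are hypothetical, and the exact declaration Remill uses (linkage, attributes, which file it lives in) is not shown in this thread.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the architecture state types.
struct ArchState {
  std::uint64_t gpr[4];
};
struct State : public ArchState {};

// Declaration only: promises that the symbol exists in some module.
// On its own this corresponds to the "external global" form.
extern State __remill_state;

// Definition: this translation unit provides the storage. A namespace-scope
// object without an initializer is zero-initialized, which matches the
// "zeroinitializer" in the IR above.
State __remill_state;
```

With a definition present, the variable needs a complete State type in that module, so the struct body has to be emitted alongside it.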
ekilmer commented 1 year ago

cxx-common now has a pre-built LLVM 15: https://github.com/lifting-bits/cxx-common/releases/tag/v0.2.11

Can you update the CI to test with LLVM 15 for this PR, please?