lifting-bits / remill

Library for lifting machine code to LLVM bitcode
Apache License 2.0
1.22k stars 142 forks source link

Opaque Pointers Lead to Incorrect Register Sizes by Pulling a Type from `getResultElementType` for a Register that is not in the Operand Cache but is Declared as a Variable in the Lifted Function #618

Closed 2over12 closed 1 year ago

2over12 commented 1 year ago

This issue is exacerbated by #615 since instructions get individual lifters to lift themselves, not sharing a cache (this probably should be addressed somehow, but cache consistency becomes tricky), causing register cache misses to occur more frequently. If we have a cache miss for lifting a register operand that is, however, found in the function as a GEP we pull a type from getResultElementType but this type does not match the desired type for the register. For instance, if we GEP a 32 bit register W0 then call getResultElementType on the GEP we are going to get a 64 bit type because of the size of the underlying field in the state structure. This line https://github.com/lifting-bits/remill/blob/c0f90b9e4a251615288ed4b65ead81cea617da8c/lib/BC/Util.cpp#L220 will not pull the correct type, we need to source the type info from the Arch's knowledge about the reg name being asked for and not from the state structure.

Regression test here: https://github.com/lifting-bits/remill/blob/c829715b95a69401a0f439b8004962a3dda6e803/tests/Thumb/TestLifting.cpp#L236

TEST(RegressionTests, AARCH64RegSize) {
  llvm::LLVMContext context;
  context.enableOpaquePointers();
  auto arch = remill::Arch::Build(&context, remill::OSName::kOSLinux,
                                  remill::ArchName::kArchAArch64LittleEndian);
  auto sems = remill::LoadArchSemantics(arch.get());
  remill::IntrinsicTable instrinsics(sems.get());
  auto op_lifter = arch->DefaultLifter(instrinsics);
  auto target_lift = arch->DefineLiftedFunction("test_lift", sems.get());
  auto st_ptr = remill::LoadStatePointer(target_lift);
  CHECK_NOTNULL(st_ptr);
  auto lifted =
      op_lifter->LoadRegValue(&target_lift->getEntryBlock(), st_ptr, "W0");

  CHECK_EQ(lifted->getType()->getIntegerBitWidth(), 32);
  op_lifter->ClearCache();
  auto lifted2 =
      op_lifter->LoadRegValue(&target_lift->getEntryBlock(), st_ptr, "W0");

  CHECK_EQ(lifted2->getType()->getIntegerBitWidth(), 32);
}

This test fails on the second check because the variable finding code returns a 64 bit size type for W0 on the second call.