lifting-bits / remill

Library for lifting machine code to LLVM bitcode
Apache License 2.0
1.3k stars 145 forks source link

Remove [[gnu::const]] (readnone) from __remill intrinsics #710

Closed mrexodia closed 4 months ago

mrexodia commented 4 months ago

The [[gnu::const]] attribute adds the readnone attribute, for the purpose of enabling better optimizations. The problem with this is that when you actually provide an implementation of the intrinsics, the llvm::CallInst to the intrinsic has a readnone even though the function implementation might actually read/write memory. This causes the optimizer to just completely remove calls to __remill_xxx functions, even though the implementation has side effects like reading/writing global variables.

Removing the readnone attribute will allow LLVM to use 'weaker' attributes (readonly, writeonly) that might be derived from the intrinsic implementation and things will still fold correctly, so it seems completely pointless. Concretely, the following C++ implementation of the helper __remill_read_memory_64:

extern "C" __attribute__((always_inline)) uint64_t __remill_read_memory_64(Memory *m, addr_t a) {
  uint64_t v = 0;
  std::memcpy(&v, &RAM[a], sizeof(v));
  return v;
}

Will generate the following LLVM IR:

; Function Attrs: alwaysinline mustprogress nofree norecurse nosync nounwind readonly willreturn uwtable
define dso_local i64 @__remill_read_memory_64(ptr nocapture noundef readnone %m, i64 noundef %a) local_unnamed_addr #1 {
entry:
  %arrayidx = getelementptr inbounds [0 x i8], ptr @RAM, i64 0, i64 %a
  %v.0.copyload = load i64, ptr %arrayidx, align 1
  ret i64 %v.0.copyload
}

The compiler already figured out that the Memory * argument is not accessed, so it is marked as readnone. The function does not write memory, so it is marked as readonly, etc.

Related: https://github.com/lifting-bits/remill/issues/554, https://github.com/lifting-bits/remill/pull/555.

It would be good to find a concrete example where removing the readnone causes the optimizer to achieve suboptimal results, because we (@fvrmatteo and me) were not able to think of any.

The bug this fixes does not manifest in mcsema because the intrinsics are replaced with llvm::StoreInst/etc directly. We are also aware of multiple people running into this issue and the fix was either to manually remove the readonly or to run an inlining pass first so that the optimizer does not make false assumptions based on the llvm::CallInst (and function) attributes.