llvm / clangir

A new (MLIR based) high-level IR for clang.
https://clangir.org
Other
327 stars 86 forks source link

Vtable support for simple multiple inheritance without thunk #566

Open Laity000 opened 3 months ago

Laity000 commented 3 months ago

I found that CIR currently does not support multiple inheritance, so I am trying to support this feature. However, I encountered two issues:

  1. Due to offset-to-top in vtable being signed value for multiple inheritance. Should the type of RTTI pointers be cir.ptr<!s8i> instead of cir.ptr<!u8i>?
class Mother {
  virtual void MotherFoo() {}
  virtual void MotherFoo2() {}
}

class Father {
  virtual void FatherFoo() {}
}

class Child : public Mother, public Father {
  void MotherFoo() override {}
}

llvm ir 12.0.1:

@vtable for Child = linkonce_odr dso_local unnamed_addr constant { [4 x i8*], [3 x i8*] } { 
  [4 x i8*] [
    i8* null,
    i8* bitcast ({ i8*, i8*, i32, i32, i8*, i64, i8*, i64 }* @typeinfo for Child to i8*),
    i8* bitcast (void (%class.Child*)* @Child::MotherFoo() to i8*),
    i8* bitcast (void (%class.Mother*)* @Mother::MotherFoo2() to i8*)], 
  [3 x i8*] [
    i8* inttoptr (i64 -8 to i8*),
    i8* bitcast ({ i8*, i8*, i32, i32, i8*, i64, i8*, i64 }* @typeinfo for Child to i8*),
    i8* bitcast (void (%class.Father*)* @Father::FatherFoo() to i8*)] }, comdat, align 8

The modified cir:

cir.global linkonce_odr @_ZTV5Child = #cir.vtable<
{#cir.const_array<[
  #cir.ptr<null> :  #!cir.ptr<!s8i>,
  #cir.global_view<@_ZTI5Child> : !cir.ptr<!s8i>,
  #cir.global_view<@_ZN5Child9MotherFooEv> : !cir.ptr<!s8i>,
  #cir.global_view<@_ZN6Mother10MotherFoo2Ev> : !cir.ptr<!s8i>]> : !cir.array<!cir.ptr<!s8i> x 4>, 
 #cir.const_array<[
  #cir.ptr<-8> : !cir.ptr<!s8i>,     <--- current: !cir.ptr<18446744073709551608> : !cir.ptr<!u8i>
  #cir.global_view<@_ZTI5Child> : !cir.ptr<!s8i>,
  #cir.global_view<@_ZN6Father9FatherFooEv> : !cir.ptr<!s8i>]
> : !cir.array<!cir.ptr<!s8i> x 3>}> : !ty_anon_struct3
  1. Global variables in the function are placed before the current function. https://github.com/llvm/clangir/blob/5dc33530eaf028ac4467ee19fb0b5b4b8a1f7c92/clang/lib/CIR/CodeGen/CIRGenModule.cpp#L559-L562 But the position of the function may change due to applyReplacements(), which may cause vtable symbols to not be found in the symbol table for cir2mlir lowering (for case clang/test/CIR/CodeGen/vtable-rtti.cpp). So, similar to LLVM IR, Should the insertion of global variables be centralized at the beginning of the module?
bcardosolopes commented 3 months ago

Should the type of RTTI pointers be cir.ptr<!s8i> instead of cir.ptr<!u8i>?

The idea when cir.ptr<!u8i> was first introduced was to basically mean cir.ptr<!void>, which we didn't have at the time. This is more about having a canonical representation for these "opaque" things that we store. Changing might sound like a reasonable approach, but it's possible whenever these values are needed they are being casted to the appropriated type, did you check that? I'm also against changing all of it to s8, ptrs not related to casted offsets should still be u8, and it's just super confusing to readers to make them s8.

llvm ir 12.0.1

This is old, not even using opaque pointers. As a general advice, I suggest you base up your investigations on top of clang-18 or trunk.

So, similar to LLVM IR, Should the insertion of global variables be centralized at the beginning of the module?

We should do a better job, for sure. LLVM codegen has a somewhat lazy approach for global variables, so it's not exactly like you are pointing out. Any improvements here should definitely account for a similar approach from the original LLVM codegen.

Thanks for pointing out (1) and (2), as you noticed, they are pre-requisites into implementing multiple inheritance.

Laity000 commented 3 months ago

Thank you for your response. Please review the discussion about the vtable type in this PR. #569