llvm / clangir

A new (MLIR based) high-level IR for clang.
https://clangir.org
Other
351 stars 95 forks source link

Not completely implemented `CIRGenTypes::getRecordTypeName` #639

Open ChuanqiXu9 opened 4 months ago

ChuanqiXu9 commented 4 months ago

Reproducer:

// test.cc
#include <memory>

std::unique_ptr<char> stack_;
clang++  -std=c++11 -c test.cc -Xclang  -fclangir  -Xclang -emit-cir  -o test.cir

Crash log:

NYI
UNREACHABLE executed at /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenTypes.cpp:74!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.  Program arguments: /home/chuanqi.xcq/clangir/build_debug/bin/clang++ -std=c++11 -c thread.cc -Xclang -fclangir -Xclang -emit-cir -o thread.cir
1.  <eof> parser at end of file
2.  thread.cc:3:23: LLVM IR generation of declaration 'stack_'
 #0 0x00000000088d03fd llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/chuanqi.xcq/clangir/llvm/lib/Support/Unix/Signals.inc:723:11
 #1 0x00000000088d08eb PrintStackTraceSignalHandler(void*) /home/chuanqi.xcq/clangir/llvm/lib/Support/Unix/Signals.inc:798:1
 #2 0x00000000088ce976 llvm::sys::RunSignalHandlers() /home/chuanqi.xcq/clangir/llvm/lib/Support/Signals.cpp:105:5
 #3 0x00000000088cfc8e llvm::sys::CleanupOnSignal(unsigned long) /home/chuanqi.xcq/clangir/llvm/lib/Support/Unix/Signals.inc:368:1
 #4 0x00000000087fd157 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /home/chuanqi.xcq/clangir/llvm/lib/Support/CrashRecoveryContext.cpp:0:7
 #5 0x00000000087fd4f5 CrashRecoverySignalHandler(int) /home/chuanqi.xcq/clangir/llvm/lib/Support/CrashRecoveryContext.cpp:391:1
 #6 0x00007ffff7fb5100 __restore_rt sigaction.c:0:0
 #7 0x00007ffff7683605 raise (/lib64/libc.so.6+0x3d605)
 #8 0x00007ffff766c8a2 abort (/lib64/libc.so.6+0x268a2)
 #9 0x00000000088056a0 llvm::install_out_of_memory_new_handler() /home/chuanqi.xcq/clangir/llvm/lib/Support/ErrorHandling.cpp:194:0
#10 0x000000000abc744d cir::CIRGenTypes::getRecordTypeName[abi:cxx11](clang::RecordDecl const*, llvm::StringRef)::$_0::operator()(clang::TemplateArgument const&) const /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenTypes.cpp:76:7
#11 0x000000000abc7320 void llvm::interleave<clang::TemplateArgument const*, cir::CIRGenTypes::getRecordTypeName[abi:cxx11](clang::RecordDecl const*, llvm::StringRef)::$_0, void llvm::interleave<llvm::ArrayRef<clang::TemplateArgument>, cir::CIRGenTypes::getRecordTypeName[abi:cxx11](clang::RecordDecl const*, llvm::StringRef)::$_0, llvm::raw_svector_ostream, clang::TemplateArgument const>(llvm::ArrayRef<clang::TemplateArgument> const&, llvm::raw_svector_ostream&, cir::CIRGenTypes::getRecordTypeName[abi:cxx11](clang::RecordDecl const*, llvm::StringRef)::$_0, llvm::StringRef const&)::'lambda'(), void>(llvm::ArrayRef<clang::TemplateArgument>, llvm::ArrayRef<clang::TemplateArgument>, cir::CIRGenTypes::getRecordTypeName[abi:cxx11](clang::RecordDecl const*, llvm::StringRef)::$_0, llvm::raw_svector_ostream) /home/chuanqi.xcq/clangir/llvm/include/llvm/ADT/STLExtras.h:2132:3
#12 0x000000000abc72ce void llvm::interleave<llvm::ArrayRef<clang::TemplateArgument>, cir::CIRGenTypes::getRecordTypeName[abi:cxx11](clang::RecordDecl const*, llvm::StringRef)::$_0, llvm::raw_svector_ostream, clang::TemplateArgument const>(llvm::ArrayRef<clang::TemplateArgument> const&, llvm::raw_svector_ostream&, cir::CIRGenTypes::getRecordTypeName[abi:cxx11](clang::RecordDecl const*, llvm::StringRef)::$_0, llvm::StringRef const&) /home/chuanqi.xcq/clangir/llvm/include/llvm/ADT/STLExtras.h:2154:1
#13 0x000000000abc4301 void llvm::interleaveComma<llvm::ArrayRef<clang::TemplateArgument>, cir::CIRGenTypes::getRecordTypeName[abi:cxx11](clang::RecordDecl const*, llvm::StringRef)::$_0, llvm::raw_svector_ostream, clang::TemplateArgument const>(llvm::ArrayRef<clang::TemplateArgument> const&, llvm::raw_svector_ostream&, cir::CIRGenTypes::getRecordTypeName[abi:cxx11](clang::RecordDecl const*, llvm::StringRef)::$_0) /home/chuanqi.xcq/clangir/llvm/include/llvm/ADT/STLExtras.h:2168:1
#14 0x000000000abc415c cir::CIRGenTypes::getRecordTypeName[abi:cxx11](clang::RecordDecl const*, llvm::StringRef) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenTypes.cpp:78:17
#15 0x000000000abc444e cir::CIRGenTypes::convertRecordDeclType(clang::RecordDecl const*) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenTypes.cpp:193:17
#16 0x000000000abc4b89 cir::CIRGenTypes::ConvertType(clang::QualType) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenTypes.cpp:366:12
#17 0x000000000ac3ec4d (anonymous namespace)::ConstStructBuilder::Finalize(clang::QualType) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:788:37
#18 0x000000000ac369b4 (anonymous namespace)::ConstStructBuilder::BuildStruct(cir::ConstantEmitter&, clang::APValue const&, clang::QualType) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:815:18
#19 0x000000000ac35b07 cir::ConstantEmitter::tryEmitPrivate(clang::APValue const&, clang::QualType) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:1792:12
#20 0x000000000ac34d1d cir::ConstantEmitter::tryEmitPrivateForMemory(clang::APValue const&, clang::QualType) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:1642:12
#21 0x000000000ac3ea9c (anonymous namespace)::ConstStructBuilder::Build(clang::APValue const&, clang::RecordDecl const*, bool, clang::CXXRecordDecl const*, clang::CharUnits) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:764:17
#22 0x000000000ac3e8a7 (anonymous namespace)::ConstStructBuilder::Build(clang::APValue const&, clang::RecordDecl const*, bool, clang::CXXRecordDecl const*, clang::CharUnits) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:736:52
#23 0x000000000ac36972 (anonymous namespace)::ConstStructBuilder::BuildStruct(cir::ConstantEmitter&, clang::APValue const&, clang::QualType) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:812:7
#24 0x000000000ac35b07 cir::ConstantEmitter::tryEmitPrivate(clang::APValue const&, clang::QualType) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:1792:12
#25 0x000000000ac34d1d cir::ConstantEmitter::tryEmitPrivateForMemory(clang::APValue const&, clang::QualType) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:1642:12
#26 0x000000000ac3ea9c (anonymous namespace)::ConstStructBuilder::Build(clang::APValue const&, clang::RecordDecl const*, bool, clang::CXXRecordDecl const*, clang::CharUnits) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:764:17
#27 0x000000000ac36972 (anonymous namespace)::ConstStructBuilder::BuildStruct(cir::ConstantEmitter&, clang::APValue const&, clang::QualType) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:812:7
#28 0x000000000ac35b07 cir::ConstantEmitter::tryEmitPrivate(clang::APValue const&, clang::QualType) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:1792:12
#29 0x000000000ac34d1d cir::ConstantEmitter::tryEmitPrivateForMemory(clang::APValue const&, clang::QualType) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:1642:12
#30 0x000000000ac31776 cir::ConstantEmitter::tryEmitPrivateForVarInit(clang::VarDecl const&) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:1580:12
#31 0x000000000ac31456 cir::ConstantEmitter::tryEmitForInitializer(clang::VarDecl const&) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:1497:23
#32 0x000000000ababbe2 cir::CIRGenModule::buildGlobalVarDefinition(clang::VarDecl const*, bool) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenModule.cpp:981:33
#33 0x000000000aba9185 cir::CIRGenModule::buildGlobalDefinition(clang::GlobalDecl, mlir::Operation*) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenModule.cpp:1184:5
#34 0x000000000aba7f42 cir::CIRGenModule::buildGlobal(clang::GlobalDecl) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenModule.cpp:431:5
#35 0x000000000abadcf0 cir::CIRGenModule::buildTopLevelDecl(clang::Decl*) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenModule.cpp:1393:5
#36 0x000000000aba46c0 cir::CIRGenerator::HandleTopLevelDecl(clang::DeclGroupRef) /home/chuanqi.xcq/clangir/clang/lib/CIR/CodeGen/CIRGenerator.cpp:84:67
#37 0x000000000a975dda cir::CIRGenConsumer::HandleTopLevelDecl(clang::DeclGroupRef) /home/chuanqi.xcq/clangir/clang/lib/CIR/FrontendAction/CIRGenAction.cpp:140:3
#38 0x000000000f02c26d clang::ParseAST(clang::Sema&, bool, bool) /home/chuanqi.xcq/clangir/clang/lib/Parse/ParseAST.cpp:167:11
#39 0x0000000009da9f84 clang::ASTFrontendAction::ExecuteAction() /home/chuanqi.xcq/clangir/clang/lib/Frontend/FrontendAction.cpp:1214:1
#40 0x000000000a974b8f cir::CIRGenAction::ExecuteAction() /home/chuanqi.xcq/clangir/clang/lib/CIR/FrontendAction/CIRGenAction.cpp:407:5
#41 0x0000000009da999c clang::FrontendAction::Execute() /home/chuanqi.xcq/clangir/clang/lib/Frontend/FrontendAction.cpp:1102:7
#42 0x0000000009cc51aa clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /home/chuanqi.xcq/clangir/clang/lib/Frontend/CompilerInstance.cpp:1062:23
#43 0x0000000009f80c52 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /home/chuanqi.xcq/clangir/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:342:8
#44 0x0000000006a7b861 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /home/chuanqi.xcq/clangir/clang/tools/driver/cc1_main.cpp:232:13
#45 0x0000000006a6e1f2 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /home/chuanqi.xcq/clangir/clang/tools/driver/driver.cpp:215:5
#46 0x0000000006a6ec9d clang_main(int, char**, llvm::ToolContext const&)::$_0::operator()(llvm::SmallVectorImpl<char const*>&) const /home/chuanqi.xcq/clangir/clang/tools/driver/driver.cpp:355:7
#47 0x0000000006a6ec6d int llvm::function_ref<int (llvm::SmallVectorImpl<char const*>&)>::callback_fn<clang_main(int, char**, llvm::ToolContext const&)::$_0>(long, llvm::SmallVectorImpl<char const*>&) /home/chuanqi.xcq/clangir/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:5
#48 0x0000000009b55921 llvm::function_ref<int (llvm::SmallVectorImpl<char const*>&)>::operator()(llvm::SmallVectorImpl<char const*>&) const /home/chuanqi.xcq/clangir/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:5
#49 0x0000000009b526c8 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0::operator()() const /home/chuanqi.xcq/clangir/clang/lib/Driver/Job.cpp:440:34
#50 0x0000000009b52695 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0>(long) /home/chuanqi.xcq/clangir/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:5
#51 0x0000000007512ef9 llvm::function_ref<void ()>::operator()() const /home/chuanqi.xcq/clangir/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:5
#52 0x00000000087fcf4a llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /home/chuanqi.xcq/clangir/llvm/lib/Support/CrashRecoveryContext.cpp:427:3
#53 0x0000000009b5202b clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const /home/chuanqi.xcq/clangir/clang/lib/Driver/Job.cpp:440:7
#54 0x0000000009aed53a clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const /home/chuanqi.xcq/clangir/clang/lib/Driver/Compilation.cpp:199:15
#55 0x0000000009aed747 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const /home/chuanqi.xcq/clangir/clang/lib/Driver/Compilation.cpp:253:13
#56 0x0000000009b07e12 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) /home/chuanqi.xcq/clangir/clang/lib/Driver/Driver.cpp:1927:7
#57 0x0000000006a6dccb clang_main(int, char**, llvm::ToolContext const&) /home/chuanqi.xcq/clangir/clang/tools/driver/driver.cpp:391:9
#58 0x0000000006aa03a5 main /home/chuanqi.xcq/clangir/build_debug/tools/clang/tools/driver/clang-driver.cpp:17:3
#59 0x00007ffff766e192 __libc_start_main (/lib64/libc.so.6+0x28192)
#60 0x0000000006a6c9ee _start (/home/chuanqi.xcq/clangir/build_debug/bin/clang+++0x6a6c9ee)
clang++: error: clang frontend command failed with exit code 134 (use -v to see invocation)
clang version 19.0.0git (git@github.com:llvm/clangir.git d2e09591657e61669bc9f6ebcbf0064668a23e27)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/chuanqi.xcq/clangir/build_debug/bin
Build config: +unoptimized, +assertions
clang++: error: unable to execute command: Aborted
clang++: note: diagnostic msg: Error generating preprocessed source(s).

The error leads to:

https://github.com/llvm/clangir/blob/eaae025754b5c58f2ede4787d328d79a0a671b94/clang/lib/CIR/CodeGen/CIRGenTypes.cpp#L73-L74

and it leads to:

https://github.com/llvm/clangir/blob/eaae025754b5c58f2ede4787d328d79a0a671b94/clang/lib/CIR/CodeGen/CIRGenBuilder.h#L457-L460

So it looks like we're going to print the name of type and get the attribute from it. Then I get confused. How can we get the attributes from the name of the type?

Lancern commented 4 months ago

The important difference between CIR and LLVM IR here is how they distinguish different struct types.

In LLVM IR, different llvm::StructType objects represent different struct types. The name of a llvm::StructType is more like a "comment" that makes the assembly more readable. Thus in the original CodeGen we just have to make a "rough" name that does not have to be precise.

On contrary, in CIR, two different mlir::cir::StructType objects can represent the same struct type as long as they have the same name and same kind (see here for how CIR compares two mlir::cir::StructType objects). Thus in CIR we have to invent a precise "mangled" name for each struct type, as the mangled struct names will be used for equality comparison.

ChuanqiXu9 commented 4 months ago

Thanks for explanation. It sounds like an ODR violation to me if we have two different StructType with the same name. What's the reason for the design?

And what's the concern that we don't print the name for other template arguments? Or did we simply not implement it? Also do we have back compatibility requirement for CIR? If not, and I assuming we don't have other consumers for CIR outside the LLVM repo, it looks like it is not so hard to mangle the name.

Lancern commented 4 months ago

It sounds like an ODR violation to me if we have two different StructType with the same name. What's the reason for the design?

Are you asking about why we need to additionally compare the struct kind besides the mangled name, which can already uniquely identify a struct type? I'm not quite sure about this, but the git history shows the relevant part is commited in https://github.com/llvm/clangir/commit/e776625354f08bfada43bc3ef085e1c5a0888222 . Maybe we can ask the author about this :)

And what's the concern that we don't print the name for other template arguments? Or did we simply not implement it?

We simply has not implemented it yet. "NYI" represents "not yet implemented" and in CIR by convention we use llvm_unreachable("NYI") to make clang crash on unimplemented features.

Also do we have back compatibility requirement for CIR?

We'd better confirm with @bcardosolopes about this :)

If not, and I assuming we don't have other consumers for CIR outside the LLVM repo, it looks like it is not so hard to mangle the name.

Agree. IMHO if the name is merely used for distinguishing different struct types we can have easier ways to build the mangled name (with an increasing counter, for example).

ChuanqiXu9 commented 4 months ago

Are you asking about why we need to additionally compare the struct kind besides the mangled name, which can already uniquely identify a struct type?

I mean, in a valid C++ program, it is not allowed to have different struct definitions with the same name. It violates the ODR (One Definition Rule).

bcardosolopes commented 4 months ago

I'll take a more careful look at this tomorrow and reply to all comments. In the meantime, @sitio-couto worked a bunch on this and might have feedback here.

sitio-couto commented 4 months ago

Hey @ChuanqiXu9! I'm unsure if I understood the core issue here, but let me try to answer your questions.

  1. How can we get the attributes from the name of the type? If you want to get the type of a template argument, there is no current way to do this from CIR. You'll need to query the AST. If you want the type of the struct members, you can use the getMembers function defined in StructType.

  2. It sounds like an ODR violation different StructType with the same name. What's the reason for the design? This is not the design. We ensure unique names for struct types of the same kind. The function you pointed out, creates a unique name for each template instance. Also, when lowering to LLVM Dialect, we ensure the Kind attribute is added as a prefix in the LLVM struct name through the getPrefixedName method.

  3. we don't print the name for other template arguments? Or did we simply not implement it? We simply haven't implemented it. Keep in mind that the header you are including (#include <memory>) has several unimplemented features besides the one responsible for the error you found.

  4. Also do we have back compatibility requirement for CIR? I'm not sure I understood this question. Do you mean if CIR aims to compile all legacy C/C++ code? If so, yes.

  5. Regarding mangled struct names. AFAIK, this is a part of ABI-related codegen/lowering that we do not yet address. Using the mangled name as a unique name instead of the approach described in item 2 would be one possibility. We simply haven't addressed this yet.

  6. What is causing this error? We haven't implemented the approach described in item 2 for certain template arguments (packed arguments for example template<ypename ...Args>), so it falls on the default clause. To fix this, you can add the required logic to handle the missing case statement to create the unique name.

Hope this helps!

bcardosolopes commented 4 months ago

Thanks @sitio-couto for the great explanation!

ChuanqiXu9 commented 4 months ago

Thanks for the very clear explanation!

Hey @ChuanqiXu9! I'm unsure if I understood the core issue here, but let me try to answer your questions.

  1. How can we get the attributes from the name of the type? If you want to get the type of a template argument, there is no current way to do this from CIR. You'll need to query the AST. If you want the type of the struct members, you can use the getMembers function defined in StructType.
  2. It sounds like an ODR violation different StructType with the same name. What's the reason for the design? This is not the design. We ensure unique names for struct types of the same kind. The function you pointed out, creates a unique name for each template instance. Also, when lowering to LLVM Dialect, we ensure the Kind attribute is added as a prefix in the LLVM struct name through the getPrefixedName method.
  3. we don't print the name for other template arguments? Or did we simply not implement it? We simply haven't implemented it. Keep in mind that the header you are including (#include <memory>) has several unimplemented features besides the one responsible for the error you found.
  4. Also do we have back compatibility requirement for CIR? I'm not sure I understood this question. Do you mean if CIR aims to compile all legacy C/C++ code? If so, yes.

I mean, if the CIR produced by clang version X, can be consumed by clang version Y, given Y > X. If yes, it has backward compatibility. The examples have backward compatibility may be the DWARF information and the Itanium CXX ABI. The examples don't have backward compatibility may be the LLVM IR and C++ Modules.

  1. Regarding mangled struct names. AFAIK, this is a part of ABI-related codegen/lowering that we do not yet address. Using the mangled name as a unique name instead of the approach described in item 2 would be one possibility. We simply haven't addressed this yet.

Interesting. I didn't thought the mangled name is different from the unique name.

BTW, it sounds slightly odd to me that we mangle the struct names due to ABI reasons in CIR level. Since I think CIR is a C++ specific IR and so we should preserve more C++ information as much as possible. Then we can mangle it in the phase of Lowering.

  1. What is causing this error? We haven't implemented the approach described in item 2 for certain template arguments (packed arguments for example template<ypename ...Args>), so it falls on the default clause. To fix this, you can add the required logic to handle the missing case statement to create the unique name.

Yeah, I did that locally and passes this error. (And meet more errors, as expected : )) I am still in the phase experiencing CIR.


BTW, since I'd like to proposing analysis only CIR to make it usable quickly. I find the existing NYI mechanism is way too strict for analysis only purpose. It makes sense to boil out if we're going to generating an executable and we're not done. But if we're only going to analysis something, it looks not necessary to boil out as long as the missing piece won't cause the analysis pass to produce incorrect results.

For example, whether or not we generate the initializer for dynamic TLS shouldn't matter if we only want to analysis the code, so I'd make such changes:

    if (D->getTLSKind()) {
+     if (!isAnalysisOnly() && 
           D->getTLSKind() == VarDecl::TLS_Dynamic)
        llvm_unreachable("NYI");
      setTLSMode(GV, *D);
    }

What do you think about this? @bcardosolopes @sitio-couto @Lancern

Hope this helps!

You are welcome!

bcardosolopes commented 4 months ago

I mean, if the CIR produced by clang version X, can be consumed by clang version Y, given Y > X. If yes, it has backward compatibility. The examples have backward compatibility may be the DWARF information and the Itanium CXX ABI. The examples don't have backward compatibility may be the LLVM IR and C++ Modules.

We have no plans for backward compatibility yet, it's too early for that. It's a possibility once CIR is mature enough, but right now it's evolving fast and won't bring any benefits to any current users.

BTW, it sounds slightly odd to me that we mangle the struct names due to ABI reasons in CIR level.

Turns out mangling is an integral part of how C++ works, and pretty much how a lot of things are encoded (e.g. multiple compiler syntethized default ctors, etc). Using mangled names greatly improve symbol lookup in MLIR and make it convenient to understand common post ABI decisions that are usually implicitly encoded. Using "real" names instead of mangling poses some interesting challenge because in some way or another we would have to reinvent part of the wheel to deal with namespaces, and other things that would differentiate one symbol from the other - it's not trivial as one might think.

It doesn't feel odd to me, but you are free to have opinions.

Since I think CIR is a C++ specific IR and so we should preserve more C++ information as much as possible. Then we can mangle it in the phase of Lowering.

Backreferences to the AST could be available whenever someone wants more "raw" information.

BTW, since I'd like to proposing analysis only CIR to make it usable quickly.

Defining "usable quickly" is also tricky, the project is at a state that unless you put effort to implement the things you need, it won't be useful.

I find the existing NYI mechanism is way too strict for analysis only purpose. It makes sense to boil out if we're going to generating an executable and we're not done.

This is also not as simple as it seems. One example is when we don't yet generate a CIR operation for a given set of AST nodes, a given analysis could be completely wrong because we are basically deleting code it should be looking at. My experience with linting at Meta tells me that inaccuracy is the main cause people turn checks off or start ignoring them. I rather we be more conservative in the short term, while aiming at more reliability in the future.

But if we're only going to analysis something, it looks not necessary to boil out as long as the missing piece won't cause the analysis pass to produce incorrect results. ... For example, whether or not we generate the initializer for dynamic TLS shouldn't matter if we only want to analysis the code

How do you even know if the missing pieces are not going to produce incorrect results? On a case basis looks like shortcut to me. It also makes an analysis hard to debug later, because we don't even know where to start debugging.

On the TLS case, it might not be relevant to you, but it's possible someone is writing a thread-like analysis that could benefit from this information. We are not interested in providing half baked tools to the community, we'd prefer that interested users put their time to make things work properly instead of taking shortcuts.

What do you think about this?

I don't feel comfortable with this approach. I'd find reasonable here would be to abstract all NYI into a helper function that as part of its implementation could optionally return early and never assert. However, I'd be opposed to turning that on for analysis-only because it will be basically bad PR for the quality of analysis results we can give. I rather we take longer to have something usable than having something usable but wrong that might scare people away from trying it again. Nothing should prevent you from turning it ON in your downstream version of clangir though.

ChuanqiXu9 commented 4 months ago

Thanks for the clear response. Yeah, it is find to extract the NYI things into a configurable function. At least it won't cause a lots of scaring crashes.