Open ChuanqiXu9 opened 1 year ago
@llvm/issue-subscribers-clang-modules
@ChuanqiXu9 I may be hitting this. I've got some files taking around 45 seconds to compile and it's killing me. I wonder if there is a temporary workaround that I could apply to LLVM source locally, without the more well-thought-out fix that you will need to apply. I'm interested in compilers and hacking on Clang, so I could try it. Can you say any more about what you found in ASTWriter::WriteDecl
that might be the problem?
It could help to upload a minimal reproducer and a flame graph. The most significant difference when importing a module is that clang::FindExternalVisibleDeclsByName
gets called which seems to be slow with big modules.
@ChuanqiXu9 already linked a thread with further info: https://discourse.llvm.org/t/modules-increased-build-times/68755. I only have files that take 5 seconds longer to compile and my reproducer is far from minimal, so a better reproducer could help.
@rnikander if you want to hack on it, you could try to modify: https://github.com/llvm/llvm-project/blob/e58a49300e757ff61142f6abd227bd1437c1cf87/clang/lib/Serialization/ASTWriterDecl.cpp#L2490-L2532
@rnikander if you want to hack on it, you could try to modify:
This might not be related to the problem.
@ChuanqiXu9 I may be hitting this. I've got some files taking around 45 seconds to compile and it's killing me. I wonder if there is a temporary workaround that I could apply to LLVM source locally, without the more well-thought-out fix that you will need to apply. I'm interested in compilers and hacking on Clang, so I could try it. Can you say any more about what you found in
ASTWriter::WriteDecl
that might be the problem?
I didn't find a quick solution/workaround for this. The direct cause for the problem is that for this code style:
// a.cppm
module;
#include <iostream>
export module;
import another; // another contains many decls <iostream>
When we merge the declarations from different modules, the declarations in this TU (a.cppm) will be the canonical declarations and the declarations from the imported modules will be the redeclarations. So we have to re-write the declarations in this TU. Also when we write declarations in this TU, we'll try to lookup the decls from the external sources (the imported modules), for example: https://github.com/llvm/llvm-project/blob/edc8b602f996a1fc68c28054c81e8fb671bb874b/clang/lib/Serialization/ASTWriter.cpp#L3960-L3965
This is what I found now. And as you see, it is pretty fundamental so I am not sure if there is a simple workaround.
As a dirty (and absolutely incorrect) verifier, you can insert the following code to: https://github.com/llvm/llvm-project/blob/edc8b602f996a1fc68c28054c81e8fb671bb874b/clang/lib/Serialization/ASTWriter.cpp#L5578
for (auto *RD : D->redecls()) {
if (RD == D)
continue;
if (RD->isFromASTFile()) {
DeclIDs[D] = RD->getGlobalID();
return DeclIDs[D];
}
}
Note that I don't think this is the correct direction. I think we need to do something in the lookup process if possible.
@rnikander and for the problem of slow compilation, I am wondering if some of the reason comes from the one phase compilation. For one phase compilation and two phase compilation, you can look this at: https://clang.llvm.org/docs/StandardCPlusPlusModules.html#how-to-produce-a-bmi and https://gitlab.kitware.com/cmake/cmake/-/issues/18355#note_1329192. And cmake choose to implement one phase model initially for the sake of compatibility. But two phase compilation may be faster locally. If the scale of your project isn't too large, you can try to implement the 2 phase compilation model by cmake scripts. Then we will have a better feeling for this.
Okay, thank you. I'll dig around to try to understand better. I'd like to work on Clang but realistically I probably should focus on my projects.
So I guess this process of merging declarations from modules, which you mentioned, has to occur even with the declarations in the another
module that are private to it, like the #include <iostream>
in it's global module fragment?
I can't compile anything at the moment because main
branch is segfaulting (another issue, someone already reported). I'll look again tomorrow.
I'd like to work on Clang but realistically I probably should focus on my projects.
Never mind. This is what open source is : )
So I guess this process of merging declarations from modules, which you mentioned, has to occur even with the declarations in the another module that are private to it, like the #include
in it's global module fragment?
The deserialization and visibility checking are two related but different process. The things in global module fragment may be reachable so we have to deserialize them now.
I sent cf47e9f to mitigate the problem. The patch changes the writing BMI time of test.cppm from 5s to 3s in my environment. But 3s is still longer than I expected. So I'll keep this openning.
Note for developers: the root cause for the issue is that when we reconcile the declcontext redundantly many times. See the last section of ASTReader::loadDeclUpdateRecords
. We will reconcile the decl context completely instead of by need, which may be a major reason for this issue.
With -fexperimental-modules-reduced-bmi
, the compile time for https://github.com/koplas/clang-modules-test goes down to 2.685s from 3.486s my local machine. So it can mitigate the problem a little bit but still not ideal.
I feel the problem may be somewhat pretty hard to fix (or optimize) in clang. Possibly we can only teach users don't do this. e.g. https://clang.llvm.org/docs/StandardCPlusPlusModules.html#performance-tips
(Maybe we don't see so many issues if we're born within a world where modules came in day 1.)
CC @koplas
Oh, I took another look at the example,
module;
#include <iostream>
export module test;
import vulkan;
export {
void hello() {
std::cout << "hello\n";
}
}
Since we don't want to write the function body for non-inline function, we shouldn't have so many work in ASTWriter for sure. Although we can't solve the underlying issue, we can optimize the specific example.
(It means it will be slow too if the function is inline)
Thank you @ChuanqiXu9 for the progress that has been made.
My problem lies in the usage of third-party libraries, that are used like this:
module;
#include "big_header.hpp"
export module third_party;
export {
using decl_1;
using decl_2;
...
}
The export-extern-style would certainly fix the performance issue for me, but it would hard to port all third-party libraries to this style. Even when std modules and https://github.com/llvm/llvm-project/issues/80663 reduce the impact, it can be a major performance hurdle for new module users. To be able to import a lot of big third-party libraries without a big performance impact would be the biggest C++ modules selling point for me. I wouldn't mind if the ASTWriter is still slow for the inline function case.
Yeah, this is more of the topic of how to transfer a header based library to a module based one. And the biggest challenge I saw is how to handle third party libraries. Then it is somewhat a annoying issue. We feel it will be easy if all the third party libraries provide the modules. Note that this definition is recursive. For example, if someday boost provides a modules, ideally, it will be best if it imports std instead of wrapping itself like std module. The std module is special since it is the root of dependencies. Another example is, now it may be fine to write a new module-native library only depends on std module.
But the bad problem is, how can we treat the third party libraries not offering modules? This is annoying since we're doing other people's job. Ideally, it should be the job of other libraries to provide such thing nicely. But now we need to worry about that.
Then it becomes more of an ecosystem problem.
A bit of a question of understanding. Consider this:
module;
#include <iostream>
export module test;
export {
void hello() {}
}
The resulting pcm file compiled with reduced bmi is 3,4 Megabytes big. If I add import vulkan;
it increases to 5,5 Megabytes. Shouldn't both files be relatively small, as they use nothing from the GMF?
Update: Without reduced bmi the respective file size is 4,5 and 7,4 Megabytes. The performance improvement seems to be linear to the file size.
The resulting pcm file compiled with reduced bmi is 3,4 Megabytes big. If I add import vulkan; it increases to 5,5 Megabytes. Shouldn't both files be relatively small, as they use nothing from the GMF?
This is a problem of quality of implementation. The reduced BMI is still in the very early stage. We're still optimizing it. And this the problem I said why reduced BMI doesn't work. Since we don't write entities in non-inline function bodies, we simply shouldn't write any thing.
I just make a new patch to improve this. After this patch, with reduced BMI, the result of https://github.com/koplas/clang-modules-test goes to:
Building test.cppm
real 0m0.494s
user 0m0.450s
sys 0m0.037s
Building test-without-import.cppm
real 0m0.466s
user 0m0.430s
sys 0m0.033s
The number looks good and we're probably near the extreme. import <module-name>
is not a free operation.
But I still want to emphasize that, it doesn't fix the fundamental issue. It will still be slow if we're writing a declaration which has a lot redeclarations in the imported modules. In another word, in the example of https://github.com/koplas/clang-modules-test, if every partition unit of module vulkan uses declarations from <iostream>
extensively (not in a non-inline function body, which we just skipped), the writing performance should be pretty slow still. So I don't mind keep the issue opening.
Also from my testing, it may not not a big herotic optimization to real world examples, https://github.com/alibaba/async_simple/tree/CXX20Modules (I tried to make it modules native), I only observed 1%+ size reduction and rough 2%+ compile speed improvements.
I just tested it on my toy project, and the time spent in the frontend is now twice as large as the time spent in the ASTWriter. So I consider it a significant improvement, I guess a switch to std modules would reduce the remaining performance impact.
Currently one file crashes with -fexperimental-modules-reduced-bmi
:
/usr/local/bin/clang++ -I/home/paul/Documents/lime-engine/third_party/include -I/home/paul/Documents/lime-engine/third_party/glm -I/home/paul/Documents/lime-engine/third_party/glfw/include -stdlib=libc++ -Wall -ftime-trace -fno-rtti -mavx2 -fno-exceptions -fexperimental-modules-reduced-bmi -fomit-frame-pointer -fno-exceptions -fno-asynchronous-unwind-tables -Wno-nullability-completeness -ferror-limit=0 -g -std=gnu++20 -fcolor-diagnostics -MD -MT CMakeFiles/lime-engine.dir/src/app.ixx.o -MF CMakeFiles/lime-engine.dir/src/app.ixx.o.d @CMakeFiles/lime-engine.dir/src/app.ixx.o.modmap -o CMakeFiles/lime-engine.dir/src/app.ixx.o -c /home/paul/Documents/lime-engine/src/app.ixx
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /usr/local/bin/clang++ -I/home/paul/Documents/lime-engine/third_party/include -I/home/paul/Documents/lime-engine/third_party/glm -I/home/paul/Documents/lime-engine/third_party/glfw/include -stdlib=libc++ -Wall -ftime-trace -fno-rtti -mavx2 -fno-exceptions -fexperimental-modules-reduced-bmi -fomit-frame-pointer -fno-exceptions -fno-asynchronous-unwind-tables -Wno-nullability-completeness -ferror-limit=0 -g -std=gnu++20 -fcolor-diagnostics -MD -MT CMakeFiles/lime-engine.dir/src/app.ixx.o -MF CMakeFiles/lime-engine.dir/src/app.ixx.o.d @CMakeFiles/lime-engine.dir/src/app.ixx.o.modmap -o CMakeFiles/lime-engine.dir/src/app.ixx.o -c /home/paul/Documents/lime-engine/src/app.ixx
1. <eof> parser at end of file
2. Per-file LLVM IR generation
3. /usr/local/bin/../include/c++/v1/new:268:29: Generating code for declaration 'std::__libcpp_operator_new'
#0 0x00005599ecbb4496 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/local/bin/clang+++0x3025496)
#1 0x00005599ecbb207e llvm::sys::RunSignalHandlers() (/usr/local/bin/clang+++0x302307e)
#2 0x00005599ecb23450 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
#3 0x00007faf8d65a770 (/usr/lib/libc.so.6+0x40770)
#4 0x00005599ec6f1af4 llvm::Value::getContext() const (/usr/local/bin/clang+++0x2b62af4)
#5 0x00005599ec68275c llvm::StoreInst::StoreInst(llvm::Value*, llvm::Value*, bool, llvm::Align, llvm::AtomicOrdering, unsigned char, llvm::Instruction*) (/usr/local/bin/clang+++0x2af375c)
#6 0x00005599ec6826cf llvm::StoreInst::StoreInst(llvm::Value*, llvm::Value*, bool, llvm::Align, llvm::Instruction*) (/usr/local/bin/clang+++0x2af36cf)
#7 0x00005599eaacd3be llvm::IRBuilderBase::CreateAlignedStore(llvm::Value*, llvm::Value*, llvm::MaybeAlign, bool) (/usr/local/bin/clang+++0xf3e3be)
#8 0x00005599ece029d5 clang::CodeGen::CodeGenFunction::EmitReturnStmt(clang::ReturnStmt const&) (/usr/local/bin/clang+++0x32739d5)
#9 0x00005599ece0b360 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang+++0x327c360)
#10 0x00005599ece65d06 clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/usr/local/bin/clang+++0x32d6d06)
#11 0x00005599ece889af clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/local/bin/clang+++0x32f99af)
#12 0x00005599ece817a1 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/local/bin/clang+++0x32f27a1)
#13 0x00005599ece74dc3 clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5dc3)
#14 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#15 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#16 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#17 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#18 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#19 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#20 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#21 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#22 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#23 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#24 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#25 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#26 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#27 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#28 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#29 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#30 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf)
#31 0x00005599ece7259b clang::CodeGen::CodeGenModule::Release() (/usr/local/bin/clang+++0x32e359b)
#32 0x00005599ed25fc18 (anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit(clang::ASTContext&) ModuleBuilder.cpp:0:0
#33 0x00005599ed25700b clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/usr/local/bin/clang+++0x36c800b)
#34 0x00005599ed53f7bc clang::MultiplexConsumer::HandleTranslationUnit(clang::ASTContext&) (/usr/local/bin/clang+++0x39b07bc)
#35 0x00005599eed04549 clang::ParseAST(clang::Sema&, bool, bool) (/usr/local/bin/clang+++0x5175549)
#36 0x00005599ed4fbf14 clang::FrontendAction::Execute() (/usr/local/bin/clang+++0x396cf14)
#37 0x00005599ed464854 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/local/bin/clang+++0x38d5854)
#38 0x00005599ed5c8d50 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/local/bin/clang+++0x3a39d50)
#39 0x00005599eaa81f21 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/local/bin/clang+++0xef2f21)
#40 0x00005599eaa7f1d0 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#41 0x00005599ed2a6b29 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0>(long) Job.cpp:0:0
#42 0x00005599ecb231ec llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/usr/local/bin/clang+++0x2f941ec)
#43 0x00005599ed2a64c3 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (/usr/local/bin/clang+++0x37174c3)
#44 0x00005599ed2646a2 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/usr/local/bin/clang+++0x36d56a2)
#45 0x00005599ed2648fe clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/usr/local/bin/clang+++0x36d58fe)
#46 0x00005599ed28402f clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/usr/local/bin/clang+++0x36f502f)
#47 0x00005599eaa7e857 clang_main(int, char**, llvm::ToolContext const&) (/usr/local/bin/clang+++0xeef857)
#48 0x00005599eaa8d486 main (/usr/local/bin/clang+++0xefe486)
#49 0x00007faf8d643cd0 (/usr/lib/libc.so.6+0x29cd0)
#50 0x00007faf8d643d8a __libc_start_main (/usr/lib/libc.so.6+0x29d8a)
#51 0x00005599eaa7cc75 _start (/usr/local/bin/clang+++0xeedc75)
The file looks like this:
module;
export module app;
import lime.Game;
export int app_main() {
lime::Game game = {};
return 0;
}
If I use lime::Game *game = new lime::Game{};
the crash doesn't occur.
I just tested it on my toy project, and the time spent in the frontend is now twice as large as the time spent in the ASTWriter. So I consider it a significant improvement, I guess a switch to std modules would reduce the remaining performance impact. Currently one file crashes with
-fexperimental-modules-reduced-bmi
:/usr/local/bin/clang++ -I/home/paul/Documents/lime-engine/third_party/include -I/home/paul/Documents/lime-engine/third_party/glm -I/home/paul/Documents/lime-engine/third_party/glfw/include -stdlib=libc++ -Wall -ftime-trace -fno-rtti -mavx2 -fno-exceptions -fexperimental-modules-reduced-bmi -fomit-frame-pointer -fno-exceptions -fno-asynchronous-unwind-tables -Wno-nullability-completeness -ferror-limit=0 -g -std=gnu++20 -fcolor-diagnostics -MD -MT CMakeFiles/lime-engine.dir/src/app.ixx.o -MF CMakeFiles/lime-engine.dir/src/app.ixx.o.d @CMakeFiles/lime-engine.dir/src/app.ixx.o.modmap -o CMakeFiles/lime-engine.dir/src/app.ixx.o -c /home/paul/Documents/lime-engine/src/app.ixx PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: /usr/local/bin/clang++ -I/home/paul/Documents/lime-engine/third_party/include -I/home/paul/Documents/lime-engine/third_party/glm -I/home/paul/Documents/lime-engine/third_party/glfw/include -stdlib=libc++ -Wall -ftime-trace -fno-rtti -mavx2 -fno-exceptions -fexperimental-modules-reduced-bmi -fomit-frame-pointer -fno-exceptions -fno-asynchronous-unwind-tables -Wno-nullability-completeness -ferror-limit=0 -g -std=gnu++20 -fcolor-diagnostics -MD -MT CMakeFiles/lime-engine.dir/src/app.ixx.o -MF CMakeFiles/lime-engine.dir/src/app.ixx.o.d @CMakeFiles/lime-engine.dir/src/app.ixx.o.modmap -o CMakeFiles/lime-engine.dir/src/app.ixx.o -c /home/paul/Documents/lime-engine/src/app.ixx 1. <eof> parser at end of file 2. Per-file LLVM IR generation 3. /usr/local/bin/../include/c++/v1/new:268:29: Generating code for declaration 'std::__libcpp_operator_new' #0 0x00005599ecbb4496 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/local/bin/clang+++0x3025496) #1 0x00005599ecbb207e llvm::sys::RunSignalHandlers() (/usr/local/bin/clang+++0x302307e) #2 0x00005599ecb23450 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0 #3 0x00007faf8d65a770 (/usr/lib/libc.so.6+0x40770) #4 0x00005599ec6f1af4 llvm::Value::getContext() const (/usr/local/bin/clang+++0x2b62af4) #5 0x00005599ec68275c llvm::StoreInst::StoreInst(llvm::Value*, llvm::Value*, bool, llvm::Align, llvm::AtomicOrdering, unsigned char, llvm::Instruction*) (/usr/local/bin/clang+++0x2af375c) #6 0x00005599ec6826cf llvm::StoreInst::StoreInst(llvm::Value*, llvm::Value*, bool, llvm::Align, llvm::Instruction*) (/usr/local/bin/clang+++0x2af36cf) #7 0x00005599eaacd3be llvm::IRBuilderBase::CreateAlignedStore(llvm::Value*, llvm::Value*, llvm::MaybeAlign, bool) (/usr/local/bin/clang+++0xf3e3be) #8 0x00005599ece029d5 clang::CodeGen::CodeGenFunction::EmitReturnStmt(clang::ReturnStmt const&) (/usr/local/bin/clang+++0x32739d5) #9 0x00005599ece0b360 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang+++0x327c360) #10 0x00005599ece65d06 clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/usr/local/bin/clang+++0x32d6d06) #11 0x00005599ece889af clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/local/bin/clang+++0x32f99af) #12 0x00005599ece817a1 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/local/bin/clang+++0x32f27a1) #13 0x00005599ece74dc3 clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5dc3) #14 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #15 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #16 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #17 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #18 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #19 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #20 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #21 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #22 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #23 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #24 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #25 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #26 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #27 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #28 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #29 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #30 0x00005599ece74ddf clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang+++0x32e5ddf) #31 0x00005599ece7259b clang::CodeGen::CodeGenModule::Release() (/usr/local/bin/clang+++0x32e359b) #32 0x00005599ed25fc18 (anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit(clang::ASTContext&) ModuleBuilder.cpp:0:0 #33 0x00005599ed25700b clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/usr/local/bin/clang+++0x36c800b) #34 0x00005599ed53f7bc clang::MultiplexConsumer::HandleTranslationUnit(clang::ASTContext&) (/usr/local/bin/clang+++0x39b07bc) #35 0x00005599eed04549 clang::ParseAST(clang::Sema&, bool, bool) (/usr/local/bin/clang+++0x5175549) #36 0x00005599ed4fbf14 clang::FrontendAction::Execute() (/usr/local/bin/clang+++0x396cf14) #37 0x00005599ed464854 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/local/bin/clang+++0x38d5854) #38 0x00005599ed5c8d50 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/local/bin/clang+++0x3a39d50) #39 0x00005599eaa81f21 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/local/bin/clang+++0xef2f21) #40 0x00005599eaa7f1d0 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0 #41 0x00005599ed2a6b29 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0>(long) Job.cpp:0:0 #42 0x00005599ecb231ec llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/usr/local/bin/clang+++0x2f941ec) #43 0x00005599ed2a64c3 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (/usr/local/bin/clang+++0x37174c3) #44 0x00005599ed2646a2 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/usr/local/bin/clang+++0x36d56a2) #45 0x00005599ed2648fe clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/usr/local/bin/clang+++0x36d58fe) #46 0x00005599ed28402f clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/usr/local/bin/clang+++0x36f502f) #47 0x00005599eaa7e857 clang_main(int, char**, llvm::ToolContext const&) (/usr/local/bin/clang+++0xeef857) #48 0x00005599eaa8d486 main (/usr/local/bin/clang+++0xefe486) #49 0x00007faf8d643cd0 (/usr/lib/libc.so.6+0x29cd0) #50 0x00007faf8d643d8a __libc_start_main (/usr/lib/libc.so.6+0x29d8a) #51 0x00005599eaa7cc75 _start (/usr/local/bin/clang+++0xeedc75)
The file looks like this:
module; export module app; import lime.Game; export int app_main() { lime::Game game = {}; return 0; }
If I use
lime::Game *game = new lime::Game{};
the crash doesn't occur.
Thanks for testing this. Could you try to give a reproduced example for this crash? Then I can only actually fix it.
BTW, what's the overall end-to-end improvement this change brings to your project?
BTW, what's the overall end-to-end improvement this change brings to your project?
Using -fexperimental-modules-reduced-bmi
reduced the compile time of a single file from 2.94s to 2.45s. The WriteAST reduced from 1.2s to 0.87s. The PCM file size decreased from 20 MB to 16.3 MB. The reduction is similar to other module files. A large amount of time is spent on the frontend with std headers, which could be reduced with std modules.
@rnikander Maybe you can provide some benchmarks, as you had some files that had far worse compile times.
Thanks for testing this. Could you try to give a reproduced example for this crash? Then I can only actually fix it.
I could reduce the lime.Game module to this:
module;
export module lime.Game;
import lime.Render;
import entt;
export namespace lime {
class Game {
private:
entt::registry m_registry;
};
}// namespace lime
However lime.Render
and entt
are quite large so a minimal reproducer would take some time to create.
I just tested std modules with reduced bmi. Here is the trace without std modules: And here with std modules: This looks like the performance I expected from modules. As I no longer see an exponential compile time increase the issue is solved for me. The crash also doesn't occur if I link with std modules.
However lime.Render and entt are quite large so a minimal reproducer would take some time to create.
Thanks. It will be pretty helpful to improve the quality of the implementation.
Just wanted to add another data point here after my first attempt at using modules. Briefly, I've modularized a library which is predominantly headers, and template-heavy. My initial approach has been to transform headers into interface partitions pretty much 1:1, with just a small number of distinct top-level modules. At this point I'm not using import std
, so I get that it's far from ideal as there is heavy duplication of STL includes in GMFs.
I'm also seeing some very slow compilation times (for the partition TUs themselves, the few implementation units I have, and most of all for downstream consuming TUs), and it is heavily weighted to WriteAST
as mentioned above.
The one thing I've noticed and am wondering if is on anyone's radar is memory consumption. From some rough eyeballing of processes, TUs for which clang++ caps out at around 400MB RAM usage in non-modular builds, are pushing up around 3GB in the modules build. One particular TU goes up to 6GB. I believe this is exacerbating the time spent in WriteAST
in the case of parallel builds due to memory paging - occasionally I've seen really absurd times (10s of minutes, all falling within WriteAST
) when the memory contention has been particularly bad.
Note that -fexperimental-modules-reduced-bmi
definitely mitigates this - they're generally capping out around 1.8GB (vs 3GB), and the absurdly long compiles disappear. However this is still a number of times higher memory usage than without modules, and compilation times, while better than with the fat BMI, are still problematic.
With using modules, it is expected some more memory usages. But if it is a problem depends on the numbers. So my comment in the other issue you opened, the key idea here is to reduce the redeclarations. So it is pretty important to import third party libraries by import std
instead of including them everywhere. And -fexperimental-modules-reduced-bmi
will be helpful too.
We can find the reproducer at: https://discourse.llvm.org/t/modules-increased-build-times/68755/7
For short:
will take 3~4s to generate the PCM.
And
will take about 0.5s to complete only.
Note that:
will take 0.5s to complete too.
Currently I located in
ASTWriter::WriteDecl
. But I am not sure if this can be fixed simply.