Open loumalouomega opened 2 years ago
Ping @roigcarlo
HI @loumalouomega
We acknowledge this is a problem but still we are unsure how yo tackle it. All the solutions you propose are effective towards reducing the compilation time, but not all of them can be applied freely in Kratos right now.
This ( in my opinion ) is where we can get more improvement, but we cannot blindly start to move all classes to cpp because we have some inter-application dependencies ( specially at template level ) where some compilation units would require to have definition of function instantiated for classes that do not exist in the moment of compilation. Example:
Compilation unit 1
// Header
class MyClass<class TParam> {
TParam myTemplatedFunction();
}
// Source
template <T> T MyClass::myTemplatedFunction() { return T() }
// Source
Compilation unit 2 ( that has a linking dependency on CU1 but does not have access to "MyClass" cpp )
// Header ( or source, problem is the same )
#include "MyClass.h"
struct Break {};
class MySecondClass {
Break ErrorFunction() { return MyClass<Break>().myTemplatedFunction()}
}
In this scenario is impossible to use MyClass
outside its compilation unit, and its compilation unit is not enough to fully define without including cyclic dependencies.
In general in Kratos this is the case of some Element
instances.
In conclusion we have to be very careful with this kind of structures so it does not break anything (or worst, makes it unusable if at some point someone needs that)
I recall trying this the first time we used unitary builds (with cotire) and the effect was minimal, and we always ended up having to recompile the PCH's every time a change was made, probably because a big chunk of our code is in headers. If we manage to correctly port it to cpp, no problems against it.
Thumbs up for this, as long as you manage to identify the redundancies and resolve the dependencies of files that incorrectly include a "hub" include because they don't know what they need.
HI @loumalouomega
We acknowledge this is a problem but still we are unsure how yo tackle it. All the solutions you propose are effective towards reducing the compilation time, but not all of them can be applied freely in Kratos right now.
Moving to cpp:
This ( in my opinion ) is where we can get more improvement, but we cannot blindly start to move all classes to cpp because we have some inter-application dependencies ( specially at template level ) where some compilation units would require to have definition of function instantiated for classes that do not exist in the moment of compilation. Example:
Compilation unit 1
// Header class MyClass<class TParam> { TParam myTemplatedFunction(); } // Source template <T> T MyClass::myTemplatedFunction() { return T() } // Source
Compilation unit 2 ( that has a linking dependency on CU1 but does not have access to "MyClass" cpp )
// Header ( or source, problem is the same ) #include "MyClass.h" struct Break {}; class MySecondClass { Break ErrorFunction() { return MyClass<Break>().myTemplatedFunction()} }
In this scenario is impossible to use
MyClass
outside its compilation unit, and its compilation unit is not enough to fully define without including cyclic dependencies.In general in Kratos this is the case of some
Element
instances. In conclusion we have to be very careful with this kind of structures so it does not break anything (or worst, makes it unusable if at some point someone needs that)
Yes I know that some templated classes like geometries cannot be moved to cpp, because the idea is to be ablke to use it with any arbitrary type of point.
Moving to cpp:
We could do something similar to forward declarations. It may be a bit overly convoluted, but we could organize templates like this:
=> by default, we'd only include the declarations. => when we need to instantiate the template with a specific type, we create a source file that includes both the declaration and the implementation, then define an explicit specialization for the desired type. This would make the template available for explicitly specialized types in all translation units (linked ones too).
Here's your example rewritten:
my_class.h
template <class TParam>
class MyClass {
TParam myTemplatedFunction();
};
my_class_impl.h
template <class TParam>
TParam MyClass<TParam>::myTemplatedFunction() {
// implementation goes here
}
Now let's say we want to use MyClass<int>
somewhere:
some_application.h
#include "my_class.h"
void doSomethingWithMyInt(MyClass<int> intInstance);
some_application.cpp
void doSomethingWithMyInt(MyClass<int> intInstance) {
intInstance.myTemplatedFunction();
}
At this point, using some_application.cpp
would result in a linker error because the compiler can't find the definition of MyClass<int>::myTemplatedFunction
since my_class_impl.hpp
is not included. However, we don't want to include my_class_impl.hpp
because it's too bulky, so we create a separate source file instead where we declare an explicit specialization with int
.
my_class_int.cpp
#include "my_class.h"
#include "my_class_impl.h"
template <> MyClass<int>;
This will force the compiler to generate the implementation for MyClass<int>
, and it will become available in other translation units as well. We'll still get linker errors if we try using MyClass
with other template parameters, but we could just do the same thing for every type we want to use MyClass
with.
See the implementation in #10052 for another example. The only difference there is that the implementation is in the source file instead of a separate implementation header, because I knew all the types these function templates will be used with in advance.
This trick comes in very handy if there's only a handful of types we want to use our templates with. It's obviously not useful for truly generic templates.
Dear @loumalouomega ,
we were discussing this (very important) topic within the @KratosMultiphysics/technical-committee
Aside for the concern expressed by @roigcarlo, moving stuff away from the headers may have important performance implications as it is "easier" to inline stuff in header than stuff in cpps.
The c++ forward looking solution to this would be "modules" (yet not widely supported)
see e.g.
https://docs.microsoft.com/en-us/cpp/cpp/modules-cpp?view=msvc-170
our impression is that it is NOT wise to start a wide movement of migrating stuff to cpps but rather to
for these reasons we are hesitant about changes in foundational classes (for example the data bases). Rather, we would look into changing classes that are not foundational, or even better, on cleaning up the header dependencies.
@RiccardoRossi Inlining is important, but I see a lot of implementations that don't really justify it. There are lots of class definitions that are fully implemented in headers even though they only have one or two inline
-worthy functions (just take a look at almost any process. For example, I guarantee that the constructors of ApplyConstantVectorValueProcess will never be inlined by the compiler because they're just "too long").
My hunch is that people sometimes don't want to create a separate source file because of all the boilerplate involved, so they just define everything in headers.
@RiccardoRossi Inlining is important, but I see a lot of implementations that don't really justify it. There are lots of class definitions that are fully implemented in headers even though they only have one or two
inline
-worthy functions (just take a look at almost any process. For example, I guarantee that the constructors of ApplyConstantVectorValueProcess will never be inlined by the compiler because they're just too "long").My hunch is that people sometimes don't want to create a separate source file because of all the boilerplate involved, so they just define everything in headers.
That's my impression too
@matekelemen it may be true that boilerplate has a role, nevertheless there are open PRs about chaning foundational (and performance sensitive) parts of the code...we are NOT positive about those...
if you really want to start a push in this direction, we suggest to start from other, less foundational, parts of the code, or by looking into the included includes
3. Also splitting in cpp and header implies essentially losing the repo history, something we don't really love
I understand what you say, but precisely these classes are so important that are heavily included and the penalty is not insignificant. I think we must weight what do we prefer, and in certain cases the winner will be the compilation time when this is significative.
The other two points of course, but the PR you are referring is not possible to refactor inclussions, because is not very much included, but is included in the node.h, being indirectly included in many places.
One different approach would be to remove the template from node.h as I mentioned in other issue and instantiate the node, so it will be precompiled, and the weight of that header will be not so problematic.
@matekelemen it may be true that boilerplate has a role, nevertheless there are open PRs about chaning foundational (and performance sensitive) parts of the code...we are NOT positive about those...
if you really want to start a push in this direction, we suggest to start from other, less foundational, parts of the code, or by looking into the included includes
See my former comment
Also splitting in cpp and header implies essentially losing the repo history.
Yeah that's a tricky and unfortunate problem. That said, the Kratos repo and its history is getting rather bulky (it's in the multi-gigabyte range now), so we'll have to start thinking about addressing that problem too, sooner or later.
We should probably sync the solution of these two issues.
my_class_impl.h
template <class TParam>
TParam MyClass<TParam>::myTemplatedFunction() {
// implementation goes here
}
@matekelemen This is a very interesting and complicated case:
By having a include that effectively acts a source, you are forcing different transaction units to have the same definition of the same class. This is incorrect. If we want to force this ( and we use this in Kratos, for example for the <<
operators ) you should include a inline
keyword to tell the compiler that it's ok to have the same definition coming from multiple sources.
Probably a good moment now that inlines came into the discussion, to remember that the inline
keyword is not just a hint for the compiler to know when it can inline or not thins, it also manipulates where and how definitions of functions are allowed. In our case is important because:
An inline function or variable (since C++17) with external linkage (e.g. not declared static) has the following additional properties:
There may be more than one definition of an inline function in the program as long as each definition appears in a different translation unit and (for non-static inline functions) all definitions are identical. For example, an inline function may be defined in a header file that is included in multiple source files.
It must be declared inline in every translation unit. It has the same address in every translation unit.
The include
should not be a problem by its own, but explicitly splitting the header_header and the header_sources, makes it impossible to control if you are attaching the right implementation to the proper header. Actually its pretty easy to break this rule: for instance imagine I am tempted to wrongly specialize a class (intended or by mistake) in my compilation unit (call it application):
CU1
// Header "MyTemplatedClass.h"
teamplate<class T1>
class MyTemplatedClass {
int printNumber();
}
// SourceHeader "MyTemplatedClassSource.h"
templte <class T1>
inline int MyTemplatedClass<T1>::printNumber {print 1};
// Source "MyTemplatedClass.cpp"
include "MyTemplatedClass.h"
include "MyTemplatedClassSource.h"
template class MyTemplatedClass<int>;
CU2
// SourceHeader "MyTemplatedClassSource2.h"
templte <class T1>
inline int MyTemplatedClass<T1>::printNumber {print 2};
// Source "MyTemplatedClass.cpp"
include "MyTemplatedClass.h"
// Note that I decided to include another definition here
// and nobody can do anything to stop me :)
include "MyTemplatedClassSource2.h"
template class MyTemplatedClass<double>;
CU2
// Random source file "Code.cpp"
include "MyTemplatedClass.h"
my = MyTemplatedClass<int>();
my.printNumber()
Both definitions are correct, you have told the compiler not to care about where the definition comes from, and you needed to include them both because you are now instantiating with another class. What does printNumber
prints? Its impossible to know because you have no control about what code is being used, moreover, the results can vary from one compilation to another.
Edit: I didn't remember to reference this but is not a compile error, its undefined:
If an inline function with external linkage is defined differently in different translation units, the behavior is undefined.
What I mean is, while technically the solution is possible, and despite my personal preferences, it opens the door to nasty problems that are not easy to detect.
@roigcarlo I should probably not have mentioned the other PR because the specializations there are different for each type, my bad.
I think your example would result in an ODR violation (leading to a compilation error), though I'm not sure. My idea is that there should be only one implementation header (MyTemplatedClassSource2.h
and the implementation it contains should not exist), and the specializations would not have individual implementations of their own.
I'll create a toy project we can play around with and link it here.
Here's a complete and working example of what I have in mind.
[Let me know if you'd like to play around with it and I'll give you collaborator access.]
Regardless of whether we're planning to do some refactoring, I think it's good to keep this possibility in mind for developing new stuff too.
Hi @matekelemen as I said, it will work.
I was also preparing a small toy example to show what I mean with being dangerous
Basically this is an extension of what you said, but divided in several libraries that have dependencies among them, and all expose what they see fit.
Note that the code is clearly wrong, but the compiler just doesn't even throw a warning. Even more dangerous, depending on the order of linkage:
# target_link_libraries(TestODR1__ PUBLIC unit1)
./TestODR1__
Int: 1
Double: 1
# target_link_libraries(TestODR12_ PUBLIC unit1 unit2)
./TestODR12_
Int: 1
Double: 2
# target_link_libraries(TestODR12_ PUBLIC unit1 unit3)
./TestODR1_3
Int: 1
Double: 3
# target_link_libraries(TestODR123 PUBLIC unit1 unit2 unit3)
./TestODR123
Int: 1
Double: 2
# target_link_libraries(TestODR123 PUBLIC unit1 unit3 unit2)
./TestODR132
Int: 1
Double: 3
Ah I see what you mean now. That's quite nasty indeed, plus it doesn't even require the inline
to compile; I'm surprised no compiler picks up on the ODR violation.
Description @KratosMultiphysics/technical-committee
One thing that makes the compilation slower is the use of many includes (in a C/C++ compiler an #include copy and pastes the whole file in the source code, so it may become huge even for a small quantity of code).
To avoid these problems usually you do the following two things:
These two options must go first, the first one a priori is taken into account (but I still find from time to time files not seaprected in source files, see https://github.com/KratosMultiphysics/Kratos/pull/10050). The second I think we should consider to take into consideration, right now we are not, and I think we can do it specially with the templated headers that we use intensively (like geometries).
In addition to these, there are tools that help you to reduce the number of includes considered (many times we copy paste code and we include not even used). For example there are tools like https://include-what-you-use.org/.
I know that Unitary builds partially solves the issue, but we should be able to compile relatively fast without doing tricks like these. At the end this should be the proper thing to do, and in any case doing this the Unitary builds will be even faster.