Open krasznaa opened 4 years ago
I'm aware of this. This is mostly caused by dictionary dependencies. I have a prototype that fixes this; I need to invest some dev time to get it into PR quality. In other words: thanks for the report, problem acknowledged!
What can be done here is rather simple. The bottleneck, last time I checked, is rootcling (dictionary generation). There are two reasons: once we are done with building `X.pcm`, we wait for the linker to link `X.so` before starting on `Y.pcm`.
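The serialized chain described above can be sketched as a minimal, hypothetical CMake fragment. The target names `X`/`Y` and the exact `rootcling` invocation are placeholders for illustration, not ROOT's actual build code:

```cmake
# Hypothetical sketch of the serialized setup: because the dictionary
# rule for Y names the *target* X in DEPENDS, CMake creates a
# target-level dependency, and Ninja will not start generating
# G__Y.cxx / Y.pcm until X.so has finished linking.
add_library(X SHARED X.cxx G__X.cxx)

add_custom_command(OUTPUT G__Y.cxx Y.pcm
  COMMAND rootcling -f G__Y.cxx YHeader.h YLinkDef.h
  DEPENDS X                     # target-level: waits for the X.so link
          YHeader.h YLinkDef.h)

add_library(Y SHARED Y.cxx G__Y.cxx)
target_link_libraries(Y PRIVATE X)
```

With many such libraries chained together, the dictionary steps form a long serial spine through the build graph, which is exactly the under-utilization visible in the resource snapshot below.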
Moved on, giving up on this - here's what I ended up with last time I looked at it. I added some comments to explain what's happening.
(It also fixes the "changed a header included by a header that's passed to rootcling" transitive dependency issue...)
I personally do not think that the runtime of rootcling is the problem here, but rather the dependency tree. Of course making anything faster is good.
Explain what you would like to see improved
I know that this is very much a first-world problem, but it has been bugging me for a while. The build of ROOT using its CMake setup does not scale well to many-core systems at all. :frowning:
This is a snapshot of how ROOT 6.20/08 used my system's resources during its build:
The build starts pretty much at the left-hand side of the timeline and lasts until pretty much the right-hand side of it.
As you can see, the build starts out very well. Building LLVM scales perfectly to 64 threads, and I believe it would scale well even beyond that. But once the LLVM build is done, many bottlenecks show up. First there is a big bottleneck with building `libCling` and `rootcling`, but after that the build of `libRIO` also takes a surprising amount of time. And the build is stuck waiting for all of these. Towards the end things improve a bit, as many libraries / source files can build in parallel once more. But even then, very rarely does the build manage to make use of all of the available cores.
Optional: share how it could be improved
From a quick glance it seems that ROOT's CMake configuration sets up far too many unnecessary dependencies between its build targets. As far as I can see, most of the issues arise from how the dictionary generation is set up.
In ATLAS I use the following code to set up the generation of dictionary source files:
https://gitlab.cern.ch/atlas/atlasexternals/-/blob/master/Build/AtlasCMake/modules/AtlasDictionaryFunctions.cmake
And that provides much better behaviour, mainly because in ATLAS's setup dictionary generation does not need to wait for anything. Even if the library that a dictionary is being produced for depends on a number of upstream libraries, the dictionary for that library can be generated before all the upstream libraries have finished building. In practice this means that the start of any ATLAS software build is dominated by running dictionary generation, as GNU Make and Ninja both prefer running those build steps first (since they have no dependencies of their own).
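The decoupled scheme can be sketched as follows. This is in the spirit of `AtlasDictionaryFunctions.cmake`, but the names, headers, and `rootcling` flags here are illustrative placeholders, not the actual ATLAS code:

```cmake
# Sketch of the decoupled setup: the dictionary rule depends only on
# its input *files*, never on a library target. Ninja can therefore
# schedule all dictionary generations immediately, in parallel,
# at the very start of the build.
add_custom_command(OUTPUT G__Y.cxx Y.pcm
  COMMAND rootcling -f G__Y.cxx YHeader.h YLinkDef.h
  DEPENDS YHeader.h YLinkDef.h)  # file-level dependencies only

add_library(Y SHARED Y.cxx G__Y.cxx)
target_link_libraries(Y PRIVATE X)  # only the final link waits for X
```

The key design choice is that `rootcling` only needs the headers on disk, so expressing the dependency at the file level (rather than on the upstream target) removes the library link steps from the dictionary's critical path.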
The reason I blame the dictionary generation code is that regular C(++) code built with Ninja scales very well to many cores. Even when a project has many small libraries, Ninja can start building object files before all of the libraries they depend on have finished building. (In ATLAS's offline software the very end of a build is taken up purely by library/executable linking steps.)
To Reproduce
Unfortunately you need a pretty powerful machine to do so... But once you have one, just do something similar to what I did:
Setup
As mentioned earlier, I used ROOT 6.20/08 for this particular test. But the behaviour has been like this since forever. I performed the build on Ubuntu 20.04 with GCC 9, but that should make little difference to the overall behaviour.
Additional context
N/A