Closed daljit46 closed 4 months ago
tckmap has already had one go at compile time reduction by reducing the number of possible template data types (#252). But I don't think that a similar approach is applicable to other commands, as it's not the case that an unnecessarily large number of unique templates is being instantiated. More likely, time is to be gained by preventing repeated effort across objects.
We definitely don't want to be reducing the actual utilisation of templates. The computation speed afforded by their use outweighs the detriment to compilation performance.
I agree that if performance concerns justify it, then templates can be a very good choice (additionally they also provide compile time safety which is always more desirable than runtime safety). However, I would say that the majority of code in any project is not performance-sensitive. Often templates can be a huge pain not just because of build times, but also because of readability, maintainability, binary size and error messages. BTW, my concern here was more with reducing the number of template instantiations rather than the use of templates themselves (which I guess is one way of doing that).
If there are candidate files currently implemented as header-only that could be precompiled without loss of execution performance, then I'm all for it. E.g. I know from work elsewhere that there are all sorts of functions in core/fixel/helpers.h defined INLINE that are not at all performance-related. Or maybe, e.g., interpolators could be explicitly instantiated for the full set of possible input image / adaptor types.
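To make the explicit instantiation idea concrete, here is a minimal single-file sketch. The class name LinearInterp and the header/source split marked in the comments are hypothetical and are not taken from the MRtrix3 code base; the point is only to show the `extern template` declaration (which suppresses implicit instantiation in every including translation unit) paired with one explicit instantiation definition.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// ---- Would live in a header, e.g. linear_interp.h (hypothetical name) ----
template <typename ValueType>
class LinearInterp {
public:
  explicit LinearInterp(std::vector<ValueType> samples)
      : samples_(std::move(samples)) {}

  // Linearly interpolate between neighbouring samples; pos is clamped
  // to the valid range. Assumes at least one sample is present.
  ValueType at(double pos) const;

private:
  std::vector<ValueType> samples_;
};

// Explicit instantiation declarations: translation units that include the
// header rely on the single instantiation provided elsewhere instead of
// instantiating these specialisations themselves.
extern template class LinearInterp<float>;
extern template class LinearInterp<double>;

// ---- Would live in one source file, e.g. linear_interp.cpp ----
template <typename ValueType>
ValueType LinearInterp<ValueType>::at(double pos) const {
  const std::size_t n = samples_.size();
  const double last = static_cast<double>(n - 1);
  const double p = std::min(std::max(pos, 0.0), last);
  const std::size_t lo = static_cast<std::size_t>(p);
  const std::size_t hi = std::min<std::size_t>(lo + 1, n - 1);
  const double frac = p - static_cast<double>(lo);
  return static_cast<ValueType>((1.0 - frac) * samples_[lo] +
                                frac * samples_[hi]);
}

// Explicit instantiation definitions: the code for these two types is
// generated exactly once, in this translation unit.
template class LinearInterp<float>;
template class LinearInterp<double>;
```

The trade-off is that the set of supported types has to be listed by hand, so this only suits templates whose set of arguments is known and small.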
I'm not sure if we have enough "stable" code to justify this. However, an alternative idea I used in #2877 is precompiled headers, which is similar in spirit.
This can be closed now that #2877 has been merged, which improves the situation quite a bit. Further improvements may be possible (especially for incremental builds) by cleaning up unnecessary header includes, but that's a somewhat tangential issue and something I hope to try at some point in the future.
For reference, ClangBuildAnalyzer analysis now shows significantly improved parsing (~55% lower) and codegen (~20% lower) times:
**** Time summary:
Compilation (787 times):
Parsing (frontend): 1798.5 s
Codegen & opts (backend): 2184.3 s
MRtrix3 can take a long time to build. I often like to use a Windows laptop (i5 10th generation quad-core processor) to test changes on MSYS2, and compile times are in the 20-25 minute ballpark. The situation is much better on a MacBook with an M2 Pro, where I can compile the project in just under 2 minutes. Nonetheless, I think it would be worth trying to improve the compilation time.
Possibly the most taxing factors on build times are:
To examine this more carefully, I carried out a build analysis using ClangBuildAnalyzer and here's the output:
We can see that:
mrregister and mrtransform take a long time to compile.
I think it would be worth discussing what can be done to improve the situation (e.g. precompiling header files, splitting .cpp files, reducing templated code, explicit instantiation of templates, etc.).
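As one illustration of the "reduce templated code" option, below is a rough sketch of hoisting the type-independent part of a template into an ordinary function, so that it is compiled once rather than once per instantiation. The names check_index and checked_get are made up for this example and do not refer to anything in MRtrix3; in a real project the non-template helper would be declared in a header and defined in a single .cpp file.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Type-independent part: bounds checking and error message formatting.
// Being a plain function, it is parsed and compiled only once.
void check_index(std::size_t index, std::size_t size, const std::string& what) {
  if (index >= size) {
    throw std::out_of_range(what + ": index " + std::to_string(index) +
                            " out of range (size " + std::to_string(size) + ")");
  }
}

// Type-dependent part: kept deliberately thin, so each instantiation
// adds very little work for the compiler.
template <typename T>
T checked_get(const std::vector<T>& values, std::size_t index) {
  check_index(index, values.size(), "checked_get");
  return values[index];
}
```

Whether this pays off depends on how much of a given template body is genuinely independent of the template parameters, which would need to be checked case by case.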