nextsimhub / nextsimdg

neXtSIM_DG : next generation sea-ice model with DG
https://nextsim-dg.readthedocs.io/en/latest/?badge=latest
Apache License 2.0
10 stars 13 forks source link

improve cmake build efficiency #347

Open TomMelt opened 1 year ago

TomMelt commented 1 year ago

Currently running the cmake build process (see below) takes around 30 mins using 4 processors on my Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz.

# build process
cmake ..
make -j 4

This is due to duplicated recompilation of the core source code.

image

My first attempt to fix this issue can be seen in branch issue347-improve-cmake-build-efficiency

So far this is a proof of concept. It does not currently work for all tests (see example of errors below):

/usr/bin/ld: CMakeFiles/testRectGrid.dir/RectGrid_test.cpp.o: warning: relocation against `_ZN7Nextsim10ModelArray6m_dimsE' in read-only section `.text._ZN7Nextsim10ModelArray10dimensionsENS0_4TypeE[_ZN7Nextsim10ModelArray10dimensionsENS0_4TypeE]'
/usr/bin/ld: CMakeFiles/testRectGrid.dir/RectGrid_test.cpp.o: in function `Nextsim::DOCTEST_ANON_FUNC_14()':                                                 
RectGrid_test.cpp:(.text+0x159f7): undefined reference to `Nextsim::NZLevels::set(unsigned long)'                                                            
/usr/bin/ld: RectGrid_test.cpp:(.text+0x15a82): undefined reference to `Nextsim::ModelArray::setDimensions(Nextsim::ModelArray::Type, std::vector<unsigned long, std::allocator<unsigned long> > const&)'

I have several ideas on how to proceed, but first @timspainNERSC @einola it would be great to get some understanding as to which kinds of grids you want to support and how that support should be managed? Also a similar concept with the ModelArrayStructure which is defined in the CMakeLists.txt. Should these be decided at run-time or compile-time.

timspainNERSC commented 1 year ago

Regarding the grids, all three of ParametricGrid.cpp, DevGrid.cpp and RectangularGrid.cpp are required to be compiled into the binary. The main model can read a restart file in any of these three formats and run the model on the data therein. Specifically, the class StructureFactory has calls to the constructors of all three of the grid classes.

Ultimately, we probably don't need to support DevGrid.cppand RectangularGrid.cpp, since they are both simplified grids compared to what anyone will want to use. However, we will likely want to support other grid types in the future.

The details in ModelArrayStructure are decided at compile time, and relate to the dynamics. Different implementations of the dynamics may require different types of arrays, and these are defined in the ModelArrayStructure files. Currently the implementations of the dynamics are Discontinuous Galerkin and None, but there are ModelArrayStructure files for both the model and physics tests and a simplified version for the core tests.

One possibility is compile different libraries for the different ModelArrayStructure sets, which would still be less compilation than compiling all the required objects for all the tests.

Another possibility would be to investigate whether ModelArrayStructure, specifically ModelArray::Type and ModelArray::Dimension, can be changed at run time, rather than compile time.

einola commented 1 year ago

Nice work @TomMelt! I don't have anything to add to what @timspainNERSC already said about grids etc.

I'm thinking about the linking of the library. We split some of the old neXtSIM into a library in a similar manner and there we link dynamically when we create the model executable. I would like to avoid this as it has caused us several minor problems. They're all solvable, but it's just a hassle.

In the current case I can see how dynamic linking could be preferred for the tests (although I'm not sure if it's a big difference) - but for the main executable, I would still want to link statically.

timspainNERSC commented 1 year ago

I decided to see if my internal knowledge of the code structure (yes, it needs to be written down somewhere) could help with the compilation errors. I ended up fixing them. Hopefully that's not a problem :)

With clang on my M1 Mac, everything now compiles using nextsimlib.

With the ModelArrayDetails class only being compiled once, this means that the differing headers used in the ParaGrid and RectGrid tests only works because the definitions in the finiteelement ModelArrayDetails are essentially a compatible subset of those in discontinuousgalerkin. Right?

timspainNERSC commented 1 year ago

Closing this issue will also close #210

TomMelt commented 1 year ago

@einola @timspainNERSC I started drafting a PR (#366) for my branch. But I have realised given the recent PR merges and resulting conflicts, it would be easier to just start again. I will make a new PR shortly with the updated changes.

My branch issue347-improve-cmake-build-efficiency gives a factor of 5x speed up. It now takes roughly 8mins instead of 40mins previously. (Running with 4 cores on my laptop compiles in 2.5 mins)

einola commented 1 year ago

Would rebasing issue347-improve-cmake-build-efficiency onto develop not be a better option than starting with a new PR?

timspainNERSC commented 1 year ago

And to stick my oar in, might I suggest using the branch aug23_mar3 as a (re)base? That's what (I hope) develop is eventually going to look like.

TomMelt commented 1 year ago

Would rebasing issue347-improve-cmake-build-efficiency onto develop not be a better option than starting with a new PR?

@einola Normally yes, but the last 2 merges were so big and complex that the structure of the code and the files have changed so much that the merge conflicts are harder to deal with than just starting again.

In this case because I already know how to do it, it doesn't take much time to re-implement. In fact, I have already done it -- I just need to finish testing it.

TomMelt commented 1 year ago

And to stick my oar in, might I suggest using the branch aug23_mar3 as a (re)base? That's what (I hope) develop is eventually going to look like.

@timspainNERSC Unless you are planning to merge aug23_mar3 soon I would prefer to branch of develop because I should be done by next week and ready to merge back. I assumed from the name of the branch however, it may not be merged so soon. Do you have a approximate timeframe for merging the aug23_mar3 branch?

(edit - added) Also these changes will be useful for my next set of changes from adding End-to-End (/integration) tests and also adding openMP in the near future

timspainNERSC commented 1 year ago

And to stick my oar in, might I suggest using the branch aug23_mar3 as a (re)base? That's what (I hope) develop is eventually going to look like.

@timspainNERSC Unless you are planning to merge aug23_mar3 soon I would prefer to branch of develop because I should be done by next week and ready to merge back. I assumed from the name of the branch however, it may not be merged so soon. Do you have a approximate timeframe for merging the aug23_mar3 branch?

The relevant timeline is: when all the necessary changes are reviewed!

But, no, that's a perfectly good reason.

einola commented 1 year ago

Would rebasing issue347-improve-cmake-build-efficiency onto develop not be a better option than starting with a new PR?

@einola Normally yes, but the last 2 merges were so big and complex that the structure of the code and the files have changed so much that the merge conflicts are harder to deal with than just starting again.

In this case because I already know how to do it, it doesn't take much time to re-implement. In fact, I have already done it -- I just need to finish testing it.

Ok, that sounds reasonable. Especially if it's easier to simply re-implement.