HPSCTerrSys / TSMP2

CMake-based TerrSysMP
https://github.com/HPSCTerrSys/TSMP
MIT License
1 stars 2 forks source link

Frontend for TSMP2 #24

Closed s-poll closed 1 month ago

s-poll commented 2 months ago

This PR introduces a top-level script that makes building TSMP2 models simple and easy to understand and enhances the cmake script with additional options.

The main changes include:

Smaller changes:

The defaults for building are defined in CmakeLists.txt with the exception of the source and build directory.

Why was the top-level shell script chosen?

@jjokella : Even though, if the changes should not affect you, would be great if you could have a check, if this effects pdaf.

jjokella commented 2 months ago

Thanks @s-poll , this PR looks great!

I had a slight confusion with the ouptut in the log file:

== TSMP2 settings ==                                                                                                                                                                           
====================                                                                                                                                                                           
MODEL-ID: eCLM-PDAF                                                                                                                                                                            
TSMP2DIR: /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend                                                                                                                               
BUILDDIR: /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bld/JURECADC_eCLM-PDAF                                                                                                        
INSTALLDIR: /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bin/JURECADC_eCLM-PDAF                                                                                                      
CMAKE command:                                                                                                                                                                                 
cmake -S /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend -B /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bld/JURECADC_eCLM-PDAF    -DeCLM=ON -DPDAF=ON      -DCMAKE_INSTALL_PR
EFIX=/p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bin/JURECADC_eCLM-PDAF |& tee /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bld/eCLM-PDAF_2024-08-30_17-52.log            eCLMSRC: /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/models/eCLM                                                                                                                    
MODEL_ID: eCLM                                                                                                                                                                                 
MODELCOUNT: 1   

In particular the MODEL-ID and MODEL_ID output. I think the first comes from build_tsmp2.sh, the second then from CMake (could not find exact point). Maybe it can be made clearer, which outputs are from the shell-wrapper, in particular where the output from the shell wrapper ends.

jjokella commented 2 months ago

Regarding PDAF, I was up to now using PDAF_SRC as the check if PDAF is activated or not.

cmake/BuildeCLM.cmake:13:if(DEFINED PDAF_SRC)
cmake/BuildeCLM.cmake:43:if(DEFINED PDAF_SRC)
cmake/BuildOASIS3MCT.cmake:37:if(DEFINED PDAF_SRC)
cmake/BuildCLM3.5.cmake:8:if(DEFINED PDAF_SRC)

I guess, I should replace if(DEFINED PDAF_SRC) by if(${PDAF})?

The existing code now breaks for eCLM-PDAF (at least of PDAF_SRC is not explicitly set), because PDAF_SRC is not set anymore during compilation of eCLM.

With this change implemented, I get a NetCDF-error in the very last step:

ld: /p/software/jurecadc/stages/2023/software/netCDF-Fortran/4.6.0-ipsmpi-2022a/lib/libnetcdff.so: undefined reference to `nc_def_var_zstandard'
ld: /p/software/jurecadc/stages/2023/software/netCDF-Fortran/4.6.0-ipsmpi-2022a/lib/libnetcdff.so: undefined reference to `nc_def_var_quantize'
ld: /p/software/jurecadc/stages/2023/software/netCDF-Fortran/4.6.0-ipsmpi-2022a/lib/libnetcdff.so: undefined reference to `nc_inq_var_quantize'
ld: /p/software/jurecadc/stages/2023/software/netCDF-Fortran/4.6.0-ipsmpi-2022a/lib/libnetcdff.so: undefined reference to `nc_inq_var_zstandard'
make[3]: *** [Makefile:144: /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bin/JURECADC_eCLM-PDAF/bin/tsmp-pdaf] Error 1
gmake[2]: *** [CMakeFiles/PDAF-Framework.dir/build.make:86: PDAF-Framework/src/PDAF-Framework-stamp/PDAF-Framework-build] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:167: CMakeFiles/PDAF-Framework.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2

With that, I have to leave it for now. Maybe you know, what could be changed to the CMake-commands that could lead to NetCDF-issue like that.

kvrigor commented 2 months ago

Both ./build_tsmp2.sh --eCLM --ParFlow --ICON and ./build_tsmp2.sh --eCLM --PDAF works in JURECA 🎉

@jjokella it'd be nice if you could also test again, at least for eCLM-PDAF.. also if you've observed more issues kindly raise them here

I think we can merge Stefan's PR once ./build_tsmp2.sh has been verified working and all the open issues above are resolved.

s-poll commented 1 month ago

I would like to merge the PR. Are there any objections?

In my test all combinations worked (have not checked COSMO,CLM3.5).

In the testing, however, I found two weaknesses, which might be tackled in future. 1) The bld directory is not deleted to ensure the workflow for small code modifications. For a clean building one should delete bld/SYSTEMNAME_MODEL-ID before building. In future one could set build_opt to indicate a clean build (default?), or also just parts (project, build, install). Maybe also include rebuilding specific components. 2) It seems that some of the building procedures are modifying files within the original model source code (especially with PDAF). As far as I have seen this is not connected to the shell-based frontend. I would suggest that this is discussed in an issue. For now, I included the --force option when user choose to overwrite the model component.

kvrigor commented 1 month ago

Looks good @s-poll ! I've also re-tested the latest changes and the build script works.

jjokella commented 1 month ago

Hi @s-poll , @kvrigor

eCLM-PDAF also builds now for me using the simple ./build_tsmp2.sh --eCLM --PDAF. Therefore, I think this PR can be merged for now.

Regarding PDAF and in-repo changes. This is true. For a start I have integrated PDAF into TSMP2 by adding the make-command inside the PDAF repo: https://github.com/HPSCTerrSys/TSMP2/blob/7ea08d8aa0569e28bb49ea64acc1a4ca475c22b1/cmake/BuildPDAF.cmake#L156-L165

From TSMP2-side, a better CMake-integration would be nice. From PDAF side this includes some more work.

See also https://github.com/HPSCTerrSys/pdaf/blob/tsmp-pdaf-patched/make.arch/cmake.h for the Makefile header file that reads the variables set in TSMP2's CMake files.