Open mwaxmonsky opened 1 week ago
Greetings all, apologies for throwing this out there without an issue first. I've been trying to keep up with the changes in the framework and the equivalent tests, was getting lost in the details, and noticed a few things that I thought would make things a bit easier overall.
My main goal was to make it more explicit what is being tested and to reduce the amount of duplication we have, but that necessitated quite a few changes at various levels of the infrastructure.
I started going down a rabbit hole of how the framework is built and tested, but before going much further I figured I would share what I've updated so far and check whether it makes sense and is useful for the rest of the team.
There are a few technical details that still need to be evaluated, but I wanted to check whether, overall, this is moving things in a reasonable direction or making things more complicated.
Hi @mwaxmonsky, thanks for tackling this.
So far, most of this looks like much needed cleanup!
There is one change I do not understand. The old (clumsy, wordy) run_test script tested both the Python and the bash interfaces to ccpp_datafile.py. Do the new tests test the bash interface? These are (or at least were) used by some build systems.
@gold2718 Yup! I made sure to be as faithful as possible to the current coverage and made each of the shell tests its own unit test at the bottom of the Python file using the subprocess.run(...) API. If I missed anything on that front, I'm happy to add it.
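For readers following along, here is a minimal sketch of the general pattern described above: a unittest case that exercises a command-line interface through subprocess.run and checks both the exit code and the captured output. This is not the code from the PR. The helper name run_cli, the class name ShellInterfaceTest, and the stand-in command (a one-line Python script that prints a fixed string) are all assumptions made for illustration; a real test would call the script under test with its actual arguments.

```python
import subprocess
import sys
import unittest


def run_cli(args):
    """Run a command and return (exit code, stripped stdout).

    Hypothetical helper for this sketch; check=False lets the test
    inspect non-zero exit codes instead of raising an exception.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=False)
    return result.returncode, result.stdout.strip()


class ShellInterfaceTest(unittest.TestCase):
    """One test method per former shell check (illustrative only)."""

    # Stand-in for the real command line; NOT the actual script or flags.
    STAND_IN = [sys.executable, "-c", "print('host_files')"]

    def test_stdout_matches_expected(self):
        code, out = run_cli(self.STAND_IN)
        self.assertEqual(code, 0)
        self.assertEqual(out, "host_files")

    def test_nonzero_exit_is_reported(self):
        code, _ = run_cli([sys.executable, "-c", "import sys; sys.exit(3)"])
        self.assertEqual(code, 3)
```

Saved as a test module, this can be run with python -m unittest. Keeping each shell invocation in its own test method means a failure names the specific command that regressed.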
There is a lot of great stuff in this PR, thanks @mwaxmonsky.
My main concern with this PR is that it touches code/build files that aren't just used by the unit tests, but also by production code (e.g. the top-level CMakeLists.txt file). For instance, this PR enforces -O0 and certain compiler flags for the CCPP framework. We definitely do not want -O0 for production environments. There must be different options to set compilers for unit testing and for host models like the UFS, CAM/SIMA, and NEPTUNE, and there must be a way for the host model to define compiler flags that override any defaults (if we want to set defaults for cases other than running the tests). There are two comments below that need fixing.
We also need to consider targeting the main branch for this PR due to the impact on the host modeling systems. This PR may have to come straight after the current develop is merged into main. And it must be written in a way that allows ccpp-prebuild to continue to work, otherwise this PR (or the develop branch, if this PR gets merged into develop) will be blocked until all host models have moved to capgen. We don't want this; it was a heavy lift last time to get feature/capgen merged into main after a long time of parallel development.
@climbfuji Happy to remove the -O0 flag and the other flags. They were just in there from debugging (I meant to make them target-specific but wanted to get feedback on everything else before going further with the build), and we definitely don't want them in production environments.
In terms of options for different compilers/flags, I'll look into the different CMake APIs and see what makes the most sense.
As for targeting main, I'm happy to pivot there but the tests for prebuild run successfully the same as on main. Is the issue that users would have to update their model code and their build integration with the framework? Just trying to understand the issue of targeting develop instead of main.
Yes, if you change the top-level CMakeLists.txt, you will impact the host models that use the ccpp-framework. UFS, SCM and NEPTUNE all use the ccpp-framework CMakeLists.txt.
Also, this might be better saved for a discussion thread, but I was wondering about the references to MPI and OpenMP. I don't see any OpenMP pragmas or types, and currently the only reference to MPI is the mpi_comm in ccpp_types, and even then we only declare a public reference but never use it internally.
I see a note in common.py about MPI:
# Maximum number of concurrent CCPP instances per MPI task
CCPP_NUM_INSTANCES = 200
and this value is used in the write function in mkstatic, but it doesn't look like this is directly tied to the framework being built with MPI support.
If we don't need MPI or OpenMP at build time for the framework, would it make sense to remove these references in the CMake and ccpp_types, or are these actively being used and I'm just missing a key detail?
Currently, capgen generates omp_get_thread_num calls in the suite caps.
I don't think there is any MPI code generated currently by capgen.
ccpp-prebuild doesn't use any OpenMP calls in the auto-generated caps (everything comes from the host model via ccpp_t and its components), but the auto-generated caps and the code in src/ccpp_types.F90 need MPI. Removing the MPI dependency may break host models currently using ccpp_prebuild.
Refactoring of testing infrastructure
Changes include:
unittest framework.
User interface changes?: No
Fixes: None
Testing:
test removed: None
unit tests:
system tests:
manual testing: