TriBITSPub / TriBITS

TriBITS: Tribal Build, Integrate, and Test System,
http://tribits.org
Other
36 stars 46 forks source link

Update MPI, HDF5, and Netcdf TriBITS find modules (#533, #534) #535

Closed bartlettroscoe closed 1 year ago

bartlettroscoe commented 1 year ago

Description

These commits address #533 and #534

These commits were pulled off of the branch from:

(which also matches the patches in https://github.com/sandialabs/seacas/pull/336).

This seems to maintain near perfect backwards compatibility.

Tasks

bartlettroscoe commented 1 year ago

Okay, given above, hopefully these new FindTPL[HDF5|Netcdf].cmake modules will work with newer releases of HDF5 and NetCDF as well.

I will merge this after the merge of:

We can always tweak this later if needed.

bartlettroscoe commented 1 year ago

With the merge of:

we can finally merge this PR.

bartlettroscoe commented 1 year ago

Hello @KyleFromKitware, could you please take a look at this PR. This is an example of where this updated TriBITS external package/TPL system provides the tools to fix up these external packages that are not quite following proper usage of CMake. If you focus on the new code added to these modules, you will see it is pretty minimal (and is mostly print statements).

And, actually, I just realized that I did not add a test for TriBITS core change 7fd57e5 (#535) to protect it going forward. Since there is no huge rush to merge this PR, I will do that when I get back to MD.

bartlettroscoe commented 1 year ago

I am looking over how to test 7fd57e5 (#535) and it will require quite a lot of work and a lot of code. That code in that file has not changed for years and is unlikely to ever change. (Even the large refactoring in #299 did not touch that code at all. Therefore, I have determined that the cost of adding a test for that change is larger than the value it will have. (This is not a decision I take likely.)

gsjaardema commented 1 year ago

NetCDF doesn't depend on CGNS. They both depend on HDF5, so I thought that maybe configuring CGNS first would then set the MODERN cmake for HDF5 which would then allow the if (Netcdf_ALLOW_MODERN AND HDF5_FOUND_MODERN_CONFIG_FILE) to work...

If netCDF is configured before CGNS, then that if won't be triggered because it is netCDF that triggers HDF5 if I am understanding things correctly (probably not)

bartlettroscoe commented 1 year ago

If netCDF is configured before CGNS, then that if won't be triggered because it is netCDF that triggers HDF5 if I am understanding things correctly (probably not)

@gsjaardema, the HDF5 TPL will always be configured before Netcdf TPL, because the TPLs are processed in the order they are listed in the TPLsList.cmake file which is the case with SEACAS as shown at:

https://github.com/sandialabs/seacas/blob/a9634da294d7d4bc64145fa33969de6be10d31b6/TPLsList.cmake#L60-L62

gsjaardema commented 1 year ago

This is a portion of the output of my CMake configuration:

Processing enabled external package/TPL: Netcdf (enabled explicitly, disable with -DTPL_ENABLE_Netcdf=OFF)
-- Using find_package(Netcdf ...) ...
--      NetCDF_ROOT is /Users/gdsjaar/src/seacas
--      netCDF_CONFIG is /Users/gdsjaar/src/seacas/lib/cmake/netCDF/netCDFConfig.cmake
-- NetCDF requires HDF5
-- NetCDF depends on HDF5
-- Updating NetCDF_LIBRARIES and NetCDF_INCLUDE_DIRS
-- Found HDF5 CMake configuration file ( directory /Users/gdsjaar/src/seacas/cmake/hdf5-config.cmake )
-- HDF5 Version: 1.10.9
--      HDF5_INCLUDE_DIRS      =/Users/gdsjaar/src/seacas/include
--      HDF5_LIBRARY_TARGETS   =hdf5_hl;hdf5
--      HDF5_LIBRARIES         =/Users/gdsjaar/src/seacas/lib/libhdf5_hl.dylib;/Users/gdsjaar/src/seacas/lib/libhdf5.dylib
--      HDF5_LIBRARIES_EXPORT  =/Users/gdsjaar/src/seacas/lib/libhdf5_hl.dylib;/Users/gdsjaar/src/seacas/lib/libhdf5.dylib
--      HDF5_LINK_LIBRARIES    =
--      HDF5_IS_PARALLEL       =NO
-- Found the following HDF5 component libraries
--      hdf5
--      hdf5_hl
--      HDF5 Components not found: static;shared;CXX;Fortran;CXX_HL;Fortran_HL;Java;Tools
--      HDF5_TOOLS_FOUND: h52gif;h5copy;h5debug;h5diff;h5dump;h5import;h5jam;h5ls;h5mkgrp;h5stat
-- Found HDF5: /Users/gdsjaar/src/seacas/include
-- NetCDF does not require PNetCDF
-- Found NetCDF: /Users/gdsjaar/src/seacas/lib/libnetcdf.dylib;/Users/gdsjaar/src/seacas/lib/libhdf5_hl.dylib;/Users/gdsjaar/src/seacas/lib/libhdf5.dylib
-- NetCDF Version: 4.9.0
--      NetCDF_NEEDS_HDF5        = yes
--      NetCDF_NEEDS_PNetCDF     = no
--      NetCDF_PARALLEL          = False
--      NetCDF_INCLUDE_DIRS      = /Users/gdsjaar/src/seacas/include;/Users/gdsjaar/src/seacas/include
--      NetCDF_LIBRARIES         = /Users/gdsjaar/src/seacas/lib/libnetcdf.dylib;/Users/gdsjaar/src/seacas/lib/libhdf5_hl.dylib;/Users/gdsjaar/src/seacas/lib/libhdf5.dylib
--      NetCDF_BINARIES          = ncdump;ncgen;nccopy
-- Netcdf_LIBRARY_NAMES='netcdf'
-- TPL_Netcdf_LIBRARIES='/Users/gdsjaar/src/seacas/lib/libnetcdf.dylib;/Users/gdsjaar/src/seacas/lib/libhdf5_hl.dylib;/Users/gdsjaar/src/seacas/lib/libhdf5.dylib'
-- TPL_Netcdf_INCLUDE_DIRS='/Users/gdsjaar/src/seacas/include;/Users/gdsjaar/src/seacas/include'
-- TPL_Netcdf_PARALLEL is False

Doesn't that show that Netcdf is being found and then since netcdf depends on HDF5, then HDF5 is found/processed?

Note that HDF5 is an optional dependency wherever it is mentioned in the SEACAS Dependencies.cmake:

packages/seacas/libraries/exodus/cmake/Dependencies.cmake:  LIB_OPTIONAL_TPLS Pthread HDF5 Pnetcdf MPI

Thanks for trying to explain.

bartlettroscoe commented 1 year ago

@gsjaardema, can you please attach the full cmake STDOUT+STDERR?

What are the dependencies of Pnetcdf? Do we need to add FindTPL<tplName>Dependencies.cmake entries for those as well?

gsjaardema commented 1 year ago

What are the dependencies of Pnetcdf? Do we need to add FindTPL<tplName>Dependencies.cmake entries for those as well?

PnetCDF only depends on MPI and netCDF optionally depends on PnetCDF in a parallel build. SEACAS only accesses / depends on PnetCDF if netCDF was configured in parallel and with PnetCDF capability enabled. There isn't any direct access from SEACAS that bypassed netCDF and hits PnetCDF.

gsjaardema commented 1 year ago

@gsjaardema, can you please attach the full cmake STDOUT+STDERR? seacas-config-debug.txt

bartlettroscoe commented 1 year ago

PnetCDF only depends on MPI and netCDF optionally depends on PnetCDF in a parallel build. SEACAS only accesses / depends on PnetCDF if netCDF was configured in parallel and with PnetCDF capability enabled. There isn't any direct access from SEACAS that bypassed netCDF and hits PnetCDF.

Then this dependency between the TriBITS Pnetcdf and Netcdf TPLs should be expressed by updating the file https://github.com/sandialabs/seacas/blob/master/cmake/tribits/common_tpls/FindTPLNetcdfDependencies.cmake to be:

tribits_extpkg_define_dependencies( Netcdf
  DEPENDENCIES  Pnetcdf HDF5)

In case we do turn on TPL dependencies by default, this might get changed to:

set(Netcdf_Pnetcdf_dep "")
if (TPL_ENABLE_MPI)
  set(Netcdf_Pnetcdf_dep  Pnetcdf)
endif()
tribits_extpkg_define_dependencies( Netcdf
  DEPENDENCIES  ${Netcdf_Pnetcdf_dep} HDF5)

That would mean that if the user set TPL_ENABLE_Netcdf=ON and TPL_ENABLE_MPI=ON on input, then TriBITS would try to automatically enable Pnetcdf (once we turn on indirect upstream TPL dependency enables).

How common is it to enable MPI but not build Netcdf with Pnetcdf? That is, in the future, do we want the user setting TPL_ENABLE_Netcdf=ON and TPL_ENABLE_MPI=ON to automatically enable Pnetcdf? Note that the user will always be able to set TPL_ENABLE_Pnetcdf=OFF and the Pnetcdf TPL will not get enabled. Is that logical behavior? What is the least surprising behavior?

bartlettroscoe commented 1 year ago

@gsjaardema, the problem is that the HDF5 TPL is not being enabled as can be see in the output seacas-config-debug.txt showing:

Final set of enabled external packages/TPLs:  Netcdf CGNS METIS Matio X11 DLlib fmt 7

Final set of non-enabled external packages/TPLs:  GTest Zlib Pthread MPI HDF5 Pnetcdf DataWarp ParMETIS Pamgen CUDA Kokkos Faodel Cereal ADIOS2 Catalyst2 15

Someone needs to set TPL_ENABLE_HDF5=ON.

Currently, TriBITS is not automatically enabling the HDF5 TPL because of the Netcdf dependency on it (because it would break backward compatibility) but in the future it could.

Can you add -D TPL_ENABLE_HDF5:BOOL=ON to the configure scripts? Or should we bite the bullet and have TriBITS automatically enable upstream TPL dependencies as described above? (The latter will break some configure scripts but that is easy to fix by just doing a hard disable with -D TPL_ENABLE_<tpl-that-should-be-disabled>=OFF.)

gsjaardema commented 1 year ago

... deleted... Then this dependency between the TriBITS Pnetcdf and Netcdf TPLs should be expressed by updating the file https://github.com/sandialabs/seacas/blob/master/cmake/tribits/common_tpls/FindTPLNetcdfDependencies.cmake to be:

tribits_extpkg_define_dependencies( Netcdf
  DEPENDENCIES  Pnetcdf HDF5)

I still confused as to why the netCDF configuration isn't the place where this information is handled. That is the current behavior and it seems to be working in most situations...

netCDF has a few optional dependencies depending on how it is configured:

We will also have that issue with HDF5 soon if we start adding some of the advanced HDF5 capabilties (VOLs will add optional dependencies); multiple compression library options.

As I see it, only the netCDF configuration files can fully know what optional dependencies are present in the library...

bartlettroscoe commented 1 year ago

As I see it, only the netCDF configuration files can fully know what optional dependencies are present in the library...

The netCDFConfig.cmake file is broken. It does not contain a call to find_dependency(HDF5). @ibaned knows more about that.

TriBITS allows us to fix issues with these external packages when they are not doing the correct behavior.

gsjaardema commented 1 year ago

The netCDFConfig.cmake file is broken. It does not contain a call to find_dependency(HDF5). @ibaned knows more about that.

TriBITS allows us to fix issues with these external packages when they are not doing the correct behavior.

Ok, that was the missing piece for me --

ibaned commented 1 year ago

Hi! I'm coming in late to the conversation, I gather essentially that this PR broke something? Could we maybe open an issue on the SEACAS or TriBITS repositories just to track what broke and I can get a better understanding?

gsjaardema commented 1 year ago

@ibaned Nothing broke that I am aware of, I am just not sure it is working the way it should -- I am unable to get the MODERN config to work due to the way that the HDF5 dependency is currently added via netCDF and not as a direct SEACAS dependency.

ibaned commented 1 year ago

Ah I see! If it helps, here is the CMake configuration that I'm using:

  "-DSeacas_ENABLE_Fortran=OFF"
  "-DSeacas_ENABLE_SEACASExodus=ON"
  "-DSEACASExodus_ENABLE_MPI=ON"
  "-DSEACASExodus_ENABLE_SHARED=OFF"
  "-DSeacas_ENABLE_SEACASAprepro_lib=ON"
  "-DTPL_ENABLE_HDF5=ON"
  "-DNetcdf_ALLOW_MODERN=ON"
  "-DnetCDF_ROOT=${CAPP_INSTALL_ROOT}/netcdf-c"
  "-DHDF5_ROOT=${CAPP_INSTALL_ROOT}/hdf5"
  ${SEACAS_MPI_OPTIONS}
  "-DSeacas_VERBOSE_CONFIGURE=OFF"
  "-DTRIBITS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES_VERBOSE=OFF"
  "-DBUILD_SHARED_LIBS=OFF"
gsjaardema commented 1 year ago

It is probably the "-DTPL_ENABLE_HDF5=ON" that does it.

Also, do you know what HDF5-1.10.X is missing to allow it to use the MODERN method...

ibaned commented 1 year ago

Maybe nothing! I was being overly conservative in listing just the versions installed on my system.

bartlettroscoe commented 1 year ago

@gsjaardema, can we set up a short MS Teams meeting so I can explain to you what is happening and how TriBITS is fixing this?

bartlettroscoe commented 1 year ago

TriBITS allows us to fix issues with these external packages when they are not doing the correct behavior.

Ok, that was the missing piece for me --

@gsjaardema, the key is that when setting TPL_ENABLE_HDF5=ON, TriBITS generates a file NetcdfConfig.cmake in the build and install trees that basically does:

set(HDF5_DIR ...)
find_dependency(HDF5)
set(netCDF_DIR ...)
find_dependency(netCDF)
add_library(Netcdf::all_libs IMPORTED INTERFACE)
...

Look at the file:

<buildDir>/external_packages/Netcdf/NetcdfConfig.cmake

and you will see this.

And it is possible to extend this TriBITS support to automatically fix other situations as well.