E3SM-Project / scorpio

A high-level Parallel I/O Library for structured grid applications

Error finding offset types during CMake setup #552

Closed philipwjones closed 6 months ago

philipwjones commented 7 months ago

When adding scorpio to Omega's cmake build, I get the following error during the initial cmake phase:

CMake Error at /home/ac.pjones/Omega/externals/scorpio/cmake/SPIOTypeUtils.cmake:172 (message):
  Could not find a Fortran type for passing PIO Offsets from Fortran to C
Call Stack (most recent call first):
  /home/ac.pjones/Omega/externals/scorpio/src/clib/CMakeLists.txt:170 (get_pio_offset_type)

Any ideas? This is on Chrysalis with Intel compilers. The mpi-enabled compilers and netcdf libs seem to be found and set correctly. But it seems like it isn't finding a header/file somewhere.

And why is this check being done for a C-only build?

jayeshkrishna commented 7 months ago

The error shows that the SCORPIO configure step was not able to find a Fortran type for passing PIO offsets from the Fortran interface to the C library. One possibility is that the Fortran compiler settings (compiler path, options) are not right.

(PS: Currently we assume that the Fortran interface is always built with SCORPIO)

grnydawn commented 7 months ago

@jayeshkrishna , My Omega build with Scorpio was successful on Frontier and Perlmutter but failed on Chrysalis in the same way that Phil experienced. From comparing the log messages between Frontier and Chrysalis, it seems that the issue was caused by not using the MPI compiler wrapper for try_compile in SPIOTypeUtils.cmake. On Chrysalis, try_compile uses ifort, but on Frontier, ftn is used. How can I set the default Fortran compiler for the Scorpio build? I've tried setting CMAKE_FC_COMPILER and ENV{FC}, but haven't had any luck yet.

jayeshkrishna commented 7 months ago

You should set the Fortran compiler the same way as you set the C/CXX compiler (Does setting CMAKE_FC_COMPILER work? Does setting env FC/F77 work? )
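For reference, CMake's own variable for the Fortran compiler is CMAKE_Fortran_COMPILER (CMake does not define a CMAKE_FC_COMPILER), and the FC environment variable is only consulted on the first configure of a fresh build directory. A minimal sketch of passing the MPI wrappers explicitly for all three languages follows. The helper name spio_configure_cmd, the directory names, and the wrapper names mpicc/mpicxx/mpif90 are assumptions for illustration, and the helper only prints the command instead of running it:

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a cmake configure command that passes
# the compiler for C, C++ and Fortran explicitly.
# Arguments: source dir, build dir, C compiler, C++ compiler, Fortran compiler.
spio_configure_cmd() {
  printf 'cmake -S %s -B %s -DCMAKE_C_COMPILER=%s -DCMAKE_CXX_COMPILER=%s -DCMAKE_Fortran_COMPILER=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Example: the wrapper names are assumptions; use whatever the machine provides.
spio_configure_cmd externals/scorpio build/scorpio mpicc mpicxx mpif90
```

Printing the command first makes it easy to compare what is actually being passed on machines where the build works against machines where it fails.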

grnydawn commented 7 months ago

@jayeshkrishna , @philipwjones , It seems that one cause of this issue is the somewhat unusual MPI configuration on Chrysalis. Typically, "mpi.mod" is located in the "include" directory, alongside "mpif.h". However, on Chrysalis, "mpi.mod" is in the "lib" directory. To build Omega with Scorpio, I had to set "MPI_MOD_PATH" to the MPI "lib" directory that contains "mpi.mod" on Chrysalis. With that modification and other NetCDF settings, I could successfully build Omega with Scorpio on Chrysalis.

Unfortunately, with this "MPI_MOD_PATH" setting, other parts of the Scorpio build, such as GPTL, fail because the GPTL build cannot find "mpif.h", which is in the "include" directory, not the "lib" directory. Therefore, I had to disable the GPTL build, along with other optional Scorpio components such as the Scorpio tests and Scorpio tools. Currently, Omega is not using these components.
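A quick way to confirm where a given MPI installation keeps "mpi.mod" and "mpif.h" is to search the install prefix for both files. The sketch below is only an illustration: the helper name mpi_file_dirs and the MPI_ROOT variable are assumptions, not part of Scorpio or the Chrysalis environment.

```shell
#!/bin/sh
# Hypothetical helper: list every directory under an MPI install prefix
# that contains a file with the given name (e.g. mpi.mod or mpif.h).
# Arguments: install prefix, file name.
mpi_file_dirs() {
  find "$1" -name "$2" 2>/dev/null |
    while IFS= read -r path; do dirname "$path"; done | sort -u
}

# Example usage; MPI_ROOT is an assumed variable pointing at the MPI install.
if [ -n "${MPI_ROOT:-}" ]; then
  echo "mpi.mod found in:"; mpi_file_dirs "$MPI_ROOT" mpi.mod
  echo "mpif.h found in:";  mpi_file_dirs "$MPI_ROOT" mpif.h
fi
```

If the two files are reported in different directories, a single search path cannot cover both.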

E3SM, however, managed to build Scorpio on Chrysalis without encountering this issue. It appears that E3SM uses "$SRCROOT/share/build/buildlib.spio" to build Scorpio. I am trying to understand the Python script used there.

jayeshkrishna commented 7 months ago

@grnydawn : I would suggest first ensuring that you are loading the modules (and setting the env vars) used by E3SM for Chrysalis in config_machines.xml. I am not sure if we ever set the MPI module path explicitly.

grnydawn commented 7 months ago

@jayeshkrishna, yes, I think that we should avoid specifying the MPI module path explicitly. In Omega, we do not use the CIME build system directly; instead, we read the module and environment settings from config_machines.xml and use them to populate the settings for the Scorpio build. Because buildlib.spio seems to be the script that forwards MPI and other settings from CIME to the Scorpio cmake build, I think I will need to dig further into that script.

philipwjones commented 7 months ago

@grnydawn - I wonder if the easiest approach might be to set FC to the same value as MPIFC. I don't think we'll ever be running without MPI for anything. I'm not sure how this affects any of the other CMake config tests. That would guarantee all the paths are consistent with the local MPI installation.

grnydawn commented 7 months ago

@philipwjones , I think I have tried a similar approach by setting FC to mpif90, but it failed at the CMake try_compile function. For some reason, try_compile always uses the underlying compiler (ifort) instead of the mpif90 compiler wrapper.
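One thing worth ruling out in this situation is a stale cache: CMake reads FC only on the first configure of a build directory and then stores the result as CMAKE_Fortran_COMPILER in CMakeCache.txt, so a later change to FC has no effect on an existing build directory. A minimal sketch for checking what was cached follows; the helper name cached_fortran_compiler and the default directory name "build" are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the Fortran compiler recorded in a build
# directory's CMakeCache.txt. Returns non-zero if there is no cache file.
# Argument: build directory.
cached_fortran_compiler() {
  cache="$1/CMakeCache.txt"
  [ -f "$cache" ] || return 1
  # Cache entries look like NAME:TYPE=VALUE; print only VALUE.
  sed -n 's/^CMAKE_Fortran_COMPILER:[A-Z]*=//p' "$cache"
}

# Example usage: pass the build directory as the first argument.
cached_fortran_compiler "${1:-build}" || echo "no CMakeCache.txt found"
```

If this prints the path to ifort even though FC is set to mpif90, removing the build directory (or at least CMakeCache.txt) and reconfiguring is a reasonable next step.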

grnydawn commented 6 months ago

@jayeshkrishna , @philipwjones , I think I fixed this issue in the PR: https://github.com/E3SM-Project/Omega/pull/40 . The issue was caused by setting the Fortran compiler improperly and by mixing Linux environment variables from both the invoking shell and the E3SM machine configuration. Please check the PR for details.

philipwjones commented 6 months ago

Thanks @grnydawn - verified that this fixed the issue on Chrysalis and did not impact the build on Frontier. Closing.