DOI-USGS / COAWST

COAWST modeling system git repository

Can not build COAWST 3.8 on HPC (CRAY) #255

Open ZenKa24 opened 1 month ago

ZenKa24 commented 1 month ago

Hi all, I am trying to compile COAWST v3.8 (Inlet_test/Coupled case) on an HPC system, so far without success. I have tried both ifort and gfortran, and in both cases the compilation stops at the same step. MCT installed successfully, but an mct_param error occurs during compilation. Please see the attached log file (build-v3.8.log) for details. I have also successfully built COAWST v3.4 (Inlet_test/Coupled case) on the same system. However, there seems to be a problem with MPI when I run coawstM on multiple processors via qsub. Can you tell me what the problem is and what I should do to fix it? I hope your experience can help me solve it. I look forward to hearing from you soon. image

image

Many thanks, build-v3.8.log out-GOT.txt

jcwarner-usgs commented 1 month ago

The Inlet_test is a case we run all the time to test the model, so hopefully we can get it to work on your system.

I would prefer not to try and fix an issue in coawst v3.4.

Looking at v3.8, others have had trouble with cmake. You also need to make sure MCT is installed and the paths are set. What do you get when you type: $MCT_INCDIR
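A quick way to check the MCT paths from the shell is sketched below. `MCT_INCDIR` comes from the question above; `MCT_LIBDIR` is assumed to be the matching library variable in the build script, and the helper name `check_dir_var` is made up for this sketch.

```shell
# Report whether a path variable is set and points to an existing directory.
# Usage: check_dir_var NAME VALUE
check_dir_var() {
    name=$1
    value=$2
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name=$value is not a directory"
        return 1
    fi
    echo "$name=$value looks OK"
}

# MCT_LIBDIR is an assumption here; adjust to whatever your build script uses.
check_dir_var MCT_INCDIR "${MCT_INCDIR-}" || true
check_dir_var MCT_LIBDIR "${MCT_LIBDIR-}" || true
```

Run it in the same shell session (or batch environment) that launches the build, since variables exported elsewhere will not be visible.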

ZenKa24 commented 1 month ago

Dear Mr. Warner,

Thank you for replying! MCT is already installed on my system, and I am using cmake version 3.14.4. image

aiwuyouxi commented 1 month ago

Hi,

I've encountered an issue while compiling COAWST v3.8; it halts at the same step, similar to the situation you encountered.

Additionally, I have successfully installed MCT. My current development environment is as follows:

gfortran version: v11.4.0
cmake version: v3.22.1

Here are my build_coawst.sh script and the log file: build_coawst_sh.txt build_log.txt

Thank you!

jcwarner-usgs commented 1 month ago

Sometimes cmake has an issue when the directory has a "-" in it. I see you have COAWST-COAWST_v3.8; rename the dir so it does not have a "-" in it.

Are you the one who made a similar post? If so, can you close the other one?
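A small sketch for spotting hyphens anywhere in the build path, not only in the last directory name. The function name `hyphen_parts` is made up for illustration, and the sketch assumes you run it from the top of the COAWST tree.

```shell
# Print every component of a path that contains a hyphen.
# Returns non-zero when no component has one.
# Usage: hyphen_parts /some/path
hyphen_parts() {
    printf '%s\n' "$1" | tr '/' '\n' | grep -e '-'
}

# Check the current directory.
if parts=$(hyphen_parts "$PWD"); then
    echo "Directory names with a hyphen:"
    echo "$parts"
else
    echo "No hyphen in $PWD"
fi
```

After renaming, it is probably safest to remove any old cmake build directories before rebuilding, since they may have cached the old path.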

aiwuyouxi commented 1 month ago

Thank you for your advice. I will rename the directory to remove the hyphen and see if that resolves the issue with cmake. I also confirm that I am the one who posted the similar inquiry, and I will close the duplicate post.

ZenKa24 commented 1 month ago

I fixed the error above based on your advice. However, I have run into another problem. I have tried to fix it, without success. Please help me check it!

image

build-v3.8.1.log build_coawst_GOT.txt Linux-ifort.txt

Thanks,

jcwarner-usgs commented 1 month ago

Not sure. It starts to build SWAN, then has a lot of errors. I think you need to scroll up and look at the first error, not the last. I see:

Cannot disable Fortran error message 7013
Cannot disable Fortran error message 6592
Cannot disable Fortran error message 6259
Cannot disable Fortran error message 6169
Cannot disable Fortran error message 6592
...

Not sure what these errors are. Try this:

./build_coawst.sh -j 8 &> build.out

That will write all the error flags to the out file also. Then post the build.out.
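To find the first error rather than the last, a helper along these lines can be run on the captured file. It is only a sketch: the name `first_match` and the default pattern `error` are choices made here, and a line such as "Cannot disable Fortran error message" will also match, so a narrower pattern may be needed.

```shell
# Print the first line (with its line number) of a log that matches a pattern.
# Usage: first_match LOGFILE [PATTERN]   (PATTERN defaults to "error")
first_match() {
    grep -n -i -m 1 -e "${2:-error}" "$1"
}

# build.out is the file written by the redirect above.
if [ -f build.out ]; then
    first_match build.out || echo "no match in build.out"
fi
```

For Intel compiler diagnostics, something like `first_match build.out 'error #'` skips the "Cannot disable" lines and lands on the first numbered error.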

ZenKa24 commented 1 month ago

Here is the new out file. It seems to have trouble with NetCDF. build-v3.8.2.log

jcwarner-usgs commented 1 month ago

"/cray_home/haivan/operation/ROMS/COAWST/SWAN/src_coawst/nctablemd.f90(25): error #7013: This module file was not generated by any release of this compiler. [NETCDF] use netcdf"

The compiler is finding the netcdf, but it was built by a different compiler.

Cmake is finding a netcdf that was built with gfortran, but the log says "The Fortran compiler identification is Intel 18.0.1.20171018", so there is a conflict.

I suggest you try to build SWAN by itself, from the SWAN site. Do this:

Make a new dir called SWAN (outside of the COAWST dir).

See if that works.

ZenKa24 commented 1 month ago

I already have the SWAN model on my system, and it has already been compiled. However, it was not compiled with cmake. So, what should I do next?

jcwarner-usgs commented 1 month ago

Well, I can try to help here, but in the end you are going to have to figure this out.

I suggested you build SWAN by itself to see what compiler and netcdf libs are being used. Can you post that build? This will help to figure out what is going on with the COAWST build.

ZenKa24 commented 1 month ago

I have rebuilt SWAN according to your suggestion and got the error below. It seems that all files compiled, but swan.exe cannot be created. make_swan.txt

image

jcwarner-usgs commented 1 month ago

at the end you have

/opt/cray/pe/craype/2.5.14/bin/ftn -O2 -W0 -assume byterecl -traceback -diag-disable 8290 -diag-disable 8291 -diag-disable 8293 CMakeFiles/swan.exe.dir/swanmain.f.o -o ../bin/swan.exe libswan41.45.a /opt/cray/pe/netcdf/4.4.1.1.6/INTEL/16.0/lib -L/opt/cray/pe/netcdf/4.4.1.1.6/INTEL/16.0/lib -lnetcdff -L/opt/cray/pe/hdf5/1.10.1.1/INTEL/16.0/lib -L/opt/cray/pe/hdf5/1.10.1.1/INTEL/16.0/lib -lnetcdf /cray_home/haivan/.local/CDO/lib/libnetcdf.a -L/opt/cray/pe/netcdf/4.4.1.1.6/INTEL/16.0/lib -lnetcdf

/usr/lib64/gcc/x86_64-suse-linux/4.8/../../../../x86_64-suse-linux/bin/ld: /opt/cray/pe/netcdf/4.4.1.1.6/INTEL/16.0/lib: file not recognized: Is a directory

Can you ask someone for help with this? It has to do with nf-config, and how the netcdf libraries are set up on your cluster.
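In the link line above, /opt/cray/pe/netcdf/4.4.1.1.6/INTEL/16.0/lib appears once without a -L prefix, and that bare directory is what ld rejects. The sketch below flags such arguments in a set of link flags. The function name `bare_dir_args` is made up here; `nf-config --flibs` is the standard netCDF-Fortran query for link flags, but whether it is the source of the bare path on this cluster is an assumption.

```shell
# Print every argument that is an existing directory passed without a
# leading option such as -L or -I; the linker would try to read such an
# argument as an input file.
# Usage: bare_dir_args ARG...
bare_dir_args() {
    for arg in "$@"; do
        case $arg in
            -*) ;;                               # options such as -L/path or -lnetcdf
            *)  [ -d "$arg" ] && echo "$arg" ;;
        esac
    done
    return 0
}

if command -v nf-config >/dev/null 2>&1; then
    # Word splitting of the flags is intended here.
    bare_dir_args $(nf-config --flibs)
fi
```

If the function prints a directory, the flags handed to the linker need a -L in front of that path or the path removed.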

ZenKa24 commented 1 month ago

I have fixed the errors above and SWAN now compiles (swan.exe is created), but I get a new error when compiling ROMS. It seems to be related to ifort, along with a couple of other issues. The error messages are shown in the picture below. Please show me how to get past this error!

Best regards, image

jcwarner-usgs commented 1 month ago

OK good, you are making progress. Let me see your .h file. You seem to be asking for some fields that are not available, and that is most likely due to some mismatch in the ifdefs.

ZenKa24 commented 1 month ago

Here is my inlet_test.h file. I copied this file from the ROMS/Include folder; it differs from the original .h file in the Inlet_test/Coupled folder. When I compiled the model with the original file, I got the errors M_COUPLING and swan_iounits. With the new .h file I got past these errors, because the original .h file defines SWAN_MODEL while the new file defines SWAN_COUPLING. Please check this and correct me if I have confused something.

Many thanks, inlet_test.txt

jcwarner-usgs commented 1 month ago

You need to use the inlet_test.h in the Projects/Inlet_test/Coupled folder (or any of the other folders in the Projects/ folder). If you got errors from that, then we need to fix them.

ZenKa24 commented 1 month ago

If I use the inlet_test.h in the Projects/Inlet_test/Coupled folder, I get the error below. image

I then changed "define SWAN_MODEL" to "define SWAN_COUPLING". The error above went away, and I get the new error shown in the picture below.

image

jcwarner-usgs commented 1 month ago

Please do not change SWAN_MODEL to SWAN_COUPLING. I do things differently than on the Rutgers site. Please use the cpp options listed in the manual and in the Projects folder. I am not going to fix the second method.
We need to go back to the first method: define SWAN_MODEL and post the full output of the build. Do this:

script build.log
./build_coawst.sh -j 6 (or whatever number)
exit

and then post the build.log here.
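Before rebuilding, it can help to confirm which of the two options the header actually defines. The check below is a rough sketch: `defines_option` is a name invented here, the header path is the one mentioned earlier in the thread, and the pattern only looks at `#define` lines, so a definition wrapped in a C comment would still be reported as defined.

```shell
# Succeed if HEADER contains a "#define OPTION" line.
# Usage: defines_option HEADER OPTION
defines_option() {
    grep -q -E "^[[:space:]]*#[[:space:]]*define[[:space:]]+$2([[:space:]]|\$)" "$1"
}

header=Projects/Inlet_test/Coupled/inlet_test.h
if [ -f "$header" ]; then
    for opt in SWAN_MODEL SWAN_COUPLING; do
        if defines_option "$header" "$opt"; then
            echo "$opt is defined"
        else
            echo "$opt is not defined"
        fi
    done
fi
```

For the COAWST method described above, the expected result is SWAN_MODEL defined and SWAN_COUPLING not defined.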

ZenKa24 commented 1 month ago

I have built COAWST using the first method. Please check the errors and the build file below.

inlet_test.txt build_coawst_GOT.txt build_ifort_log.txt image

jcwarner-usgs commented 1 month ago

"

CMake Error at CMakeLists.txt:6 (cmake_minimum_required):
CMake 3.12 or higher is required. You are running version 3.5.2
-- Configuring incomplete, errors occurred!
make[1]: Entering directory '/lus/dal/haivan/operation/ROMS/COAWST/SWAN/build'
make[1]: *** No targets specified and no makefile found. Stop.
make[1]: Leaving directory '/lus/dal/haivan/operation/ROMS/COAWST/SWAN/build'"
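The message means the cmake found first on PATH is 3.5.2, older than the 3.12 minimum. A check like the following can be run in the build environment before starting; `version_ge` is a helper name chosen for this sketch, and it relies on `sort -V` from GNU coreutils.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required=3.12   # minimum from the CMake error above
if command -v cmake >/dev/null 2>&1; then
    # First line of "cmake --version" reads "cmake version X.Y.Z".
    found=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$found" "$required"; then
        echo "cmake $found is new enough"
    else
        echo "cmake $found is older than $required; check PATH"
    fi
    command -v cmake   # shows which cmake is first on PATH
fi
```

On a cluster, the newer cmake usually has to be loaded or exported in the same session or job script that runs the build.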

ZenKa24 commented 1 month ago

Sorry, I forgot to export the newest version of cmake. Please see the attached file below for the errors.

build_ifort_log20240607.txt

jcwarner-usgs commented 1 month ago

I see:

"/cray_home/haivan/operation/ROMS/COAWST/SWAN/src_coawst/nctablemd.f90(25): error #7013: This module file was not generated by any release of this compiler. [NETCDF]
use netcdf
------------^"

So that means the netcdf module being loaded was not built by the compiler you are using now. For the SWAN build I see

"- The Fortran compiler identification is Intel 18.0.1.20171018"

and then it lists

"-- Found NetCDF: /opt/cray/pe/netcdf/4.4.1.1.6/GNU/5.1/include (found version "4.4.1.1") found components: Fortran "

So you are using ifort to build but linking against libraries built by gfortran.

What do you have set in the build_coawst.sh? Can you try FORT=gfortran?
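One way to catch this kind of mismatch before building is to compare the compiler chosen for the build with the one the netCDF-Fortran library reports. This is a sketch under assumptions: `same_compiler` is a made-up helper, the `ifort` default is only for illustration, and on Cray systems `nf-config --fc` may report a wrapper such as ftn, in which case the name comparison alone is not conclusive.

```shell
# Succeed if two compiler commands have the same base name,
# e.g. "/usr/bin/gfortran" and "gfortran".
same_compiler() {
    [ "$(basename "$1")" = "$(basename "$2")" ]
}

# FORT as set in build_coawst.sh; the ifort default is only for this sketch.
fort=${FORT:-ifort}
if command -v nf-config >/dev/null 2>&1; then
    nc_fc=$(nf-config --fc)   # compiler used to build netCDF-Fortran
    if same_compiler "$fort" "$nc_fc"; then
        echo "NetCDF-Fortran was built with $nc_fc, matching FORT=$fort"
    else
        echo "Mismatch: FORT=$fort but nf-config reports $nc_fc"
    fi
fi
```

The include path in the log (GNU/5.1 versus INTEL/16.0) gives the same information, so whichever netcdf module or path is picked up should match the FORT setting.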