NOAA-GFDL / MOM6-examples

Example configurations for MOM6 and SIS2

Need help for openmp #231

Open vineetmoraybfab opened 6 years ago

vineetmoraybfab commented 6 years ago

Hi, I want to run the application on Intel's KNL architecture using threads. I built FMS, ocean-only and SIS following the online documentation, passing the flag OPENMP=on to make (together with the icc flags -O3 -axCOMMON-AVX512). The build completes without errors. I then run the application (the ocean_only benchmark with NIGLOBAL = 720, NJGLOBAL = 360 and days = 20, as provided in the benchmark) using:

export OMP_NUM_THREADS=4  
ulimit -s unlimited  
mpirun -np 68 -genv OMP_NUM_THREADS=4 ../../build/gnu/ocean_only_repro/MOM6

I do not get the performance I expected, and it seems I am using only 25% of the CPU. Is the application not using the OpenMP threads, or am I doing something wrong? Thanks
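One way to read the 25% figure is as simple arithmetic. If the node exposes 272 logical CPUs (68 cores with 4 hardware threads each, which is an assumption about this KNL model and is not stated in the post), then 68 MPI ranks that each run a single thread would keep exactly a quarter of them busy. The sketch below only does that calculation; the function name `busy_percent` is hypothetical.

```shell
#!/bin/sh
# Estimate node utilisation (in percent) for a given number of MPI ranks
# and a given number of threads actually running per rank.
# Assumption (not from the post): 272 logical CPUs = 68 cores x 4 hardware threads.
busy_percent() {
    ranks=$1            # MPI ranks on the node (mpirun -np)
    active_threads=$2   # threads actually running in each rank
    logical_cpus=$3     # logical CPUs visible on the node
    echo $(( 100 * ranks * active_threads / logical_cpus ))
}

echo "1 thread per rank:  $(busy_percent 68 1 272)%"
echo "4 threads per rank: $(busy_percent 68 4 272)%"
```

Under that assumption, one thread per rank gives 25%, which matches the observed load. This is consistent with the OpenMP threads not running, or with all threads of a rank being pinned to one logical CPU, but it does not prove either. It may also be worth confirming that the binary launched from `build/gnu/` is the one that was built with OPENMP=on.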

nikizadehgfdl commented 6 years ago

OMP performance in MOM6 is still under investigation/development.

vineetmoraybfab commented 6 years ago

Ok, no problem.. I will be waiting :-) Thanks

Youwei-Ma commented 1 year ago

Hi.

An instability warning, 'btstep: eta has dropped below bathyT', appears at the start of an ocean_only/global benchmark run in a hybrid OpenMP and Open MPI configuration.

The machine is Cheyenne. I successfully compiled FMS2 and MOM6 with intel2022.1, using the flag -qopenmp in the make file and OPENMP=1 in the compile scripts, for example

../../../../src/mkmf/bin/list_paths -l $ROOTDIR/FMS2; \  
../../../../src/mkmf/bin/mkmf -t ../../cheyenne-intel.mk -p libfms.a -c "-Duse_libMPI -Duse_netCDF -DSPMD" path_names)  
(cd /shared_openmp/repro/; make NETCDF=4 OPENMP=1 REPRO=1 libfms.a -j)

and

../../../../src/mkmf/bin/list_paths -l ./ ../../../../src/MOM6/{config_src/infra/FMS2,config_src/memory/dynamic_symmetric,config_src/drivers/solo_driver,config_src/external,src/{*,*/*}}/ ; \  
../../../../src/mkmf/bin/mkmf -t ../../cheyenne-intel.mk -o '-I../../shared_openmp/repro' -p MOM6 -l '-L../../shared_openmp/repro -lfms' path_names)  
(cd ./ocean_only_openmp/repro; make NETCDF=4 REPRO=1 OPENMP=1 MOM6 -j)

I requested 4 nodes, each with 2 sockets and 18 cores per socket, so each node runs 2 MPI processes with 18 threads each (one process per socket). Part of my PBS job script is shown below:

#PBS -l select=4:ncpus=36:mpiprocs=2:ompthreads=18  
module load intel/2022.1  
module load openmpi/4.1.1  
module load netcdf/4.9.0  
....  
ulimit -s unlimited  
mpiexec --map-by ppr:2:node:pe=18 --report-bindings /glade/u/home/youweima/MOM6-examples/build/intel2022.1/ocean_only_openmp/repro/MOM6
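As a sanity check on this layout, the numbers in the `select` line and in `--map-by ppr:2:node:pe=18` can be multiplied out. The sketch below is only arithmetic on the values quoted above; the variable names are hypothetical and nothing here queries PBS or Open MPI.

```shell
#!/bin/sh
# Values copied from the job script: select=4:ncpus=36:mpiprocs=2:ompthreads=18
nodes=4
ncpus=36
mpiprocs=2
ompthreads=18

total_ranks=$(( nodes * mpiprocs ))                 # MPI ranks across the job
cores_needed_per_node=$(( mpiprocs * ompthreads ))  # ranks x threads on one node

echo "total MPI ranks: $total_ranks"
echo "cores needed per node: $cores_needed_per_node (requested: $ncpus)"

# The layout fits if ranks x threads does not exceed the cores requested per node.
if [ "$cores_needed_per_node" -le "$ncpus" ]; then
    layout_status="ok"
else
    layout_status="oversubscribed"
fi
echo "layout: $layout_status"

# Show what the job environment actually contains at run time.
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS:-unset}"
```

With these values the request is self-consistent: 8 ranks in total, and 2 x 18 = 36 cores per node, which matches `ncpus=36`. Printing `OMP_NUM_THREADS` inside the job is a cheap way to confirm that the thread count the model sees is the one intended.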

The fatal message from the output is

WARNING from PE     2: btstep: eta has dropped below bathyT:  -9.7544129261137066E+09 vs.  -4.7494283999060144E+02 at  -6.0500E+01 -7.6250E+01    221      3

FATAL from PE     1: NaN in input field of reproducing_EFP_sum(_2d).
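Since the warning is reported before the fatal error, the runaway `eta` value is probably the earlier symptom and the NaN a later consequence, although that ordering is an inference from these two lines only. A minimal sketch for locating the first such message in a log follows; the helper name `first_problem` is hypothetical, and the demo file contains only the two lines quoted above.

```shell
#!/bin/sh
# Print the first WARNING or FATAL line of a model log, prefixed by its line number.
first_problem() {
    grep -n -m 1 -E 'WARNING from PE|FATAL from PE' "$1"
}

# Demo on a two-line excerpt copied from the messages above.
demo_log=$(mktemp)
cat > "$demo_log" <<'EOF'
WARNING from PE     2: btstep: eta has dropped below bathyT:  -9.7544129261137066E+09 vs.  -4.7494283999060144E+02 at  -6.0500E+01 -7.6250E+01    221      3
FATAL from PE     1: NaN in input field of reproducing_EFP_sum(_2d).
EOF
first_problem "$demo_log"
rm -f "$demo_log"
```

Run against the full output file (`ocean_only_global_test.o8066956.txt`), the same function would show where in the run the first warning occurs, for example whether it is on the very first time step.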

The full output file is attached below.

ocean_only_global_test.o8066956.txt

Is there anything I missed? I would appreciate any ideas about this issue. Thanks!

marshallward commented 1 year ago

@Youwei-Ma It looks like you have correctly compiled the model with OpenMP, so I don't think you have made any error.

More likely, you may have found a bug in the OpenMP implementation somewhere, perhaps an uninitialized variable inside one of the OpenMP loops.

As mentioned by @nikizadehgfdl, we don't actually use OpenMP in production with MOM6. Although some parts have been configured to use it, we do not see any speedup from using OpenMP threads and haven't pursued it any further.

You could submit this as a bug report to the MOM6 repository, but I can't say when we would have a chance to look into it.