Closed mgduda closed 3 weeks ago
@weiwangncar I think you'll be able to reproduce the issue on Derecho by starting with the following commands:
module reset
module load intel/2024.0.2
module load parallel-netcdf
export INTEL_COMPILER_TYPE=ONEAPI
(changing the export to a setenv if you use csh or tcsh) before compiling the master branch with
make intel CORE=init_atmosphere
Then, you can try to run the static interpolation stage with 16 MPI ranks using the files in /glade/derecho/scratch/duda/pr1189/.
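For convenience, the steps above can be collected into a small script. This is only a sketch: the module, export, and make lines come from the comment above, while the `mpiexec` launcher and the `./init_atmosphere_model` executable name are assumptions not stated in this thread, and the `run` and `repro_steps` helpers are hypothetical names. By default it only prints the commands (dry run), and it assumes you have already changed to a run directory containing the files from the path mentioned above.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps. By default each command is only
# printed; set DRY_RUN=0 on a system that provides `module` to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: the build and launch sequence from the comment above.
repro_steps() {
    compiler="${1:-intel/2024.0.2}"   # compiler module named in the thread
    nranks="${2:-16}"                 # MPI rank count named in the thread

    run module reset
    run module load "$compiler"
    run module load parallel-netcdf
    # csh/tcsh users: setenv INTEL_COMPILER_TYPE ONEAPI
    run export INTEL_COMPILER_TYPE=ONEAPI
    run make intel CORE=init_atmosphere

    # ASSUMPTION: launcher and executable name are not given in the thread.
    run mpiexec -n "$nranks" ./init_atmosphere_model
}

repro_steps "$@"
```

Passing a different module as the first argument (for example `intel/2023.2.1`) makes it easy to repeat the same sequence with another compiler version.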
@mgduda It looks like this is a problem with Intel oneAPI. I didn't encounter it before because I usually just use regular Intel/ifort. With Intel oneAPI, the same error appears with the default intel/2023.2.1 as well as with intel/2024.0.2, and the fix works for both.
@weiwangncar If the fix in this PR looks good to you, could you approve this PR?
This PR improves the check on the deallocation of the geotile manager hash table in mpas_geotile_mgr_finalize, resolving apparent issues with this deallocation under some conditions.
In some cases (typically with the Intel oneAPI compilers), parallel remapping of static fields in the init_atmosphere core will fail with a deallocation error message for some MPI ranks. There is apparently a problem in deallocating the hash member of mpas_geotile_mgr_type instances in mpas_geotile_mgr_finalize.
This PR improves the checks on the deallocation of mgr % hash in the mpas_geotile_mgr_finalize routine, making them more stringent. With the modified checks, the deallocation errors no longer occur, suggesting that they were entirely spurious.
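Because the failure shows up only on some MPI ranks, it can help to list which ranks reported an error before and after applying the fix. The sketch below assumes per-rank error logs named `log.init_atmosphere.NNNN.err` in the run directory; that naming is an assumption here and is not stated in this PR, and `failed_ranks` is a hypothetical helper name.

```shell
# List the MPI ranks that left an error log in a run directory.
# ASSUMPTION: error logs are named log.init_atmosphere.NNNN.err;
# adjust the pattern if your build writes logs differently.
failed_ranks() {
    dir="${1:-.}"
    for f in "$dir"/log.init_atmosphere.*.err; do
        [ -e "$f" ] || continue              # glob matched nothing
        name="${f##*/}"                      # strip the directory part
        rank="${name#log.init_atmosphere.}"  # strip the prefix
        rank="${rank%.err}"                  # strip the suffix
        echo "$rank"
    done
}
```

Running `failed_ranks /path/to/run_dir` prints one rank number per line and prints nothing when no error logs are present.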