If you build `nextsim` with `MPI=ON` and run the `ThermoIntegration_test.py` integration test, it produces the following error (on Linux):
Error Output
```
$ python ThermoIntegration_test.py
HDF5-DIAG: Error detected in HDF5 (1.8.21) thread 0:
#000: /tmp/melt/spack-stage/spack-stage-hdf5-1.8.21-dfgfamgva4yyx72fmjpldk33s3e6uap6/spack-src/src/H5A.c line 1638 in H5Aexists(): not a location
major: Invalid arguments to routine
minor: Inappropriate type
#001: /tmp/melt/spack-stage/spack-stage-hdf5-1.8.21-dfgfamgva4yyx72fmjpldk33s3e6uap6/spack-src/src/H5Gloc.c line 193 in H5G_loc(): invalid group ID
major: Invalid arguments to routine
minor: Bad value
HDF5-DIAG: Error detected in HDF5 (1.8.21) thread 0:
#000: /tmp/melt/spack-stage/spack-stage-hdf5-1.8.21-dfgfamgva4yyx72fmjpldk33s3e6uap6/spack-src/src/H5Adeprec.c line 176 in H5Acreate1(): not a location
major: Invalid arguments to routine
minor: Inappropriate type
#001: /tmp/melt/spack-stage/spack-stage-hdf5-1.8.21-dfgfamgva4yyx72fmjpldk33s3e6uap6/spack-src/src/H5Gloc.c line 193 in H5G_loc(): invalid group ID
major: Invalid arguments to routine
minor: Bad value
terminate called after throwing an instance of 'netCDF::exceptions::NcFileMeta'
what(): NetCDF: Can't add HDF5 file metadata
file: ncFile.cpp line:33
[lenny:23295] *** Process received signal ***
[lenny:23295] Signal: Aborted (6)
[lenny:23295] Signal code: (-6)
[lenny:23295] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7d9550bec520]
[lenny:23295] [ 1] /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7d9550c409fc]
[lenny:23295] [ 2] /lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7d9550bec476]
[lenny:23295] [ 3] /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7d9550bd27f3]
[lenny:23295] [ 4] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa2b9e)[0x7d9550e95b9e]
[lenny:23295] [ 5] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae20c)[0x7d9550ea120c]
[lenny:23295] [ 6] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae277)[0x7d9550ea1277]
[lenny:23295] [ 7] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae4d8)[0x7d9550ea14d8]
[lenny:23295] [ 8] /software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.4.0/netcdf-cxx4-4.3.1-kwvtkd7klice2xvellynedzj63eraxly/lib/libnetcdf_c++4.so.1(+0x26a2a)[0x7d9550aa3a2a]
[lenny:23295] [ 9] /software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.4.0/netcdf-cxx4-4.3.1-kwvtkd7klice2xvellynedzj63eraxly/lib/libnetcdf_c++4.so.1(_ZN6netCDF6NcFile5closeEv+0x33)[0x7d9550aab123]
[lenny:23295] [10] /home/melt/sync/cambridge/projects/current/sasip/nextsimdg/build-mpi/libnextsimlib.so(_ZN7Nextsim10ParaGridIO5closeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x4e)[0x7d955276f43e]
[lenny:23295] [11] /home/melt/sync/cambridge/projects/current/sasip/nextsimdg/build-mpi/libnextsimlib.so(_ZN7Nextsim10ParaGridIO13closeAllFilesEv+0x9f)[0x7d955276f50d]
[lenny:23295] [12] /lib/x86_64-linux-gnu/libc.so.6(+0x45495)[0x7d9550bef495]
[lenny:23295] [13] /lib/x86_64-linux-gnu/libc.so.6(on_exit+0x0)[0x7d9550bef610]
[lenny:23295] [14] /lib/x86_64-linux-gnu/libc.so.6(+0x29d97)[0x7d9550bd3d97]
[lenny:23295] [15] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7d9550bd3e40]
[lenny:23295] [16] ../nextsim(+0xc5f5)[0x5a2b7ab945f5]
[lenny:23295] *** End of error message ***
Aborted (core dumped)
E
======================================================================
ERROR: setUpClass (__main__.SingleColumnThermo)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/melt/sync/cambridge/projects/current/sasip/nextsimdg/test/ThermoIntegration_test.py", line 38, in setUpClass
subprocess.run(cls.executable + " --config-file " + cls.config_file, shell=True, check=True)
File "/home/melt/miniconda3/envs/nextsim/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '../nextsim --config-file ThermoIntegration.cfg' returned non-zero exit status 134.
----------------------------------------------------------------------
Ran 0 tests in 45.091s
FAILED (errors=1)
```
~I am investigating why exactly this happens.~
The test fails because it relies on Paragrid, which has not yet been parallelized. The single column test therefore fails when `nextsim` is built with MPI. This should be fixed once the MPI parallelization of Paragrid (#495) is finished.
The single column test runs when `MPI=OFF`.
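Until #495 is finished, one possible stopgap is to skip the test in MPI builds instead of letting `setUpClass` abort. The following is only a sketch: the environment variable `NEXTSIM_MPI` and the helper `built_with_mpi` are hypothetical names invented for illustration, not part of nextsim or its build system, and the real test would need some reliable way to detect how the executable was built.

```python
import os
import unittest


def built_with_mpi(env=os.environ):
    """Return True if the hypothetical NEXTSIM_MPI variable marks an MPI build."""
    # Assumption: the build or CI script exports NEXTSIM_MPI=ON for MPI builds.
    return env.get("NEXTSIM_MPI", "OFF").upper() == "ON"


class SingleColumnThermo(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Raising SkipTest in setUpClass marks every test in the class as skipped.
        if built_with_mpi():
            raise unittest.SkipTest(
                "Paragrid is not yet MPI-parallelized (see #495)"
            )
        # The real test would run the nextsim executable here.

    def test_placeholder(self):
        # Stand-in for the real assertions on the model output.
        self.assertTrue(True)
```

With this in place, an MPI build would report the class as skipped rather than as an error, and the skip can be removed once Paragrid supports MPI.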