MESH-Model / MESH-Dev

This repository contains the official MESH development code, which is the basis for the 'tags' listed under the MESH-Releases repository. The same tags are listed under this repository. Legacy branches and utilities have also been ported from the former SVN (Subversion) repository. Future developments must create 'forks' from this repository.

crash when writing basin output #56

Open mee067 opened 2 months ago

mee067 commented 2 months ago

The same code, compiled a month or two ago, worked. After re-compiling it (with Intel 2018 and Intel 2021), I am getting strange crashes (segmentation faults). I recompiled with debug symbols on, but only the Intel 2021 build gives any clue; the Intel 2018 build gave no information on where the issue is. This is the error dump:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
mpi_sa_mesh_2021_  0000000000B6C8DA  Unknown               Unknown  Unknown
libpthread-2.30.s  00001555554270F0  Unknown               Unknown  Unknown
mpi_sa_mesh_2021_  0000000000AC7B0D  save_basin_output         946  save_basin_output.f90
mpi_sa_mesh_2021_  0000000000B44760  MAIN__                   1031  MESH_driver.f90
mpi_sa_mesh_2021_  000000000040CC92  Unknown               Unknown  Unknown
libc-2.30.so       0000155554726E1B  __libc_start_main     Unknown  Unknown
mpi_sa_mesh_2021_  000000000040CBAA  Unknown               Unknown  Unknown

This is a gridded setup. I first thought it had something to do with the LongSimFix, but it does not. Line 946 of save_basin_output.f90 reads:

                    if (WF_RTE_fstflout%fout_hyd) write(iun, 1010, advance = 'no') &
                    fms%stmg%qomeas%val(i), out%d%grid%qo(fms%stmg%meta%rnk(i))

I did a bit of debugging and found that the rank of the third gauge had gone crazy: it is 1112486707, while the basin has only 3448 active grid cells, so the index used for out%d%grid%qo on that line is far out of bounds.
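A check along the following lines, run just before the output loop, would report the bad rank instead of letting the write segfault. This is only a minimal standalone sketch: the array `rnk` and the constant `n_active` are hypothetical stand-ins for `fms%stmg%meta%rnk` and the number of active grid cells, and the first two rank values are made up for illustration. Building with run-time bounds checking (`-check bounds` with the Intel compiler, `-fcheck=bounds` with gfortran) should catch the same access at line 946.

```fortran
program check_gauge_ranks
    implicit none

    ! Hypothetical stand-ins: in MESH these would come from
    ! fms%stmg%meta%rnk and the count of active grid cells.
    integer, parameter :: n_active = 3448
    integer :: rnk(3) = [120, 2875, 1112486707]
    integer :: i

    ! Flag every gauge whose rank cannot be a valid cell index.
    do i = 1, size(rnk)
        if (rnk(i) < 1 .or. rnk(i) > n_active) then
            write(*, '(a, i0, a, i0, a, i0)') &
                'gauge ', i, ': rank ', rnk(i), ' is outside 1..', n_active
        end if
    end do
end program check_gauge_ranks
```

With the values above, only gauge 3 is reported.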

Any ideas?

mee067 commented 2 months ago

This crash is similar to https://github.com/MESH-Model/MESH-Dev/issues/22

The same stupid number, 1112486707, got assigned to the rank of gauge #3; not sure where or how.
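Since the same value shows up in both issues, it may help to look at its bit pattern rather than its integer value. The standalone sketch below (not MESH code; the program and variable names are made up) prints the 32 bits in hexadecimal and reinterprets them as a single-precision real using the standard `transfer` intrinsic.

```fortran
program decode_rank
    use, intrinsic :: iso_fortran_env, only: int32, real32
    implicit none

    ! The rank value reported for gauge #3.
    integer(int32), parameter :: bad_rank = 1112486707_int32

    ! Hexadecimal view of the stored bits.
    write(*, '(a, z8.8)') 'hex:    ', bad_rank

    ! The same 32 bits read as a single-precision real.
    write(*, '(a, f0.4)') 'real32: ', transfer(bad_rank, 1.0_real32)
end program decode_rank
```

On an IEEE 754 platform the bits are 424F3333, which is the single-precision representation of 51.8. That looks more like a real-valued quantity (a coordinate or a measured value, for instance) than a cell index, so one possibility is that a real is being written into, or read as, the integer rank. That is a guess from the bit pattern alone, not something the trace confirms.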

dprincz commented 1 month ago

@mee067 provided sample setup

mee067 commented 6 days ago

Any progress on this issue? It is holding me back from trying anything new. Compiling works, but running hits the issue very often.