ESCOMP / mizuRoute

Reach-based river routing model
http://escomp.github.io/mizuRoute/
GNU General Public License v3.0

Segmentation fault after compiling with openmpi #471

Closed: pyested closed this issue 1 week ago

pyested commented 1 week ago

Hi, thank you for this nice tool. So far I have been able to run it, even for the catchments that I am working with. I would now like to speed up the computation, but I am having trouble running the simulation. I compiled mizuRoute with isOpenMP=yes and the executable is generated successfully. However, whenever I use the executable compiled this way, I get a segmentation fault.

Any ideas why this is happening? OpenMPI is installed on my computer (Ubuntu 22). Below is the output from running mizuRoute for the Cameo test case.

---- read control file ---
--> ./ancillary_data/
--> ./input/
--> ./output/
--> v1.2_case3
--> 1950-01-01 12:00:00
--> 1950-12-31 12:00:00
--> 0
--> last
--> ntopo_nhdplus_cameo_pfaf.nc
--> seg
--> hru
--> -9999
--> RUNOFF_case3.nc
--> RUNOFF
--> time
--> lon
--> lat
--> time
--> mm/s
--> 86400
--> T
--> spatialweights_grid12km_nhdplus-cameo.nc
--> polyid
--> weight
--> i_index
--> j_index
--> overlaps
--> polyid
--> data
--> param.nml.default
--> HRU_AREA
--> Length
--> So
--> idFeature
--> hru2seg
--> link
--> to
---- calendar ---
calendar will be read from RUNOFF_case3.nc
time_unit will be read from RUNOFF_case3.nc
---- runoff unit ---
runoff unit is provided as: mm/s
WARNING: routOpt=0 is accumRunoff option now. 12 is previous 0 now
---- Read river network data ---
Reading HRU_AREA into structure HRU
Reading idFeature into structure HRU2SEG
Reading hru2seg into structure HRU2SEG
Reading Length into structure SEG
Reading So into structure SEG
Reading link into structure NTOPO
Reading to into structure NTOPO
./runoff_route.submit: line 15: 42900 Segmentation fault (core dumped) ./bin/mizuroute.exe settings/v1.2/testCase_cameo_case${case}.control

nmizukami commented 1 week ago

Hi,

I need more information from you.

Which branch are you using? Neither main nor develop uses OpenMPI, but both can use OpenMP. Note that OpenMPI and OpenMP are different things.

How exactly did you compile? One suggestion is to try compiling with MODE=debug to see if you get more information.

How are you running it? runoff_route.submit probably uses a job scheduler (e.g., Torque or Slurm), which is typical on an HPC system or cluster; a PC or laptop may not have one. This is probably not the cause of the segmentation fault, though.

Were you able to run the testCase without OpenMP (or OpenMPI)? Also, which testCase are you using? You should use https://zenodo.org/records/10108930.

pyested commented 1 week ago

Hi, thank you for your answer.

I'm using the main branch and the testCase v1.2 (https://zenodo.org/records/7884836). I can run it without OpenMP, but when I compile with OpenMP I get the segmentation fault at run time.

As for the compilation, you were right: I had confused OpenMPI with OpenMP, which are different. In any case, I'm using gfortran with the -fopenmp flag, after setting isOpenMP = yes and debug mode. I have attached the log from the compilation with OpenMP.

Running mizuRoute itself is not a problem: I run it from the terminal without submitting a job and, as I said, it works well without OpenMP.

logmizu.txt

nmizukami commented 1 week ago

Hi, two things I can say are:

  1. Check the stack size on your computer with ulimit -a. It might be too small, so try increasing it (or even setting it to unlimited); you can google "linux ulimit". This is just a Linux command and is not related to the Fortran code, but code using OpenMP may need a larger stack. I am not a computer scientist, so I cannot explain much about this kind of thing, but I have seen this kind of behavior before.
  2. You will also need to set some OpenMP environment variables, e.g., export OMP_NUM_THREADS=5 if you want to use 5 threads. You can learn about OpenMP from Google too.
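
The two suggestions above can be sketched as a short shell session. This is only an illustration, not a setup confirmed in this thread: OMP_STACKSIZE is a standard OpenMP environment variable that sets the stack size of worker threads, but it was not discussed here, and the 512M value is an arbitrary assumption that may or may not be needed on a given machine.

```shell
# Show the current soft stack limit (in kB, or "unlimited").
ulimit -s

# Try to lift the soft limit for this shell session only.
# If the hard limit forbids it, report the hard limit instead of failing.
ulimit -s unlimited 2>/dev/null \
  || echo "Could not set unlimited; hard limit is $(ulimit -H -s)" >&2

# Number of OpenMP threads (5 is the example value from the suggestion above).
export OMP_NUM_THREADS=5

# ASSUMPTION: per-thread stack size for OpenMP worker threads.
# Not mentioned in this thread; 512M is an illustrative value only.
export OMP_STACKSIZE=512M

# Then start the model from this same shell so it inherits the settings,
# using the same mizuroute.exe command as before.
```

The ulimit change applies only to the current shell and the processes started from it, so it has to be repeated (or added to a startup file) in each new session.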

pyested commented 1 week ago

Thanks a lot! After setting the stack size to unlimited (ulimit -s unlimited) I'm able to run it after compiling with OpenMP. That solves my question :)

I have another question, unrelated to this, but maybe you can answer it here. I'm trying to find parameter ranges for mann_n and wscale in order to calibrate them. As far as I know, these are used in KWT as well as in KW, MC and DW. Are there any recommendations for these parameters? Based on some literature research I've come up with the following ranges, but maybe you have a better recommendation:

mann_n: [0.024, 0.075] (from Cortés-Salazar et al., 2023)
wscale_n: [0.0005, 0.01] (from the values in your paper)

nmizukami commented 1 week ago

I think the parameter ranges are ok. You may want to check the river reach width computed based on wscale_n after the calibration.
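
A rough way to run that width check is sketched below. It ASSUMES the width relation width = wscale * sqrt(upstream area in m^2); this relation is not stated anywhere in this thread, so verify it against the mizuRoute source or documentation before relying on it. The function name reach_width and the sample upstream areas are hypothetical.

```shell
# reach_width WSCALE AREA_M2
# Prints the channel width in metres under the ASSUMED relation
#   width = wscale * sqrt(upstream_area_m2)
reach_width() {
  awk -v a="$1" -v area="$2" 'BEGIN { printf "%.2f\n", a * sqrt(area) }'
}

# Evaluate both ends of the proposed calibration range
# for two hypothetical upstream areas.
for wscale in 0.0005 0.01; do
  for area_km2 in 100 10000; do
    # Convert km^2 to m^2 before applying the relation.
    area_m2=$(awk -v k="$area_km2" 'BEGIN { printf "%.0f", k * 1e6 }')
    printf 'wscale=%s area=%s km2 width=%s m\n' \
      "$wscale" "$area_km2" "$(reach_width "$wscale" "$area_m2")"
  done
done
```

Under that assumed relation, an upstream area of 100 km2 gives a width of about 5 m at wscale = 0.0005 and about 100 m at wscale = 0.01, which can be compared with observed widths for the calibrated reaches.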

pyested commented 1 week ago

Thanks!

nmizukami commented 1 week ago

Hi, actually mann_n could go higher than 0.075, maybe up to 0.3.