NCAR / tiegcm

seg fault when running benchmark run dec2006_heelis_gpi #20

Open mfleduc opened 2 weeks ago

mfleduc commented 2 weeks ago

Hello! I am having an issue running the dec2006_heelis_gpi benchmark in an interactive job on derecho. I have followed the documentation to ensure that the proper modules, Python packages, etc. are loaded in my conda environment.

Upon running the command

python $TIEGCMHOME/tiegcmrun/tiegcmrun.py -bench dec2006_heelis_gpi -c -e

I am prompted to use the command

mpirun $TIEGCMHOME/exec/tiegcm.exe $TIEGCMHOME/stdout/dec2006_heelis_gpi_2.5x0.25.inp

consistent with the data directory that I have supplied through $TIEGCMDATA. I have attached the input file as well. Running this command makes a copy of the source file in the /hist/ directory with a few other fields (corresponding to input files), and then the model segfaults.

I have already tried setting ulimit -s unlimited as advised in other places and have not had any success. I have the same issue if I submit the decsol_smin job as a batch job. The .inp and output files are attached as well.

Has anyone else seen this issue?

dec2006_heelis_gpi_2.5x0.25.inp.txt error from tiegcm.txt decsol_smax_2.5x0.25.inp.txt decsol_smax.out.txt

hzfywhn commented 2 weeks ago

I notice that you are using a different data directory from the one we host. Can you try /glade/campaign/hao/itmodel/tiegcm3.0/data to see if the run proceeds?

hzfywhn commented 2 weeks ago

Are you trying to do a REMIX-driven TIEGCM run? If so, please check that "msphere.mix.h5" is in the same folder as the TIEGCM executable. If not, comment out the MIXFILE line in the input file.

On Thu, Aug 29, 2024 at 3:41 PM Matthew LeDuc wrote:

Thank you! The seg fault still occurs if I use data from that folder.

mfleduc commented 2 weeks ago

The run still segfaults if I use the suggested directory and comment out the MIXFILE line in the input file. I do not have a file of that name anywhere. Should it have been generated automatically?

hzfywhn commented 2 weeks ago

MIXFILE is an optional input. I completed a run successfully using your input file, so I now believe your failing run is due to a corrupted source file. Please regenerate it using tiegcmrun based on /glade/campaign/hao/itmodel/tiegcm3.0/data/2.5x0.25_z11/decsol_f70.nc

AnonNick commented 2 weeks ago

The MIXFILE seems to have a faulty default. I'll push a fix to not populate that parameter. You should be able to remove / comment out that line and do a run. Thank you for figuring it out.
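For anyone who wants to script the workaround, here is a minimal sketch that drops the MIXFILE assignment from the text of an input file. It assumes the parameter appears as a single "NAME = value" assignment on its own line; the helper name drop_parameter and the sample text are hypothetical and not part of tiegcmrun.

```python
import re


def drop_parameter(inp_text: str, name: str = "MIXFILE") -> str:
    """Return inp_text without any line that assigns `name`.

    Assumption: each parameter is assigned on its own line, e.g.
    "MIXFILE = 'msphere.mix.h5'". Multi-line values are not handled.
    """
    # Match the parameter name at the start of a line (leading whitespace
    # allowed, case-insensitive), followed by an equals sign.
    pattern = re.compile(rf"^\s*{re.escape(name)}\s*=", re.IGNORECASE)
    kept = [
        line
        for line in inp_text.splitlines(keepends=True)
        if not pattern.match(line)
    ]
    return "".join(kept)


# Hypothetical input text, for illustration only.
sample = (
    "&tgcm_input\n"
    " SOURCE = 'decsol_f70.nc'\n"
    " MIXFILE = 'msphere.mix.h5'\n"
    " STEP = 30\n"
    "/\n"
)
cleaned = drop_parameter(sample)
```

Writing the cleaned text to a copy of the .inp file, rather than overwriting the original, makes it easy to compare the two runs.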

Due to a recent update, the source files that were previously used for TIEGCM are showing errors. You should find the list of working SOURCE files here: /glade/campaign/hao/itmodel/tiegcm3.0/data/2.5x0.25_z11/ For benchmarks whose name ends in _smax, use the _f200.nc files; for those ending in _smin, use the _f70.nc files.
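As a rough illustration of that naming rule, the sketch below maps a benchmark name to a candidate SOURCE path. It assumes the file name is the benchmark's prefix plus the suffix (so decsol_smin maps to decsol_f70.nc, matching the file mentioned earlier in this thread); suggest_source is a hypothetical helper and not the fix planned for dev_engage.

```python
import posixpath

# Directory quoted in this thread.
DATA_DIR = "/glade/campaign/hao/itmodel/tiegcm3.0/data/2.5x0.25_z11"

# Rule from this thread: _smax -> _f200.nc, _smin -> _f70.nc.
SUFFIX_BY_CONDITION = {"smax": "_f200.nc", "smin": "_f70.nc"}


def suggest_source(benchmark: str, data_dir: str = DATA_DIR):
    """Return a candidate SOURCE path, or None if the rule does not apply.

    Assumption: the file name is "<prefix><suffix>", where <prefix> is the
    benchmark name without its trailing _smax or _smin.
    """
    prefix, _, condition = benchmark.rpartition("_")
    suffix = SUFFIX_BY_CONDITION.get(condition)
    if not prefix or suffix is None:
        return None
    return posixpath.join(data_dir, prefix + suffix)
```

A name such as dec2006_heelis_gpi ends in neither _smax nor _smin, so the function returns None and the SOURCE file has to be picked by hand. It is also worth checking that the suggested file actually exists in the directory before using it.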

Following benchmarks are working: Seasons:

The following benchmarks are not available at the moment due to this issue. Storms:

Climatology:

A fix that automatically suggests SOURCE files to the user is in the works in the dev_engage branch.

mfleduc commented 2 weeks ago

Thank you both very much! I will post an update once I have everything working, although that probably won't be until Tuesday.