alphaparrot / ExoPlaSim

Exoplanet Planet Simulator (PlaSim extended for different planet types (including tidally-locked) and evolution on geological timescales--glaciers and carbon cycle)
GNU General Public License v2.0

Crashing when using T42 #23

Open dogevspenguin opened 4 months ago

dogevspenguin commented 4 months ago

After running T21, I decided the output was too blurry, so I regenerated the .sra files, this time for T42. After running for about 1 year, the model crashes. I generated the height map in BMP format using Torben's planet map generator, converted the BMP to PNG, and then used https://github.com/OstimeusAlex/ExoPlaSim-InCon to generate the .py file and the .sra files.

The code

import exoplasim as exo
T = exo.Model(workdir="T",modelname="T",inityear=0,outputtype=".nc",ncpus=16,precision=8,resolution="T42",layers=7)
T.configure(startemp=5772.0,flux=1368.0,
                  year=280.191,eccentricity=0.02,obliquity=25.0,lonvernaleq=100.0,fixedorbit=True,
                  rotationperiod=0.9047,
                  gravity=10.738,radius=1.047,
                  wetsoil=False,seaice=True,oceanzenith="ECHAM-3",
                  landmap="SRA/T_surf_0172.sra",topomap="SRA/T_surf_0129.sra",
                  pressure=1.0,gascon=287.0,drycore=False,ozone=False,
                  pH2=0.0,pHe=0.0,pN2=0.7809,pO2=0.2095,pAr=0.0093,pNe=0.0,pKr=0.0,pH2O=0.0,pCO2=0.0,
                  glaciers={'toggle': True,'mindepth': 2.0,'initialh': 0.0},
                  timestep=45.0,runsteps=8966,otherargs={'NSTPW@plasim_namelist':'160'})
T.exportcfg()
T.run(years=30,crashifbroken=False)
T.finalize("T-o",allyears=False,keeprestarts=False)

And the error

[...] Caught signal 8 (Floating point exception: floating-point overflow)
[...] Caught signal 8 (Floating point exception: floating-point overflow)
==== backtrace (tid:   7465) ====
 0 0x0000000000016910 __funlockfile()  ???:0
 1 0x000000000007f4c5 __ieee754_exp_fma()  ???:0
 2 0x000000000004781f __GI___exp()  ???:0
 3 0x00000000004108a6 mklsp_()  ???:0
 4 0x000000000041e40d rainstep_()  ???:0
 5 0x0000000000449ec7 gridpointd_()  ???:0
 6 0x0000000000451d19 master_()  ???:0
 7 0x00000000004022c4 main()  ???:0
 8 0x000000000003524d __libc_start_main()  ???:0
 9 0x000000000040232a _start()  /home/abuild/rpmbuild/BUILD/glibc-2.31/csu/../sysdeps/x86_64/start.S:120
=================================

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
==== backtrace (tid:   7474) ====
 0 0x0000000000016910 __funlockfile()  ???:0
 1 0x000000000007a4b0 xflow()  ???:0
 2 0x000000000004781f __GI___exp()  ???:0
 3 0x0000000000418fb4 kuo_()  ???:0
 4 0x000000000041e475 rainstep_()  ???:0
 5 0x0000000000449ec7 gridpointd_()  ???:0
 6 0x0000000000451d19 master_()  ???:0
 7 0x00000000004022c4 main()  ???:0
 8 0x000000000003524d __libc_start_main()  ???:0
 9 0x000000000040232a _start()  /home/abuild/rpmbuild/BUILD/glibc-2.31/csu/../sysdeps/x86_64/start.S:120
=================================

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7f4113449640 in ???
#1  0x7f4113448873 in ???
#2  0x7f411341e90f in ???
#3  0x7f41131334c5 in ???
#4  0x7f41130fb81e in ???
#5  0x4108a5 in ???
#6  0x41e40c in ???
#7  0x449ec6 in ???
#8  0x451d18 in ???
#9  0x4022c3 in ???
#10  0x7f411283e24c in ???
#11  0x402329 in _start
        at ../sysdeps/x86_64/start.S:120
#12  0xffffffffffffffff in ???
--------------------------------------------------------------------------
mpiexec noticed that process rank 11 with PID 7465 on node DESKTOP-O671JUH exited on signal 8 (Floating point exception).
--------------------------------------------------------------------------
Command '['mpiexec -np 16 most_plasim_t42_l7_p16.x']' returned non-zero exit status 136.
Traceback (most recent call last):
  File "/home/peera/.local/lib/python3.6/site-packages/exoplasim/__init__.py", line 930, in _run
    subprocess.run([self._exec+self.executable],shell=True,check=True)
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['mpiexec -np 16 most_plasim_t42_l7_p16.x']' returned non-zero exit status 136.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "T.py", line 14, in <module>
    T.run(years=30,crashifbroken=False)
  File "/home/peera/.local/lib/python3.6/site-packages/exoplasim/__init__.py", line 495, in run
    self._run(**kwargs)
  File "/home/peera/.local/lib/python3.6/site-packages/exoplasim/__init__.py", line 1027, in _run
    self._crash() #Bring in the cleaners
  File "/home/peera/.local/lib/python3.6/site-packages/exoplasim/__init__.py", line 1869, in _crash
    raise RuntimeError("ExoPlaSim has crashed or begun producing garbage. All working files have been moved to %s_crashed/"%(os.getcwd()+"/"+self.modelname))
RuntimeError: ExoPlaSim has crashed or begun producing garbage. All working files have been moved to /home/peera/ExoPlaSim-InCon-master/T_crashed/
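
For reading the log above: the wrapper launches mpiexec through a shell, and a shell-style exit status above 128 conventionally means "killed by signal (status - 128)". A minimal sketch of decoding the 136 reported here; the helper name is hypothetical and not part of exoplasim:

```python
import signal


def signal_from_exit_status(status):
    """Return the signal name for a shell-style exit status above 128, else None.

    Hypothetical helper for illustration; not part of exoplasim.
    """
    if status > 128:
        try:
            return signal.Signals(status - 128).name
        except ValueError:
            # Not a signal number known on this platform.
            return None
    return None


# 136 - 128 = 8, which is SIGFPE on Linux.
print(signal_from_exit_status(136))
```

This matches the "Caught signal 8 (Floating point exception)" lines earlier in the log: the non-zero exit status is a consequence of the floating-point overflow, not a separate failure.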

alphaparrot commented 4 months ago

45 minutes is almost always too long a timestep for T42. Try dropping it to 15 minutes. Also, I don't recommend using fewer layers than the default 10: the 10-layer setup has a predefined vertical tuning structure that tends to be well-behaved, while there is no prescribed hyperdiffusion for 7 layers, which makes the model more prone to numerical instability.
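
When the timestep changes, the step-count settings in the script have to be rescaled with it. The values used in the scripts in this thread are consistent with runsteps = year (in 24-hour days) x 1440 / timestep, and with NSTPW spanning 5 days (8966 and 160 at 45 minutes; 35064 and 480 at 15 minutes). That relationship is inferred from those numbers, not taken from the ExoPlaSim documentation, so treat the following as a sketch under that assumption. The helper name and the 5-day default are hypothetical:

```python
def steps_for_timestep(year_days, timestep_min, write_interval_days=5.0):
    """Rescale step counts for a new timestep.

    Assumes (inferred from the scripts in this thread, not from the docs)
    that runsteps covers one year of 24-hour days and that NSTPW covers
    a fixed number of days.
    """
    minutes_per_day = 24 * 60
    runsteps = int(round(year_days * minutes_per_day / timestep_min))
    nstpw = int(round(write_interval_days * minutes_per_day / timestep_min))
    return runsteps, nstpw


# Original 45-minute setup for the 280.191-day year.
print(steps_for_timestep(280.191, 45.0))
# Same planet at the suggested 15-minute timestep.
print(steps_for_timestep(280.191, 15.0))
```

The resulting pair would go into `timestep=15.0, runsteps=...` and `otherargs={'NSTPW@plasim_namelist': '...'}` in `T.configure`, together with `layers=10` in `exo.Model`.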

dogevspenguin commented 2 months ago

Hi, thanks for the fix. Now, though, it seems that I cannot run with any storm climatologies enabled: the model crashes with the same floating-point error.


import exoplasim as exo
T =exo.Model(workdir="T2",modelname="T2",inityear=0,outputtype=".nc",ncpus=16,precision=8,resolution="T42",crashtolerant=True,layers=10)
T.configure(startemp=5772.0,flux=1367.0,
                  year=365.25,eccentricity=0.016715,obliquity=23.441,lonvernaleq=102.7,fixedorbit=True,
                  rotationperiod=1.0,
                  gravity=9.80665,radius=1.0,
                  wetsoil=False,seaice=True,oceanzenith="ECHAM-3",
                  landmap="SRA/T_surf_0172.sra",topomap="SRA/T_surf_0129.sra",
                  pressure=1.0,gascon=287.0,drycore=False,ozone=False,
                  pH2=0.0,pHe=0.0,pN2=0.7809,pO2=0.2095,pAr=0.0093,pNe=0.0,pKr=0.0,pH2O=0.0,pCO2=0.0003,
                  glaciers={'toggle': True,'mindepth': 2.0,'initialh': -1.0},
                  timestep=15.0,runsteps=35064,otherargs={'NSTPW@plasim_namelist':'480'},stormcapture={'toggle': 1,'NKTRIGGER': 1},highcadence={'toggle': 1,'start': 320,'interval': 4,'end': 576},stormclim=True)
T.exportcfg()
T.runtobalance(threshold=0.0005,baseline=10,maxyears=100,minyears=10,crashifbroken=True,clean=True)
T.finalize("T2",allyears=False,keeprestarts=False,clean=True)