EDmodel / ED2

Ecosystem Demography Model

Budget check has failed when a planting event #332

Open koxingazhu opened 3 years ago

koxingazhu commented 3 years ago

The ED2 run crashed with the error messages "Budget check has failed", "Sub-daily budget failed" and "CARBON_FINE: F". I tried reducing NL%RK4_TOLERANCE from 0.01 to 0.0001 (because the error messages mention the RK4 integrator), but it did not help. Has anyone experienced this crash, and do you know how to solve it? Thank you!

......

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::


                 !!! FATAL ERROR !!!                      

---> File:        budget_utils.f90
---> Subroutine:  compute_budget
---> Reason:      Budget check has failed, see message above!

ED execution halts (see previous error message)...

koxingazhu commented 3 years ago

I am sorry for the poor formatting in the issue; I do not know how to fix the formatting errors. The year is 2023 (2023), doy is 121 (121), event is planting (......), pft=2 (2), density=1.000 (1.000).

mdietze commented 3 years ago

I don't have time to debug, but I'll highlight a bit of text in the error message above:

Budget failure doesn't necessarily mean a problem in the RK4 or Euler integrators. If you see NaN in any variable above, and the simulation time is the first day of the month near 00UTC, then it is very likely that some variable has not been properly initialised in the patch or cohort dynamics (e.g. new recruit or new patch). The best way to spot the error is to compile the model with strict debugging options and checks.

If the crash is occurring during planting, then it is more likely to be an initialization issue than an integration issue. Even if it's not a completely uninitialized value (NaN), it could be that something is being set up for the new cohort in a way that is unrealistic or physically implausible, which then causes the integrator to go nuts. You'd want to take a closer look at what's triggering the failure to try to figure out what the issue is.
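As a rough aid for the "look for NaN in the variables" step, the budget dump can be scanned automatically. The following is a minimal Python sketch, not part of ED2. It assumes that each variable in the real log sits on its own line in the form `NAME : value`; the function name `find_suspect_values` and the `EXAMPLE_VAR` line are hypothetical and only there for illustration.

```python
import math
import re

# Matches one "NAME : value" line, e.g. "LAI : 3.1097009E+00".
# Assumption: one variable per line in the real log; lines with several
# values after the colon (such as TIME) are skipped on purpose.
_PAIR = re.compile(r"^\s*([A-Z0-9_.]+)\s*:\s*(\S+)\s*$")


def find_suspect_values(log_text):
    """Return (name, raw_value) pairs whose value is NaN or infinite."""
    suspects = []
    for line in log_text.splitlines():
        match = _PAIR.match(line)
        if match is None:
            continue
        name, raw = match.groups()
        try:
            value = float(raw)
        except ValueError:
            continue  # logical flags such as "F" or "T" are not numeric
        if not math.isfinite(value):
            suspects.append((name, raw))
    return suspects


# Illustrative input: the first lines mimic the dump format,
# the EXAMPLE_VAR line is made up to show a hit.
sample = """\
AGE : 8.3333336E-02
LAI : 3.1097009E+00
EXAMPLE_VAR : NaN
CARBON_FINE : F
"""
suspects = find_suspect_values(sample)
print(suspects)
```

A clean result only rules out NaN and infinite values. A finite but physically implausible value for the new cohort would still need to be checked by eye.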

Also, as an aside, I see that your carbon budget numbers are non-zero for storage respiration -- I strongly advise against ever turning storage respiration on as it's not something that has a physiological basis.

koxingazhu commented 2 years ago

@mdietze Thank you for your suggestions. Following your advice, if I understand you correctly, I have set NL%STORAGE_RESP_SCHEME = 1 (ED-2.2 default, Storage respiration is calculated as in 0......) and debugged the code. I found that the model crashes when it reaches the OpenMP parallel region ( !$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE( & ) within the rk4_driver::rk4_timestep function. Does OpenMP trigger the failure?

I also looked at the disturbance::plant_patch function while debugging to try to find the initialization issue, but the dbh, hgt, and agb of the new cohort looked normal.

When I set the planting PFT to 2, 3, 4, 7, 10, or 11, ED crashed in every case. However, ED runs successfully if I set the planting PFT to 9 (cold-deciduous early hardwood).
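On the OpenMP question above, one generic way to check whether threading plays a role is to repeat the run with a single OpenMP thread and see whether the budget failure still occurs. The sketch below only prepares the environment for such a run. `OMP_NUM_THREADS` is the standard OpenMP environment variable; the helper name `single_thread_env` and the `ed2_command` placeholder are hypothetical.

```python
import os


def single_thread_env(base_env=None):
    """Return a copy of the environment with OpenMP limited to one thread."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = "1"
    return env


# Small demonstration with an explicit base environment.
env = single_thread_env({"PATH": "/usr/bin", "OMP_NUM_THREADS": "8"})
print(env["OMP_NUM_THREADS"])

# To launch the model with this environment (ed2_command is a placeholder
# for however the executable is normally started on your system):
#   import subprocess
#   subprocess.run(ed2_command, env=single_thread_env(), check=True)
```

If the failure persists with one thread, a race condition in the parallel region is unlikely to be the cause, which would be consistent with the initialization explanation given earlier in the thread.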

Debug info:
......
(gdb) 336 rk4site%atm_rhv = rehuil8(rk4site%atm_prss,rk4site%atm_tmp,rk4site%atm_shv,.true.)
(gdb) 337 rk4site%atm_rhos = idealdenssh8(rk4site%atm_prss,rk4site%atm_tmp,rk4site%atm_shv)
(gdb) 341 return
(gdb) 342 end subroutine copy_met_2_rk4site
(gdb) rk4_driver::rk4_timestep (cgrid=<error reading variable: value requires 68056 bytes, which is more than max-value-size>) at rk4_driver.F90:137
137 !$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE( &
(gdb)
--------------------------------------------------
!!! Sub-daily budget failed !!!

TIME : 2004 07 06 0.
IPA : 1
DIST_TYPE : 3
N_LEAF_RESOLVABLE : 17
N_WOOD_RESOLVABLE : 15
NLEV_SFCWATER : 0

AGE : 8.3333336E-02
LAI : 3.1097009E+00
WAI : 9.5453674E-01
VEG_HEIGHT : 9.9338026E+00
......