Closed ckoven closed 5 years ago
The relative error in the patches is definitely too large; it's not a case of "the error checker is too strict". I am noticing that the crown areas are really massive. EDIT: I thought it was printing the per-plant cohort area... redacting this statement.
Some are about the size of the patch itself (~5k m2), for instance the last cohort printed. It is possible that, with such massive crowns, the math precision gets compromised when we pare off the number density for promotion/demotion...
@ckoven, I'm also noticing that there are 3 layers, each of which is roughly the size of the patch. I'm curious, did you bump up nclmax to 3?
Just discussed this with @rgknox. While the conditions here that are causing the crash are clearly unrealistic (trees that are gigantic), there may be something more relevant to general conditions here in what seems to be happening. What I think is happening is that in the midst of the canopy structure loop there is a cohort fusion step. Because crown area is not conserved during cohort fusion, this then leads to total crown areas not being what was anticipated from the promotion/demotion. There is an iteration loop to allow this to be sorted out, but in the case of these big trees (possibly only relevant when trees in canopy layers 2 or 3 are large), it has to iterate several times and so is running into the max iteration logic.
So the key question is whether the lack of crown area conservation is important here more generally. Now, when we fuse cohorts, we conserve biomass (important!) and take a weighted average of DBH. Because crown area is DBH to some power, we can't conserve crown area. Because biomass is DBH to some (larger) power, we also don't conserve the relationship between DBH and biomass, but the flexible allometry scheme allows us to temporarily go off allometry and then grow back towards allometry on subsequent timesteps. A possible approach to this problem would be to conserve the total crown area during fusion, and calculate the DBH of the fused cohort as the value that conserves crown area, instead of making the fused cohort's DBH a population-weighted average of the fusing cohorts' DBH. This should avoid the iteration loop and give us exactly the crown areas that we seek.
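To make the two fusion options concrete, here is a minimal Python sketch (not the FATES Fortran). It assumes a simple power-law allometry, crown area per plant = d2ca * dbh**exponent, with the d2ca and d2bl2 + d2bl_ediff values printed in the log in this thread; that form reproduces the logged `coh carea` values, but treat it as an assumption here. The function names are hypothetical, the two input cohorts are just borrowed from the log as example numbers (nothing says FATES would actually fuse them), and biomass conservation is ignored.

```python
# Assumed power-law crown allometry; parameter values copied from the log.
D2CA = 0.479339092969894      # d2ca_min == d2ca_max in the log
EXPONENT = 1.53530120849609   # d2bl2 + d2bl_ediff in the log


def crown_area(n, dbh):
    """Total crown area of a cohort with number density n and diameter dbh."""
    return n * D2CA * dbh ** EXPONENT


def fuse_mean_dbh(n1, dbh1, n2, dbh2):
    """Fuse two cohorts using a number-weighted mean DBH."""
    n = n1 + n2
    return n, (n1 * dbh1 + n2 * dbh2) / n


def fuse_conserve_crown_area(n1, dbh1, n2, dbh2):
    """Fuse two cohorts, solving for the DBH that keeps total crown area."""
    n = n1 + n2
    total = crown_area(n1, dbh1) + crown_area(n2, dbh2)
    dbh = (total / (n * D2CA)) ** (1.0 / EXPONENT)
    return n, dbh


# Example inputs: (n, dbh) of two layer-3 cohorts from the log.
a = (7.853584316935310e-2, 856.240810343689)
b = (0.255160873131248, 753.528330790387)

before = crown_area(*a) + crown_area(*b)
after_mean = crown_area(*fuse_mean_dbh(*a, *b))
after_ca = crown_area(*fuse_conserve_crown_area(*a, *b))

print("crown area before fusion:       ", before)
print("after number-weighted mean DBH: ", after_mean)
print("after crown-area-conserving DBH:", after_ca)
```

Because the exponent is greater than 1, the mean-DBH fusion always loses some crown area when the two DBHs differ, which is the kind of mismatch the promotion/demotion iteration then has to absorb. The crown-area-conserving DBH comes out slightly larger than the mean DBH.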
The short term fix that I'm exploring for this particular case (which, again, is unrealistic) is to leave the logic as-is and instead just increase the number of iterations permitted. But we may want to revisit the question of whether or not we should try to conserve crown area during cohort fusion in general.
Another thing to consider is that we force the upper layers to be "perfectly" plastic. Ideally, we would have a canopy that is imperfect, which allows for some gap space between crowns, and a radiation transfer that allows for some bypass of radiation through each layer.
If we code this in, then we may be able to get away with less precision? ... because the promotion/demotion error will be subsumed by the open gap area... thinking extemporaneously here
Just to say that increasing the number of iterations from 10 to 20 doesn't solve this; I still get a crash in the same place. Log file pasted below. The differences start at the boundary between the first and second canopy layers:
clm: completed timestep 433439
clm: completed timestep 433440
clm: calling FATES model 433441
FATES Dynamics: 26-02-26
PATCH AREA CHECK NOT CLOSING
patch area: 5298.27184704583
layer: 1 area: 5297.50091423565
rel error: -1.455064655875225E-004
layer: 2 area: 5298.27193345459
rel error: 1.630885784912938E-008
layer: 3 area: 5282.08435213392
rel error: -3.055240534881608E-003
lat: 9.15300000000000
lon: 280.153900000000
spread: 0.000000000000000E+000
coh ilayer: 1
coh dbh: 1476.16889080131
coh pft: 1
coh n: 2.233505020155443E-003
coh carea: 78.5612143846136
maxh: 99999.0000000000
lmode: 3.00000000000000
d2bl2: 1.53530120849609
d2bl_ediff: 0.000000000000000E+000
d2ca_min: 0.479339092969894
d2ca_max: 0.479339092969894
coh ilayer: 1
coh dbh: 1358.23389256322
coh pft: 1
coh n: 1.340103012093285E-002
coh carea: 414.801832610886
maxh: 99999.0000000000
lmode: 3.00000000000000
d2bl2: 1.53530120849609
d2bl_ediff: 0.000000000000000E+000
d2ca_min: 0.479339092969894
d2ca_max: 0.479339092969894
coh ilayer: 1
coh dbh: 1272.06800083262
coh pft: 1
coh n: 3.126907028217620E-002
coh carea: 875.218072143910
maxh: 99999.0000000000
lmode: 3.00000000000000
d2bl2: 1.53530120849609
d2bl_ediff: 0.000000000000000E+000
d2ca_min: 0.479339092969894
d2ca_max: 0.479339092969894
coh ilayer: 1
coh dbh: 1179.37472898808
coh pft: 1
coh n: 6.030463554419742E-002
coh carea: 1502.81075570403
maxh: 99999.0000000000
lmode: 3.00000000000000
d2bl2: 1.53530120849609
d2bl_ediff: 0.000000000000000E+000
d2ca_min: 0.479339092969894
d2ca_max: 0.479339092969894
coh ilayer: 1
coh dbh: 1078.52925814784
coh pft: 1
coh n: 0.111675251007774
coh carea: 2426.10903939221
maxh: 99999.0000000000
lmode: 3.00000000000000
d2bl2: 1.53530120849609
d2bl_ediff: 0.000000000000000E+000
d2ca_min: 0.479339092969894
d2ca_max: 0.479339092969894
coh ilayer: 2
coh dbh: 968.673234548291
coh pft: 1
coh n: 0.170290134074354
coh carea: 3136.99592449363
maxh: 99999.0000000000
lmode: 3.00000000000000
d2bl2: 1.53530120849609
d2bl_ediff: 0.000000000000000E+000
d2ca_min: 0.479339092969894
d2ca_max: 0.479339092969894
coh ilayer: 2
coh dbh: 871.269606347760
coh pft: 1
coh n: 0.138053556046577
coh carea: 2161.27600896097
maxh: 99999.0000000000
lmode: 3.00000000000000
d2bl2: 1.53530120849609
d2bl_ediff: 0.000000000000000E+000
d2ca_min: 0.479339092969894
d2ca_max: 0.479339092969894
coh ilayer: 3
coh dbh: 856.240810343689
coh pft: 1
coh n: 7.853584316935310E-002
coh carea: 1197.09557582510
maxh: 99999.0000000000
lmode: 3.00000000000000
d2bl2: 1.53530120849609
d2bl_ediff: 0.000000000000000E+000
d2ca_min: 0.479339092969894
d2ca_max: 0.479339092969894
coh ilayer: 3
coh dbh: 753.528330790387
coh pft: 1
coh n: 0.255160873131248
coh carea: 3196.47564453217
maxh: 99999.0000000000
lmode: 3.00000000000000
d2bl2: 1.53530120849609
d2bl_ediff: 0.000000000000000E+000
d2ca_min: 0.479339092969894
d2ca_max: 0.479339092969894
coh ilayer: 3
coh dbh: 667.421179856545
coh pft: 1
coh n: 8.545083124641673E-002
coh carea: 888.513131776645
maxh: 99999.0000000000
lmode: 3.00000000000000
d2bl2: 1.53530120849609
d2bl_ediff: 0.000000000000000E+000
d2ca_min: 0.479339092969894
d2ca_max: 0.479339092969894
ENDRUN:
ERROR in EDCanopyStructureMod.F90 at line 267
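As a sanity check on the numbers above, a short Python sketch that re-derives the printed layer areas and relative errors from the cohort lines. It assumes the relative error is (layer area - patch area) / patch area, which matches the logged values; the variable names are made up for the example, and the tolerance the model actually applies is not shown in the log, so none is assumed here.

```python
import math

# Values copied from the log above.
patch_area = 5298.27184704583

# "coh carea" values grouped by "coh ilayer".
cohort_careas = {
    1: [78.5612143846136, 414.801832610886, 875.218072143910,
        1502.81075570403, 2426.10903939221],
    2: [3136.99592449363, 2161.27600896097],
    3: [1197.09557582510, 3196.47564453217, 888.513131776645],
}

# "layer: N area:" and "rel error:" values as printed.
logged_layer_area = {1: 5297.50091423565, 2: 5298.27193345459, 3: 5282.08435213392}
logged_rel_error = {1: -1.455064655875225e-04, 2: 1.630885784912938e-08,
                    3: -3.055240534881608e-03}


def relative_error(layer_area, patch_area):
    """Signed relative mismatch between a layer's crown area and the patch area."""
    return (layer_area - patch_area) / patch_area


# Layer area is taken to be the sum of cohort crown areas in that layer.
layer_area = {k: math.fsum(v) for k, v in cohort_careas.items()}
rel_error = {k: relative_error(a, patch_area) for k, a in layer_area.items()}

for k in sorted(layer_area):
    print(f"layer {k}: area {layer_area[k]:.8f}  rel error {rel_error[k]:.6e}")
```

So the per-cohort crown areas do add up to the printed layer areas; layer 2 essentially fills the patch, while layers 1 and 3 come up short by about 0.015% and 0.3% respectively.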
Just to say that I tried reworking the cohort fusion to conserve crown area instead of total dbh (212a056, https://github.com/NGEET/fates/commit/212a0564c826d7ab0519e38f4a045e13a44842fc) and so far at least it fixes this problem. So we may want to discuss further whether that should be the general solution.
I've no objection in principle to using crown area, particularly given it's a more physiologically relevant property, and that neither option is 'realistic' anyway.
Dr Rosie A. Fisher
Staff Scientist Terrestrial Sciences Section Climate and Global Dynamics National Center for Atmospheric Research 1850 Table Mesa Drive Boulder, Colorado, 80305, USA
and
Visitor @ C.E.R.F.A.C.S Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique 42 Avenue Gaspard Coriolis 31057, Toulouse, France http://www.cgd.ucar.edu/staff/rfisher/
In some of the ensembles I've been running, competing two PFTs at the BCI testbed, and initialized from inventory, I'm seeing a consistent error from a couple of the ensemble members. This happens both with and without the changes in #441.
I'm not sure what's going on here. It seems to happen with two of the ensemble members. Both ensemble members are generating unrealistically large trees, which may be the cause. Possibly the error in conservation of crown area when fusing large trees means that more iterations are required to shuffle the canopy around, so possibly it's as simple as increasing the allowable number of iterations.
here's the relevant part of the cesm log file of one of the ensemble members that crashes:
The lnd log from the first ensemble member that crashes ends with this:
here's the lnd log from the second ensemble member that crashes: