NGEET / fates

repository for the Functionally Assembled Terrestrial Ecosystem Simulator (FATES)
Other
95 stars 91 forks source link

Patch fusion error in fixed biogeography no comp mode with f19_f19 grid #940

Open JessicaNeedham opened 1 year ago

JessicaNeedham commented 1 year ago

Running FATES in fixed biogeography no comp mode with an f19_f19 grid I got the following error:

FATES is having difficulties fusing very small patches.
118: It is possible that a either a secondary or primary
118: patch has become the only patch of its kind, and it is
118: is very very small. You can test your luck by
118: disabling the endrun statement following this message.
118: FATES may or may not continue to operate within error
118: tolerances, but will generate another fail if it does not.
118: ENDRUN:
118: ERROR in EDPatchDynamicsMod.F90 at line 2872

The error occurs within the first hour of the simulation.

I don't see this error running on a f45_f45 grid.

I'm running on compy with ELM commit c63cce2 and FATES commit 8ef6a1e from main branch.

sshu88 commented 1 year ago

Hi @JessicaNeedham, I have faced the similar issue before. My problem was caused by a small harvest rate within certain PFT that only have one patch. FATES try to generate a secondary patch with small area but too small that FATES later attempt to terminate (<1e-4 right now I think). But FATES cannot remove the last patch of a certain PFT under certain tag (either primary or secondary) under nocomp mode. Other disturbances do not generate secondary patch thus the new patch can be merged even the area is tiny. In your case I'm wondering if your surface data has significantly small area of PFT. If you can print out and see the patch area it maybe can tell you more information?

sshu88 commented 1 year ago

@jenniferholm @glemieux @ckoven @rgknox Regarding the patch fusion, I actually found that with competition mode FATES allows the anthropogenic tag to be changed to force the patch fusion: https://github.com/sshu88/fates/blob/600f74b50e9115281af2f8e00ede1b9bf0362a3f/biogeochem/EDPatchDynamicsMod.F90#L2786

I'm not sure if we also need the same function under no competition mode. But at least my current version does not have.

JessicaNeedham commented 1 year ago

Hi @sshu88, thanks for your comment. I’m seeing this error in a run in which harvest is off, so it needs a fix beside allowing the anthropogenic tag to change - although maybe that is also necessary?

You might be right about some PFTs having a very small area. Would we want to allow for the termination of PFTs that start out with very very small areas in no comp mode? Should they exist to begin with? I’m still trying to wrap my head around the patch fusion logic…

glemieux commented 1 year ago

I'm attempting to replicate this with an f19_g17 resolution for clm-fates as well using no comp + fixed biogeography.

ckoven commented 1 year ago

@JessicaNeedham did you try testing as per the message by commenting out the endrun statement?

JessicaNeedham commented 1 year ago

If I comment out the endrun message it runs.

ckoven commented 1 year ago

Thanks @JessicaNeedham. I wonder then if the endrun here is being overly pessimistic? If we just have it give a warning and not crash, then that might still be sufficiently helpful for diagnosing a subsequent crash if one were to occur?

glemieux commented 1 year ago

I think that's a promising idea @ckoven, although I wonder what sort of crashes really small patches had been causing downstream to prompt the development of this check. @rgknox do you have a sense of this?

This is somewhat tangential, but I wonder how feasible it would be to store warning messages (or maybe codes?) so that we could inform the user of potential areas to investigate given a call to endrun.

glemieux commented 1 year ago

I'm attempting to replicate this with an f19_g17 resolution for clm-fates as well using no comp + fixed biogeography.

This ran without issue btw.

jenniferholm commented 1 year ago

Hi all,

Is the current solution to just remove the endrun here, and replace it with a warning?

On Tue, Nov 15, 2022 at 10:15 AM Gregory Lemieux @.***> wrote:

I'm attempting to replicate this with an f19_g17 resolution for clm-fates as well using no comp + fixed biogeography.

This ran without issue btw.

— Reply to this email directly, view it on GitHub https://github.com/NGEET/fates/issues/940#issuecomment-1315695575, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACCVHGAZEENEAURG75V6DDTWIPHNPANCNFSM6AAAAAASAJ73XI . You are receiving this because you were mentioned.Message ID: @.***>

-- Jennifer Holm Research Scientist Climate and Ecosystems Sciences Division Lawrence Berkeley National Laboratory 510-495-8083

ckoven commented 1 year ago

@jenniferholm yes, that is what I was suggesting.

glemieux commented 1 year ago

Per discussion during the software meeting today, the check was added via #501 to avoid "numerical havoc in demotion/promotion". @ckoven suspects that this check is outdated now given the recent categorical patch updates (e.g. primary/secondary forests).

rgknox commented 1 year ago

If/when we come up with a fix, can someone also address this line? https://github.com/NGEET/fates/blob/sci.1.60.2_api.24.2.0/biogeochem/EDPatchDynamicsMod.F90#L2855

The variable "youngerPatch% anthro_disturbance_label" has an unintended space inside the variable. (I didn't realize spaces are allowed, but it could confuse compilers?)

Re:

I think that's a promising idea @ckoven, although I wonder what sort of crashes really small patches had been causing downstream to prompt the development of this check. @rgknox do you have a sense of this? This is somewhat tangential, but I wonder how feasible it would be to store warning messages (or maybe codes?) so that we could inform the user of potential areas to investigate given a call to endrun.

I don't have a sense if preventing super small patch areas is still relevant. I suppose I would be on the look out for dividing by them.

As for the warning, I like that idea. We could keep the current warning message as a comment, and shorten the reported error message, and give it index 4:

use FatesGlobals     , only : FatesWarn

...
if(count_cycles > max_cycles) then
          ! FATES is having difficulties fusing very small patches.
          ! It is possible that a either a secondary or primary
          ! patch has become the only patch of its kind, and it is
          ! is very very small.  It is possible that this is causing
          ! numerical issues.  If the logs show a large number of index 4 errors
          ! We may want to re-evaluate how we cull small patches

          call FatesWarn('Bypassed patch fusion on small patch because of max_cycles',index=4)

          ! Note to user. If you DO decide to remove the end-run above this line
          ! Make sure that you keep the pointer below this line, or you will get
          ! an infinite loop.
          currentPatch => currentPatch%older
          count_cycles = 0
end if
jenniferholm commented 1 year ago

@sshu88 - do you think this fix will also correct the model crash you saw, due to the creation of very small secondary forest patches?

sshu88 commented 1 year ago

@jenniferholm For me there will be some further energy balance issues if too tiny patch is created. Current solution for me is to duplicate the code to force secondary patch to merge back with the parent primary patch as: https://github.com/sshu88/fates/blob/600f74b50e9115281af2f8e00ede1b9bf0362a3f/biogeochem/EDPatchDynamicsMod.F90#L2786 Thanks for checking!