NGEET / fates

repository for the Functionally Assembled Terrestrial Ecosystem Simulator (FATES)

Unexplained loss of plants following logging events #714

Open JoshuaRady opened 3 years ago

JoshuaRady commented 3 years ago

Issue Summary:

I have found that under typical conditions logging events cause a subsequent wave of 'mortality' that can be larger than that created by the logging event itself. This behavior occurs with different PFT compositions, but only when grasses are present. The cause has yet to be determined, and I am seeking community insight into it.

Detailed Explanation:

I found this problem when I tried to confirm that the logging module was harvesting the amount expected based on the logging parameters. To simplify things I set all the side-effect mortalities (collateral, etc.) to 0 and set the logging module to harvest all size classes (see parameters below). In this way only direct mortality occurs, at the rate specified, and the expected harvest can be easily confirmed from the change in biomass across the harvest date. When I did this it initially appeared the logging module was harvesting too much (Fig. 1b). I went through the logging code, added extensive reporting code, and found that the harvesting code was working as expected.

When I switched to daily history files and zoomed in on the harvest date I discovered that the harvest event does in fact remove the correct fraction of plants and biomass. However, after a few days of relative stability the plant biomass (and numbers) begin a phase of exponential decrease until a new semi-equilibrium is reached in a couple of weeks' time (Fig. 1c). I use basal area for the subsequent plots because it only shows the relevant woody PFTs (Fig. 2).

This happens at different harvest amounts (fates_logging_direct_frac), different dates (fates_logging_event_code), and with different PFT compositions. It does not occur if grasses are not present (Fig. 3).

The loss of plants occurs across size ranges and all woody PFTs present, although the PFTs at lower abundances are harder to interpret. Grasses decline after a brief spike.

Mysteriously, the loss of these plants does not clearly show up in the mortality diagnostics (Fig. 4).

Figures:

Experimental Conditions:

Initial settings were taken from the example simulations used in past trainings (default-ish?).

Computing Environment:

Simulations were run on Cheyenne with the Intel compiler.

Compset and Settings:

I used the 1x1_brazil resolution and performed a 40-year mini-spin-up from bare ground, just to get trees of reasonable sizes for harvest, not to reach any equilibrium state (Fig. 1a). A restart simulation was then run for 2-10 years with a logging event (Fig. 1b). The core test scenario was a 20% logging event (see parameters) in 1942, although other dates (1945) and harvest fractions (50%) were tested and showed the same behavior.

FATES Branch:

As mentioned above, initial testing was performed on one of my development branches before moving to Disturbance_Check_Fix. This branch is almost identical to main/master but includes a trivial fix for a separate bug in the logging code (see issue #709). The problem behavior was replicated on Disturbance_Check_Fix at:

Parameters:

Pure PPA (fates_mort_disturb_frac = 0) was used for all experiments to make simulations run faster and to simplify the interpretation of the output.

In subsequent experiments, unrelated to this issue, @jkshuman and I found that under a variety of conditions (all conditions tested so far) logging events will throw a mass balance error with values of fates_mort_disturb_frac other than 1.0. This issue is being documented and will be posted separately.

All indirect logging mortalities were set to 0.

Example logging parameter settings (see the modify_fates_paramfile.py calls in the Reproducible Example below):

PFTs:

I repeated the above procedure for the following PFT compositions:

Note: I did much of this investigation and debugging on one of my highly derived branches, but I have confirmed that I get identical results to the master branch for all of the claims made here.

Reproducible Example:

#Paths:-------------------------------------------
#Customize to your environment.

#Path to the FATES parameter file for this version of FATES:
TemplateParamPath='/glade/p/cesmdata/cseg/inputdata/lnd/clm2/paramdata/fates_params_api.14.0.0_12pft_c200921.nc'

ParamOutputDir='/glade/work/jmrady/InputFiles/ParameterFiles/FATES/Proj_7_Exp_55/'
NewParamFile='Cut20pct1942_CCM_Off_12pft_c200921_B.nc'
NewParamPath=$ParamOutputDir$NewParamFile

#My clone of my Disturbance_Check_Fix branch:
#Get it by:
# $ git clone --branch fates_main_api https://github.com/ESCOMP/ctsm.git JMR_Disturbance_Check_Fix_2
#At ctsm1.0.dev105-136-g7138ab0d
#
#In Externals_CLM.cfg change the FATES repo URL to https://github.com/JoshuaRady/fates.
# $ ./manage_externals/checkout_externals
# $ cd src/fates/
# $ git checkout Disturbance_Check_Fix
#At 1ed0a3a0

FatesDisturbanceCheckFixPath='/glade/work/jmrady/ClmVersions/JMR_Disturbance_Check_Fix_2'

#Put your user name here:
UserName=username

#Put your project number here:
ProjectNum=1234567

ProjectDir='/glade/u/home/jmrady/Proj_7_Exp_55_Cases/LoggingIssueExample/'
MiniSpinUpCase=Brazil40yrSpinUp_All_PFTs
MiniSpinUpPath=$ProjectDir$MiniSpinUpCase
LoggingCase=DCF_Logging_1942_20pct_All_PFTs_CCM_Off

#Prepare the parameter file:----------------------
#Note: I don't routinely use ./modify_fates_paramfile.py so there may be a more compact way to do the following.

cd $ProjectDir
cp $FatesDisturbanceCheckFixPath/src/fates/tools/modify_fates_paramfile.py .

module load python/2.7.13 ncarenv/1.2 intel/17.0.1 mkl/2017.0.1 ncarcompilers/0.4.1 numpy/1.13.3 netcdf4-python/1.2.7

./modify_fates_paramfile.py --var fates_mort_disturb_frac --val 0.0 --fin $TemplateParamPath --fout $NewParamPath
./modify_fates_paramfile.py --var fates_logging_coll_under_frac --val 0.0 --fin $NewParamPath --fout $NewParamPath --overwrite
./modify_fates_paramfile.py --var fates_logging_collateral_frac --val 0.0 --fin $NewParamPath --fout $NewParamPath --overwrite

#fates_logging_dbhmax is already set to the correct value.

./modify_fates_paramfile.py --var fates_logging_dbhmax_infra --val 0.0 --fin $NewParamPath --fout $NewParamPath --overwrite
./modify_fates_paramfile.py --var fates_logging_dbhmin --val 0.0 --fin $NewParamPath --fout $NewParamPath --overwrite
./modify_fates_paramfile.py --var fates_logging_direct_frac --val 0.2 --fin $NewParamPath --fout $NewParamPath --overwrite
./modify_fates_paramfile.py --var fates_logging_event_code --val 19420101.0 --fin $NewParamPath --fout $NewParamPath --overwrite
./modify_fates_paramfile.py --var fates_logging_mechanical_frac --val 0.0 --fin $NewParamPath --fout $NewParamPath --overwrite

#Set up the mini-spin-up simulation:--------------

cd $ProjectDir
$FatesDisturbanceCheckFixPath/cime/scripts/create_newcase --case $MiniSpinUpCase --compset I2000Clm50FatesGs --mach cheyenne --res 1x1_brazil --project $ProjectNum --walltime 6:00:00 -q economy -v --run-unsupported
cd $MiniSpinUpCase

#Change settings:
./xmlchange STOP_OPTION=nyears,STOP_N=40,REST_N=40
./xmlchange DATM_CLMNCEP_YR_START=1996
./xmlchange DATM_CLMNCEP_YR_END=2001
./xmlchange RUN_STARTDATE=1901-01-01
./xmlchange CLM_FORCE_COLDSTART=on

#Set up to create the namelists:
./case.setup

#Add settings to user_nl_clm:
cat <<EOF >> user_nl_clm

fates_paramfile = '$NewParamPath'
hist_fincl1 = 'BA_SCPF', 'NPLANT_SCPF', 'TRIMMING_CANOPY_SCLS', 'TRIMMING_UNDERSTORY_SCLS', 'CROWN_AREA_CANOPY_SCLS', 'CROWN_AREA_UNDERSTORY_SCLS', 'NPLANT_CANOPY_SCLS', 'NPLANT_UNDERSTORY_SCLS', 'NPLANT_CANOPY_SCPF', 'NPLANT_UNDERSTORY_SCPF'
hist_mfilt = 1

EOF

#Build:
./case.setup --reset
qcmd -- ./case.build

./case.submit

#Set up the logging simulation:-------------------
#The only reason in this case that we do a restart is so we don't have to run 50 years with daily output.

cd $ProjectDir
$FatesDisturbanceCheckFixPath/cime/scripts/create_clone --clone $MiniSpinUpPath --case $LoggingCase --verbose
cd $LoggingCase

#Change settings:
./xmlchange STOP_OPTION=nyears,STOP_N=11,REST_N=11
./xmlchange DATM_CLMNCEP_YR_START=1996
./xmlchange DATM_CLMNCEP_YR_END=2001
./xmlchange RUN_STARTDATE=1941-01-01
./xmlchange CLM_FORCE_COLDSTART=off
./xmlchange JOB_WALLCLOCK_TIME=2:00:00

#In user_nl_clm add:
cat <<EOF >> user_nl_clm

finidat = '/glade/scratch/$UserName/$MiniSpinUpCase/run/$MiniSpinUpCase.clm2.r.1941-01-01-00000.nc'
hist_nhtfrq = -24
use_fates_logging = .true.

EOF

#Build:
./case.setup --reset
qcmd -- ./case.build

./case.submit

Potential Causes:

I can only conjecture as to the cause of this problem. My reading of the code and my test simulations have not revealed anything with certainty. The delay after harvest suggests to me that FATES is iteratively calculating something over multiple time steps before an action threshold is reached.

Recalculation of the canopy spread following a large reorganization of the canopy seemed possible but the output doesn't seem to support this.

I also looked at cohort fusion. I thought that increased growth of understory plants could cause crowding of the cohort slots and force fusing of trees. This is not supported by the output. The number of cohorts does not reach the maximum number (100) during the period when the issue occurs (Fig. 5). I went on to increase the maximum number of cohorts by changing the variable maxCohortsPerPatch in EDTypesMod.F90 to 250. The results of this experiment were hard to interpret because of large differences in the simulation trajectories with this change. The decline was delayed a bit but it still occurred (data not shown).

ckoven commented 3 years ago

Hi @JoshuaRady have you had any success in tracking down what was causing this?

JoshuaRady commented 3 years ago

I have not made any progress on this issue. I had run out of ideas of what to look at next when I submitted this issue and I haven't come up with anything new in the intervening time. I have been running my recent experiments without grasses but I will need to include them soon so any ideas as to what to look at next would be appreciated.

JoshuaRady commented 2 years ago

Issue Update Summary:

For a while I avoided this problem by omitting grasses. My recent simulations saw it return in a novel way and provided insights that allowed me to identify the root cause. I have confirmed my earlier suspicion that the canopy spread factor is related to this 'mortality crisis' behavior. I have confirmed that the issue is still present with the latest version of FATES.

New Information:

In my recent simulations I have had multiple populated patches of different ages and started to observe that when plants (trees) were removed from one patch there would be a brief but significant loss of plants from other patches. As in the earlier manifestation described above (with one patch) there is no mortality recorded in the history file output to explain this loss of plants.

This was mysterious but also eliminated some possible explanations. I quickly focused on the spread factor since it is calculated at the site level and could therefore be a route for interpatch interaction. I saw that in cases where removals did not cause a crisis the spread factor was stable and in cases where a crisis occurred the spread factor changed rapidly.

Mechanism:

Based on further experiments and examination of the code this is what I have determined is happening:

To confirm this I added a bit of code to DemoteFromLayer() in EDCanopyStructureMod that reports the number of plants terminated to the log (see my Issue714 branch). In simulations where harvests did not result in a mortality crisis this form of demotion termination mortality only occurred a few times during the run, if at all, and only involved a tiny number of plants. In simulations where the crisis occurred I see this demotion termination mortality occurring tens of thousands of times. When I sum up the number of plants reported they closely match the number that disappear during the crisis (Fig. 6A).
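
The diagnostic itself is tiny: roughly, it is just a write to the FATES log at the point in DemoteFromLayer() where the demoted cohort is sent to the litter pool. A sketch of the idea (not the exact code, which is on the Issue714 branch):

! Sketch only; see the Issue714 branch for the actual code and placement.
! fates_log() comes from FatesGlobals; currentCohort is the cohort being demoted.
write(fates_log(),*) 'Issue714 demotion termination: pft = ', currentCohort%pft, &
     ' nplant lost = ', currentCohort%n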

I repeated the example simulation above updated to the latest version of CTSM (57175712e) and FATES (b2714927) and confirmed the behavior remains. Figure 6 is from a simulation that only adds the reporting lines mentioned above.

Role of Grasses:

Unlike the single-patch scenario originally described, the multi-patch scenario occurs without grasses present. Looking at those earlier simulations it is clear that without grass the understory layer was not full and the spread factor did not change following harvest. It was not grass per se but the crown area that determined whether the crisis occurred.

Figures:

Fig. 6a

Fig. 6b: The spread factor rises rapidly.

Fig. 6c: Plants move from the lower to the upper layer after harvest. Some are quickly pushed back into the lower layer as crowns expand. Then demotion termination commences.

Fig. 6d: The dip in crown area coincides with the delay between the harvest and the onset of the mortality crisis. The crisis ends when the crown area drops below 2 again.

Significance:

I think this is a potentially significant bug. It will certainly bias the results for any logging event that occurs when a full double-layer canopy is present, a condition that is not rare. But I think the issue is larger than that. The logging module is only the implement of the disturbance here and is not the cause of the crisis itself. I suspect that any large disturbance, regardless of cause, could trigger such a mortality crisis. I suspect this may be affecting some people's simulations right now without being detected, since the behavior is easy to miss with monthly output and is not reported clearly in the history output. I recommend looking at some SPITFIRE runs where the canopy is full prior to a fire.

I find the multi-patch behavior to be particularly troubling. Having plants die in one patch in response to mortality in another is inconsistent with any patch interpretation I can think of, but I may be missing something. If nothing else it is breaking some of my simulations.

Potential Solutions:

This issue has two parts: the mortality crisis itself and the fact that the mortality is not reported.

At the very least I think that this mortality should be recorded, either as part of termination mortality (FATES_MORTALITY_TERMINATION_SZ) or by itself as a new variable (if people think it should be distinguished from termination due to cohorts getting very small). In simulations without major mortality events it appears demotion termination occurs somewhat rarely and removes very small numbers of plants. Given this, perhaps it was a design decision to omit it from reporting, and if not I can see how it was overlooked. With the Selective Logging Module (and the Vegetation Management Module) this mortality can be quite large, so people need to be able to see that it is happening.

I'm not sure what the best solution is for the mortality crisis behavior itself. The site-level calculation of the spread factor is something that has concerned other folks as well (see Issue #777). I think a patch-level spread calculation would solve this issue for the multi-patch case, but what to do about the single-patch case? Apparently a patch-level spread calculation caused other problems (PR #279), complicating matters.

I have been working with a modified version of the spread factor calculation for a while that improves some aspects of canopy closure behavior. It decreases the extent of the mortality crisis but does not eliminate it.

I think a larger discussion is warranted. There are other problematic aspects of canopy behavior, so perhaps a more holistic approach is in order.

jestenzel commented 2 years ago

@JoshuaRady I can confirm that other large disturbance events AND large cumulative disturbance (carbon starvation) also cause the mortality crisis in stands where d2ca_max and min are not equal.

Making spread a patch level variable pushes the spread problem to the surviving trees of new patches (Issue 274), though I'm guessing an increase in their canopy area is unlikely to cause demotion mortality.

I think the issue is ultimately that cohort crown area should not be perfectly plastic with respect to site or patch spread values, and would more realistically be incremented gradually in response to an altered patch spread.

ckoven commented 2 years ago

Hi @JoshuaRady @jestenzel thanks for following up on this. Based on your description, it seems like we definitely need to add the mortality that is occurring in SendCohortToLitter in https://github.com/NGEET/fates/blob/master/biogeochem/EDCanopyStructureMod.F90#L724-L731 to the termination mortality diagnostics. The intention should be to handle all demotion-related mortality identically, so the fact that it isn't doing that at present does look like a bug.

For the specific problem, it sounds like potential solutions might include (a) not using the spread logic at all, i.e. setting the fates_allom_d2ca_coefficient_min and fates_allom_d2ca_coefficient_max values equal to each other, or (b) adding logic to slow down the rate of change in the spread term, e.g. by saying it can only change the spread value if it has been some number of days since the last time it changed the spread value. That way we can keep aspects of the spatial plasticity but hopefully remove some of the temporal dynamics that lead to this and other strange behavior. Do you think either of those solutions would work for the specific science problems you are working on?
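
To make (b) concrete, a rough, untested sketch of what the gate in canopy_spread() could look like (days_since_spread_change and spread_holdoff_days are made-up names, and the existing increment/decrement logic is paraphrased from memory rather than quoted):

! Sketch only: gate the existing spread increment/decrement behind a hold-off.
! days_since_spread_change (a new site-level integer) and spread_holdoff_days (a
! new parameter) are hypothetical; 'inc' is the existing daily step, and the
! crowding test stands in for whatever canopy_spread() currently compares.
currentSite%days_since_spread_change = currentSite%days_since_spread_change + 1
if (currentSite%days_since_spread_change >= spread_holdoff_days) then
   if (canopy_area_exceeds_ground_area) then
      currentSite%spread = max(currentSite%spread - inc, 0.0_r8)
   else
      currentSite%spread = min(currentSite%spread + inc, 1.0_r8)
   end if
   currentSite%days_since_spread_change = 0
end if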

walkeranthonyp commented 2 years ago

Interesting! Funnily enough I just made a pull request to the documentation that updates the description of the spread factor to be consistent with the code (i.e. site level and top canopy layer).

I'll take more of a read of the issues you cite @JoshuaRady.

Agree this warrants more discussion and could be the subject of a FATES call. A finer-grained calculation of spread could be an option, but there are alternative algorithms that we could also consider. Canopy area, and therefore canopy spread, is crucial to the canopy layering logic.

Two quick thoughts similar to Carlie's:

1) The daily change in the spread factor is currently set to 0.05, and this sets the change in canopy spread (Sc) as a proportion of the difference between Sc,min and Sc,max. That means the time it takes for Sc to go from Sc,max to Sc,min can be as little as 20 days (see the quick numbers after point 2). That's a huge rate of change for a plant canopy. I've been experimenting with reducing it to 0.001 or even smaller. Suggest trying that to stabilise the rate of change.

2) The max and min parameters are key, especially the min (as that's the value of Sc for a closed canopy). My guess is that most literature values for the normalisation constant in a canopy-area-to-dbh power-law relationship come from closed canopies, so literature values of the normalisation constant are probably best used to inform fates_allom_d2ca_coefficient_min.
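
Quick numbers for point 1, just to make the traverse times explicit (illustrative only, assuming the spread factor moves by a fixed fraction of the Sc,max - Sc,min range each day):

program spread_traverse_time
  ! Days for Sc to traverse its full range when it moves by a fixed fraction of
  ! (Sc,max - Sc,min) each day.  Illustrative only.
  implicit none
  real, dimension(3) :: daily_frac = [0.05, 0.01, 0.001]
  integer :: i
  do i = 1, size(daily_frac)
     print '(a,f6.3,a,f7.0,a)', 'daily step = ', daily_frac(i), &
          ' of the range  ->  full traverse in ~', 1.0/daily_frac(i), ' days'
  end do
end program spread_traverse_time

So 0.05 traverses the full range in ~20 days, 0.01 in ~100 days, and 0.001 in ~1000 days (roughly three years).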


JoshuaRady commented 2 years ago

@ckoven, you are likely right that setting the fates_allom_d2ca_coefficient_min and fates_allom_d2ca_coefficient_max parameters to the same value would eliminate the behavior. I am reluctant to do this because modeling the crown plasticity has been important to matching observations in my simulations. I'd have to do some experiments to know how much difference it makes.

Slowing down the rate at which crown widths can respond to available space in the canopy is something I agree with in general. In disturbance events that do not result in a mortality crisis the canopy still closes too fast in my comparisons. There are physiological constraints on how quickly branches can grow and foliage can be reorganized.

However, I don't think slowing down the rate of change in the spread factor would solve this problem for the multi-patch scenario. It would slow the demotion process but the change in the spread factor would still continue to drive cohorts out if the surviving patch's canopy layers are full.

In the original one-patch variant it might help. Frankly, I have less of a handle on the dynamic in this case. The grasses complicate things. I would hope that slowing things down would replace demotion termination with light competition carbon starvation mortality. I'm just not sure.

jestenzel commented 2 years ago

@ckoven Thanks for the response.

Similarly, my goal with offset d2ca_min and d2ca_max is to simulate observed-site canopy closure/demotion patterns, with implications for patch NPP. Specifically, I'd like to model a period of time (or dbh range) between when canopies can reach the closure threshold (min closure density) and when demotion begins to occur (max overstory density). Mimicking this pattern appeared important to simulating observed tree NPP trends by preventing tree-unoccupied canopy space for a number of years prior to competitive demotion.

Any reduction in the behavior that Joshua highlighted above would be much appreciated. I do agree with Joshua, though, that the proposed change would simply alter the rate at which trees in an undisturbed patch are prompted to expand their canopies into open space they cannot occupy and then experience overstory demotion.

ckoven commented 2 years ago

Thanks @JoshuaRady @jestenzel. I wonder if a combination of (a) slowing down the rate of crown area plasticity with (b) increasing the number of canopy layers would work for your problem. In principle, demotion termination should never happen if the number of allowed canopy layers is much greater than the number of canopy layers in which plants can survive via shade tolerance, because plants would die before they ever got demoted from the lowest canopy level. How many allowed levels are you using, the standard 2? If so, you can increase this to 3 or more here: https://github.com/NGEET/fates/blob/master/main/EDTypesMod.F90#L36

rgknox commented 2 years ago

Regarding the earlier question about SendCohortToLitter: I think I would like to see a new routine that performs "terminate_cohort" and is embedded in the "terminate_cohorts" routine that already exists. The terminate_cohort routine would perform all of the same functions as "terminate_cohorts" but for just one cohort. We would use the terminate_cohort() routine in place of the call to SendCohortToLitter(), here: https://github.com/NGEET/fates/blob/master/biogeochem/EDCanopyStructureMod.F90#L726

EDIT: Further, we should not need to set cohort%n to 0 prior to the new call; that should be handled automatically by terminate_cohort().
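
Rough shape of what I have in mind (a sketch only; names and argument lists are placeholders, not the real FATES interfaces):

! Sketch of the proposed refactor; argument lists are placeholders.
subroutine terminate_cohort(currentSite, currentPatch, currentCohort)
   ! Per-cohort termination: transfer the cohort's mass to litter (what
   ! SendCohortToLitter() does now), update the termination mortality
   ! diagnostics, then unlink and deallocate the cohort.
end subroutine terminate_cohort

subroutine terminate_cohorts(currentSite, currentPatch)
   ! Keep the existing loop and termination criteria, but call
   ! terminate_cohort() for each cohort that meets them.
end subroutine terminate_cohorts

! The demotion path in EDCanopyStructureMod would then call terminate_cohort()
! in place of SendCohortToLitter(), with no need to zero cohort%n by hand.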

JoshuaRady commented 2 years ago

As mentioned in the discussion of this issue in today's FATES call, I developed an altered version of canopy_spread() in EDCanopyStructureMod for my Loblolly PFT project. It does not solve this issue but seems to reduce its extent somewhat (see above), probably because it slows the change in the spread factor. The main purpose of this modification was to eliminate the 'shelf' in the crown area development of a developing single-age stand. I put the modified routine on my CreamCheese branch. The only other difference from master on this branch is the demotion termination reporting code mentioned above. Perhaps this will be useful.