try including TLP in the mixed effects model

teixeirak commented 5 years ago

@mcgregorian1, as a fairly high priority next week, please see if turgor loss point (TLP) is a meaningful predictor (as a fixed effect) of drought resistance in your model. Here are the TLP values:

sp	TLP
CAGL	-2.1282533
CAOV	-2.4839333
FAGR	-2.57164
FRAM	-2.1012133
JUNI	-2.75936
LITU	-1.9212933
PIST	NA
QUAL	-2.58412
QUPR	-2.3601733
QURU	-2.6395867
QUVE	-2.3879067
CACO	-2.1324133
CATO	-2.31424
FRNI	NA

mcgregorian1 commented 5 years ago

Hi @teixeirak

I added the tlp values and ran the same analysis. With tlp included, the best model is clearly the full one where everything is combined, which suggests that tlp is a meaningful predictor of resistance values.
However, tlp for PIST is NA (as it is for FRNI, but that species is already excluded for lack of core data). I decided to take out PIST and run the model comparisons again. When I do that, I suddenly get three models that are roughly the same:
- model with only random effects (species and year) to predict resistance values (best)
- full model excluding tlp (slightly worse)
- full model with tlp (slightly worse) [this is the model from no1 above]

The question, then: if PIST has NA for tlp, why would keeping it in the data so strongly favor the model that uses tlp as a predictor? I'm not sure how to answer that.
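One possible explanation, offered as a sketch and not as a diagnosis of the actual analysis: R model-fitting functions such as `lmer` drop rows with an NA in any model variable by default, so models that include tlp would be fitted to fewer observations (no PIST rows) than models that omit it. Information criteria are only comparable between models fitted to the same observations. The rows and column names below are made up for illustration.

```python
# Hypothetical rows: (species, canopy position, resistance, tlp); values are made up.
rows = [
    ("LITU", "canopy", 0.95, -1.92),
    ("LITU", "subcanopy", 1.10, -1.92),
    ("QURU", "canopy", 0.88, -2.64),
    ("PIST", "canopy", 0.70, None),      # tlp is missing for PIST
    ("PIST", "subcanopy", 1.30, None),
]
COLUMNS = {"species": 0, "position": 1, "resistance": 2, "tlp": 3}

def complete_cases(rows, variables):
    """Keep rows with no missing value in the listed variables (listwise deletion)."""
    idx = [COLUMNS[v] for v in variables]
    return [r for r in rows if all(r[i] is not None for i in idx)]

# Number of observations each kind of model would actually be fitted to.
n_without_tlp = len(complete_cases(rows, ["resistance", "species", "position"]))
n_with_tlp = len(complete_cases(rows, ["resistance", "species", "position", "tlp"]))
print(n_without_tlp, n_with_tlp)
```

If the two counts differ, restricting every candidate model to the same complete-case rows (which is what removing PIST does) puts the comparison on equal footing.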

mcgregorian1 commented 5 years ago

Results as of 11 March

Our dataframe includes the following elements:

Variable	Effect in linear mixed model
Year	Random
Species	Random (nested with tree #)
Canopy position	Fixed
Tree #	Random (nested under species)
Resistance value	Response

We compared different combinations of the above to determine which model is best; in other words, to determine which variables most strongly influence the resistance value (calculated with the pointRes package in R). Using AICc as the criterion, we’re finding that a model containing all variables is only slightly better than a model containing only the random effects (and of those, species variability predicts resistance value much more than year variability does). When we remove outliers, we find the opposite: the full model is slightly worse than a model containing only species and year. This variation can be explained by the following density graph, where the subcanopy resistance values have more variability than the canopy values. In short, we’re finding that canopy position explains almost nothing beyond what the species of a particular tree already explains.
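For reference, a minimal sketch of how AICc and the differences between models (delta AICc) are computed, using the standard formula AICc = 2k - 2 ln(L) + 2k(k + 1) / (n - k - 1). The log-likelihoods, parameter counts, and sample size below are hypothetical and are not results from this analysis.

```python
def aicc(log_lik, k, n):
    """Small-sample corrected AIC for a model with k estimated parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = 2 * k - 2 * log_lik
    return aic + (2 * k * (k + 1)) / (n - k - 1)

# Hypothetical fits: model name -> (log-likelihood, number of estimated parameters)
fits = {
    "random_only": (-120.4, 4),
    "full": (-119.1, 5),
}
n_obs = 200  # hypothetical number of observations, identical for every model

scores = {name: aicc(ll, k, n_obs) for name, (ll, k) in fits.items()}
best = min(scores.values())
delta = {name: score - best for name, score in scores.items()}

# Rank models from best (delta = 0) to worst.
for name in sorted(delta, key=delta.get):
    print(f"{name}: AICc={scores[name]:.2f}, delta={delta[name]:.2f}")
```

Because the formula depends on n, all models in one ranking need to be fitted to the same observations.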

We also tried bringing in another fixed effect, a turgor loss point (tlp) value per species. When we compared the possible model combinations using this additional effect, we found that the combined model (the full model from above plus tlp) is clearly the best model for explaining resistance values. However, one anomaly is that Pinus strobus has NA for its tlp value. To make the model comparison robust, we removed P. strobus and its canopy/subcanopy resistance values from the dataframe. When we then compare the possible models, we get the same kind of result as before (see below): the combined model is best, but only slightly better than the random model (with just species and year as effects) and the full model (everything except tlp). Removing the outliers gives the same pattern, though the random model becomes the best by a slight margin.

Example of model comparisons including tlp as a fixed effect. Notice the similarity in the bottom 3 models compared to the others.

For more visualization, see below a plot of the resistance values (excluding P.strobus) with a normal distribution fitting.

And here is a plot of the residuals from the combined model (lmm.combined). This distribution is not significantly different from that of the full model (without tlp but with P. strobus).

Next steps

- Check out recovery and resilience values to see if there's any difference (assuming not). Low priority.
- Let’s look at the ANOVA test and/or likelihood ratio test (LRT) comparing the random model to the model with canopy position when only the two most extreme outliers are excluded. If I recall, the effect was not terribly far from significant when values >3 were excluded, and it may come out significant.
- It may also be worth trying to use the 4 canopy position classes. There’s increasing uncertainty as you go back in time, but I still wonder if that would matter. Low priority.
- Interaction with topographic position? masl would be a fixed effect.
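The likelihood ratio test mentioned in the steps above can be sketched as follows. This assumes canopy position is coded as two classes (canopy/subcanopy), so the nested models differ by one parameter, and that both models are fitted by maximum likelihood and not REML (in R, `anova()` on two `lmer` fits refits with ML by default). The log-likelihood values are hypothetical.

```python
import math

def lrt_one_df(log_lik_reduced, log_lik_full):
    """Likelihood ratio test for nested models that differ by one parameter.

    Returns (statistic, p_value). For 1 degree of freedom, the chi-square
    upper tail probability equals erfc(sqrt(statistic / 2)).
    """
    stat = 2.0 * (log_lik_full - log_lik_reduced)
    stat = max(stat, 0.0)  # guard against tiny negative values from optimizer noise
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Hypothetical log-likelihoods: random-only model vs. random + canopy position.
stat, p = lrt_one_df(log_lik_reduced=-120.4, log_lik_full=-118.5)
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```

With the 4 canopy position classes the test would have 3 degrees of freedom, and the closed form used here would no longer apply.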

mcgregorian1 commented 5 years ago

Next steps 12 March

results below

1. Take out 1911, 1947, and 1991
2. Run lmm with these out and see if we get anything different
3. Try anova as well, check the likelihood ratio test (LRT)

Per species (all of these will be determined after taking out the years' data above)

4. Make a box plot of each species' resistance values
5. Count values >1 for each species and for each canopy position
6. Check out ring porosity values (table in Ryan's paper)
7. Run model with both ring porosity and tlp
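The count of resistance values above 1 per species and per canopy position can be sketched like this. The records are made up; assuming resistance is the ratio of drought-year growth to pre-drought growth, a value above 1 means the tree grew more in the drought year than before it.

```python
from collections import Counter

# Hypothetical records: (species, canopy position, resistance value); values are made up.
records = [
    ("LITU", "canopy", 0.92),
    ("LITU", "subcanopy", 1.15),
    ("QURU", "canopy", 1.04),
    ("QURU", "canopy", 0.81),
    ("QURU", "subcanopy", 1.22),
]

# Keep only the observations with resistance above 1, then tally them.
above_one = [(sp, pos) for sp, pos, res in records if res > 1]
by_species = Counter(sp for sp, _ in above_one)
by_position = Counter(pos for _, pos in above_one)
print(dict(by_species))
print(dict(by_position))
```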

Results

1-3. The lmm (top) and anova (bottom) were run with only the three main drought years as confirmed by NOAA northern VA PDSI values: 1966, 1977, 1999. We're seeing that the full model is the best, though it is now only marginally better than the model that includes just canopy position and species. It is clear species is still the main driver, but position now plays a larger role.

4. Box plot per species with resistance values

4.1. Here is a graph of resistance values by species by canopy position (click on it to open in new window expanded)

5. Count of values >1 for each species and each canopy position
6. I added in ring porosity qualifications from Ryan's paper, and I get results that the best model is one that includes species, canopy position, and ring porosity (treating ring porosity as a fixed effect). Slightly worse is a model with only ring porosity and species, and slightly worse than that is a model that includes position, ring porosity, species, and year (the combined effects).
7. I added both ring porosity and tlp, and found code that can quickly loop through all model combinations. Running these together while keeping year as a random effect in every combination (to make sure the drought effect is represented), I find that the best model includes position, year, species, and tlp (adding ring porosity ["rp"] makes it slightly worse).
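Looping through all model combinations while keeping year in every one can be sketched as below. This is not the code used in the analysis; the function name and the column names (`resistance`, `position`, `tlp`, `rp`) are assumptions for illustration. It only builds lme4-style formula strings, which would then be fitted and ranked by AICc.

```python
from itertools import combinations

def candidate_formulas(response, optional_terms, always_terms):
    """Yield one formula string per subset of the optional terms.

    Terms listed in always_terms appear in every formula.
    """
    for size in range(len(optional_terms) + 1):
        for subset in combinations(optional_terms, size):
            rhs = list(subset) + list(always_terms)
            yield f"{response} ~ " + " + ".join(rhs)

formulas = list(candidate_formulas(
    response="resistance",
    optional_terms=["position", "tlp", "rp", "(1 | species)"],
    always_terms=["(1 | year)"],  # year stays in as a random effect throughout
))
for f in formulas:
    print(f)
```

With n optional terms this produces 2^n candidate models, so the list grows quickly as variables are added.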

Next Steps

@teixeirak this is a quick summary of what we talked about. Add more variables when we get data:

masl (based on Jonathan Thompson's DEM data)
tree size (dbh of trees going back in time, using allometries from Krista)
tree age

mcgregorian1 commented 5 years ago

@teixeirak I've run the models with dbh_old (the ring widths) as a fixed effect, and the result is that dbh_old is definitely significant. In the chart below you can see that the first model that doesn't include dbh_old doesn't appear until the last row of this chart (17 out of 64).

Interesting that canopy position doesn't even make it in until the 5th model, but ultimately the delta AICc between that and the top model is only 1.66, so it's not too far off.
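To put a delta AICc of 1.66 in perspective, one common conversion is the relative likelihood exp(-delta / 2), which expresses how plausible a model is relative to the top-ranked one. A minimal sketch (the function name is an assumption):

```python
import math

def relative_likelihood(delta_aicc):
    """Likelihood of a model relative to the top-ranked one, given its delta AICc."""
    return math.exp(-0.5 * delta_aicc)

rel = relative_likelihood(1.66)
print(f"{rel:.2f}")
```

A frequently used rule of thumb treats models within about 2 AICc units of the best one as having comparable support.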

teixeirak commented 5 years ago

Wow, that's a lot to sort out! Instead of dbh, let's use predicted height, as that's far more likely to be the variable that actually matters. Because we're just predicting it from DBH, it would be repetitive to include both.

teixeirak commented 5 years ago

We have concluded that TLP is a significant driver. I will close this issue, which is now better defined by new issue #7.

mcgregorian1 commented 5 years ago

Hi @teixeirak

Do you have the original data source for the TLP measurements? I think you mentioned this before but I've lost the location of it.

teixeirak commented 5 years ago

https://github.com/EcoClimLab/HydraulicTraits/blob/master/data/SCBI/processed_trait_data/SCBI_all_traits_table_species_level.csv

mcgregorian1 commented 5 years ago

Perfect, thank you!

SCBI-ForestGEO / McGregor_climate-sensitivity-variation