SCBI-ForestGEO / McGregor_climate-sensitivity-variation

repository for linking the climate sensitivity of tree growth (derived from cores) to functional traits

test trait effects individually #36

Closed teixeirak closed 5 years ago

teixeirak commented 5 years ago

@mcgregorian1, I'm concerned that we have a lot of traits that may be interacting in funny ways in the full model. To test the effects of traits, let's have the null model include height, (canopy position), year (categorical), and random effect (individual nested in species). Let's create a table with those results, and also pull out coefficients. If the sign of a coefficient switches between these pared-down models and the full model, that indicates a problem.
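The sign check described above can be sketched in a few lines. This is only an illustration, assuming the coefficients have already been extracted from the single-trait fits and the full model into plain dictionaries; the function name `sign_flips`, the trait names, and the numbers are all hypothetical placeholders, not results from this analysis.

```python
def sign_flips(single_trait, full_model):
    """Return traits whose coefficient sign differs between the two fits.

    single_trait: {trait: coefficient from the null + one-trait model}
    full_model:   {trait: coefficient from the full model}
    """
    flipped = []
    for trait, coef_single in single_trait.items():
        coef_full = full_model.get(trait)
        if coef_full is None:
            continue  # trait was not part of the full model
        if coef_single * coef_full < 0:  # opposite signs
            flipped.append(trait)
    return sorted(flipped)


# Placeholder coefficients for illustration only.
single = {"trait_a": 0.12, "trait_b": -0.30, "trait_c": 0.05}
full = {"trait_a": -0.08, "trait_b": -0.25, "trait_c": 0.02}
print(sign_flips(single, full))  # ['trait_a']
```

A trait flagged this way is a candidate for closer inspection, not automatically an error.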

mcgregorian1 commented 5 years ago

@teixeirak

I've finished the table for individually-tested traits. I made the table as a csv so it's easier to read (found here).

I've also compared those coefficients with the coefficients from the full models, shown here. The only one that's different is LMA. When you're free I can talk about this in person more. image

teixeirak commented 5 years ago

Thanks! Note that WD is also different. LMA and WD are the two that were behaving contrary to theoretical expectations in the full model, and they are the two that are reversed. Good. More soon!

mcgregorian1 commented 5 years ago

This is the final result when we take out all dAIC <2. It seems now ring porosity has also been booted out of the top model, in addition to wood density.

image

teixeirak commented 5 years ago

can you please provide the coefficients?

mcgregorian1 commented 5 years ago

Whoops sorry about that. Here they are. They look to be going the direction we'd think. image

mcgregorian1 commented 5 years ago

I was thinking since I have the first table with all the traits tested individually, and now the best model, I should make a separate table showing only the top best model across the individual drought years?

teixeirak commented 5 years ago

Okay, this top model looks good/ makes sense.

Yes, let's look at the best model for each drought year.

mcgregorian1 commented 5 years ago

@teixeirak

I've fully populated the table now, which is found here.

To clarify:

I think what's interesting here is the same pattern we saw when first comparing the overall drought years with the individual ones: no individual drought comes close to matching the overall drought trend. In addition to the dAIC becoming positive, the coefficients change as well. Would this be evidence of what we see generally in climate science, where trends are better represented only when you look at the long term?

It appears that for all the variables, only TWI and TLP are in the top model for >2 scenarios.

teixeirak commented 5 years ago

Thanks! Some comments:

mcgregorian1 commented 5 years ago

Three things on this:

  1. I had messed up the calculation of dAIC earlier on, so actually rp and WD shouldn't have been in the best model anyways (that probably explains why they were contrary to the others).

  2. Valentine made the suggestion and I agreed, that we'd include "year" as a tested variable to prove its importance. That is now included.

  3. I've finished making a master table showing each variable tested individually against a null model for each of the drought scenarios plus all of them combined, which can be viewed here.

    • Notice how the coefficients of position_all and rp are not constant in direction across all scenarios
    • Subsetting this for each scenario for all variables with dAIC > 2 gives me the table below. There's some overlap between them, but no more than that.
    • Based on this, do we take the best model then to be the one that has the most number of overlaps (in this case, only distance.ln.m and TWI make the cut), or do we present the best model for each separately, with the reasoning that each scenario is different (but then focus more on the overall scenario ["all"] for a longer-term trend)? I'm currently thinking the latter, what do you think? image

teixeirak commented 5 years ago

* Based on this, do we take the best model then to be the one that has the most number of overlaps (in this case, only distance.ln.m and TWI make the cut), or do we present the best model for each separately, with the reasoning that each scenario is different (but then focus more on the overall scenario ["all"] for a longer-term trend)? I'm currently thinking the latter, what do you think?

I agree.
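The two options weighed here (one model built from the variables that overlap across scenarios, versus a separate model per scenario) differ only in how the per-scenario lists are combined. A minimal sketch of the overlap option, assuming the variables passing the dAIC > 2 cut have already been collected per scenario; the function name `overlap` and the variable names `var_a` to `var_d` are hypothetical placeholders, and the memberships are not results from the table.

```python
def overlap(supported_by_scenario):
    """Variables that pass the cut in every scenario (set intersection)."""
    sets = [set(v) for v in supported_by_scenario.values()]
    return sorted(set.intersection(*sets)) if sets else []


# Placeholder memberships for illustration only.
supported = {
    "1966": ["var_a", "var_b"],
    "1977": ["var_b"],
    "1999": ["var_b", "var_c"],
    "all": ["var_a", "var_b", "var_d"],
}
print(overlap(supported))  # ['var_b']
```

Presenting each scenario separately, the option agreed on above, simply keeps the dictionary as it is instead of collapsing it.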

teixeirak commented 5 years ago

I don't think we should include elevation and distance in any analysis. They're inferior to TWI, both ecologically and usually statistically, and they just complicate the interpretation.

teixeirak commented 5 years ago

It's interesting that position seems to come out mostly consistent (although rarely significant), with dominant always lower than codominant.

mcgregorian1 commented 5 years ago

I decided to test something. The first four models are the top model for each year, using only the variables that had dAIC > 2 from the all-year scenario in the table. The bottom four models are the top model for each year using all the variables from the table.

Top variables

1966: image image

1977: image image

1999: image image

All years: image image

All variables

1966: image image

1977: image image

1999: image image

All years: image image

I'm confused on how to interpret/present this. I guess in a way this makes sense, since we were thinking of prescribing a set of variables for the individual drought years based on a trend seen only at the long-time scale (e.g. LMA, WD, and rp all were nixed from the combined-year model). We can still present a different model for each scenario based on these bottom models here, but it means we'd have to rethink how we'd present the hypothesis-testing table.
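For reference, one common way to arrive at a "top model" from a set of candidate variables is an exhaustive search over subsets, keeping the subset with the lowest AIC. The thread does not state how the top models were actually obtained, so the following is only a sketch under that assumption. `best_subset` and `toy_aic` are hypothetical names, and `toy_aic` stands in for whatever function fits the mixed model and returns its AIC.

```python
from itertools import combinations


def best_subset(candidates, aic_of):
    """Try every subset of candidates; return (subset, aic) with the lowest AIC."""
    best = None
    for k in range(len(candidates) + 1):
        for subset in combinations(candidates, k):
            score = aic_of(subset)
            if best is None or score < best[1]:
                best = (subset, score)
    return best


def toy_aic(subset):
    """Stand-in for a real model fit: each variable improves fit by a fixed
    amount but costs 2 AIC units as an extra parameter (placeholder numbers)."""
    gains = {"var_a": 6.0, "var_b": 1.0, "var_c": 3.0}
    return 100.0 - sum(gains[v] for v in subset) + 2 * len(subset)


print(best_subset(["var_a", "var_b", "var_c"], toy_aic))
# (('var_a', 'var_c'), 95.0)
```

The number of subsets doubles with every added candidate, which is one reason to restrict the candidate list before searching.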

mcgregorian1 commented 5 years ago

The variables with differing coefficients compared to the master table are:

Top variables

1999: height.ln.m (negative, was positive)

All variables

1977: rp-semiring (negative, originally positive) Combined years: LMA (negative, originally positive), WD (negative, originally positive)

mcgregorian1 commented 5 years ago

@teixeirak when you get a chance, can I get your opinion on this please?

teixeirak commented 5 years ago

I don't trust these "all variables" models-- I'm concerned that they're over-parameterized. Please go with the top variables models.

mcgregorian1 commented 5 years ago

Ok.

As I understand it this is where we stand:

  1. We tested each trait individually for each drought scenario, using height as the null model, intending to use the traits with dAIC>2 to determine the best model for each one.
  2. However, we noticed that in 1977 (e.g.), this would yield a model with only TWI.
  3. Thus we decided to find the best model for each individual scenario using all variables that had dAIC>2 at some point, even though perhaps in the specific years they did not.

Is this correct? I think I kept getting caught up by how there are interactions we're not seeing, for example how position_all wasn't dAIC>2 for the combined-year scenario, yet when testing only these "top variables", it does come out in the top model. Same thing for ring porosity for the combined scenario, 1966, and 1977.

these are the variables that have dAIC>2 image

teixeirak commented 5 years ago

I thought the description of our method would be this:

Considering all droughts combined and for each individual drought, we tested our predictions by comparing a model with the relevant variable against a null model (Table X). When the dAIC>2, we considered the prediction supported. ....
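The dAIC > 2 rule in this draft can be written out explicitly. The sketch below assumes dAIC is defined as AIC(null) minus AIC(null + variable), so that a positive value means the added variable improves the model; the thread does not spell out the sign convention, so treat that as an assumption. The function names and all AIC values are hypothetical placeholders.

```python
def delta_aic(aic_null, aic_with_variable):
    """dAIC under the assumed convention: positive when the variable helps."""
    return aic_null - aic_with_variable


def supported_variables(aic_null, aic_by_variable, threshold=2.0):
    """Variables whose single-variable model beats the null by more than threshold."""
    return sorted(
        name
        for name, aic in aic_by_variable.items()
        if delta_aic(aic_null, aic) > threshold
    )


# Placeholder AIC values for illustration only.
aic_null = 1000.0
aic_by_variable = {"var_a": 995.5, "var_b": 999.0, "var_c": 1001.2}
print(supported_variables(aic_null, aic_by_variable))  # ['var_a']
```

Here `var_b` lowers the AIC by only 1.0 and `var_c` raises it, so neither counts as supported.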

To determine the best multivariate model for all droughts combined and for each individual drought, we .... To avoid over-parameterization of the model, we included as candidate variables only those with dAIC>2 in the all-droughts model.

teixeirak commented 5 years ago

Is that correct? I want to make sure I'm following correctly.

mcgregorian1 commented 5 years ago

Exactly, so that's the thing. I'm still having trouble justifying to myself prescribing what works best in the all-droughts model as being best for the individual years. Using that method, how do we justify the reality that when we include rp in the individual drought years, it always comes out as significant? Or do we ignore that because it's not part of this protocol we discussed?

teixeirak commented 5 years ago

Okay, how about this?

To determine the best multivariate model for all droughts combined and for each individual drought, we .... To avoid over-parameterization of the model, we included as candidate variables only those with dAIC>2 in one or more of the individual models.

Is that what you did for the "top variables" models above? I notice that canopy position is in there, when it's not dAIC>2 in the all-droughts model.

teixeirak commented 5 years ago

I agree that it's not ideal to limit the set of variables to those in the all-drought scenario, but there does need to be some limitation. Wood density and SLA in particular seem to be very inconsistent--acting more as free parameters than as meaningful variables.

mcgregorian1 commented 5 years ago

Okay, how about this?

To determine the best multivariate model for all droughts combined and for each individual drought, we .... To avoid over-parameterization of the model, we included as candidate variables only those with dAIC>2 in one or more of the individual models.

This would allow us to include position_all and rp, definitely. I'm wondering what the justification would be on this if we were challenged on it? Would it simply be that since each individual drought is different we thought it best to take into account all possible top variables?

Is that what you did for the "top variables" models above? I notice that canopy position is in there, when it's not dAIC>2 in the all-droughts model.

Yes, my mistake there. I initially included both rp and position_all and noticed they appeared in the best models, but then realized we had said not to include them, hence my hesitation at moving forward.

I agree that it's not ideal to limit the set of variables to those in the all-drought scenario, but there does need to be some limitation. Wood density and SLA in particular seem to be very inconsistent--acting more as free parameters than as meaningful variables.

Agreed. This is why I was hoping we could do something like what you've suggested (assuming we can ecologically justify it), because yes, I don't think WD and LMA should be represented in these last tests.

teixeirak commented 5 years ago

Okay, I think we have a plan then?

I do think we can justify this method by saying that since each individual drought is different we thought it best to take into account all possible top variables.
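The agreed rule, taking as candidates every variable with dAIC>2 in one or more of the individual models, amounts to a set union over the per-scenario lists. A minimal sketch, assuming those lists are already available; `candidate_variables`, its `exclude` parameter (for variables dropped on other grounds), and the `var_*` names are hypothetical placeholders rather than anything from the repository.

```python
def candidate_variables(supported_by_scenario, exclude=()):
    """Pool every variable that passes the cut in at least one scenario."""
    pooled = set().union(*supported_by_scenario.values())
    return sorted(pooled - set(exclude))


# Placeholder memberships for illustration only.
supported_individual = {
    "1966": ["var_a", "var_b"],
    "1977": ["var_b"],
    "1999": ["var_c", "var_e"],
}
print(candidate_variables(supported_individual, exclude=("var_e",)))
# ['var_a', 'var_b', 'var_c']
```

The same pooled list would then be offered to the model search for every scenario, which keeps the candidate set identical across droughts while still limiting its size.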

mcgregorian1 commented 5 years ago

Ok perfect! I'm on board with this plan.

Thus, for my next steps:

  1. I can now get the best model for each drought scenario and put those in a table, so then the models would be done.
  2. The graphs are close to being done, what's mainly left will be formatting them all together.
  3. Otherwise I believe I have the writing left.

Am I missing anything else here that you can think of?

teixeirak commented 5 years ago

Nothing offhand!

teixeirak commented 5 years ago

Closing (obsolete).