Collinearity of Crown position and Height

teixeirak commented 3 years ago

From @ValentineHerr (issue #105): "Crown position vs Height, I think they should never be in the same model as they are obviously collinear."

We've discussed this before (here and here, and maybe more). My take has been that yes, they are collinear, and there's a corresponding biological challenge of disentangling the two, which we cover in the discussion. I like being able to have them in the same model and see which comes out stronger.

Am I wrong here? Should we instead be running two sets of models? (That would complicate presentation...). Or perhaps run a preliminary test to see which is the stronger predictor (height, obviously), and include just that in the final models?

mcgregorian1 commented 3 years ago

The simplest answer is to just say yes, the correlation is 0.73 and we know height is stronger, so keep height and move crown position to supplementary / ancillary info. I think crown position is weaker overall because height was at least obtained with different, standardized methods, whereas crown position, albeit assessed by one person (which helps), is entirely subjective: it depends on where the person is standing and on the surrounding trees. Of course, labeling a tree as intermediate or suppressed is more accurate than distinguishing dominant from co-dominant.

teixeirak commented 3 years ago

I'm fine with that.... It's a bit of reworking of the results and text, but not a huge deal. It won't fundamentally change any conclusions. Your call.

mcgregorian1 commented 3 years ago

I think it would make the most sense to remove it, as much as I do want to keep in crown position because it is valuable data...

ValentineHerr commented 3 years ago

Or perhaps run a preliminary test to see which is the stronger predictor (height, obviously), and include just that in the final models?

I like this solution or Ian's solution.

I don't think you can tell what variable comes out stronger as both coefficients and standard errors are likely to be wrong when collinear variables are together in a model. Also, it is even trickier here because crown position is a factor with 4 levels, so it has 2 extra parameters compared to a continuous variable...
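
To put a rough number on this point, here is a minimal Python sketch (not the R code used in this analysis). It assumes the 0.73 reported above can be treated as a Pearson correlation between two predictors in the same model, which is a simplification because crown position is a factor. Under that assumption, the variance inflation factor for two predictors is 1 / (1 - r^2).

```python
import math


def vif_two_predictors(r: float) -> float:
    """Variance inflation factor for a model with exactly two predictors correlated at r."""
    return 1.0 / (1.0 - r ** 2)


# Assumption: the 0.73 from this thread is treated as a Pearson correlation.
r = 0.73
vif = vif_two_predictors(r)

# Standard errors scale with the square root of the VIF.
se_inflation = math.sqrt(vif)

# A 4-level factor needs 3 dummy coefficients; a continuous predictor needs 1.
extra_params = (4 - 1) - 1

print(f"VIF = {vif:.2f}, SE inflation = {se_inflation:.2f}x, extra parameters = {extra_params}")
```

The extra-parameter count also matters for AIC-based comparisons, because each additional coefficient carries its own penalty.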

teixeirak commented 3 years ago

Okay, we'll do this. Ian's solution and the preliminary test are effectively the same, and because Ian's is a bit easier, we can do that.

I think it would make the most sense to remove it, as much as I do want to keep in crown position because it is valuable data...

@mcgregorian1 , we'll still use the crown position data. It's featured pretty prominently in Fig. 3 (height profiles), and we won't want to change that.

teixeirak commented 3 years ago

So, @mcgregorian1 , please go ahead and re-run everything, including tables S4-S5, with CP out.

teixeirak commented 3 years ago

Could you please also run a version of the full models with CP instead of height? I'd at least like to see it, and quite possibly present as an SI table.

teixeirak commented 3 years ago

@mcgregorian1 , the more I think about this, the more I think the full model with CP will be very important. Without showing that, we'd have to make a lot of changes to the presentation.

mcgregorian1 commented 3 years ago

So you think to leave it?

mcgregorian1 commented 3 years ago

I think, given it wasn't a major thing for the reviewers, I'd be inclined to leave it, knowing that if there's pushback in a second round of edits we'd then have to drop it.

teixeirak commented 3 years ago

Oh, sorry, I think that last statement was confusing.... what I meant to say was that we really want to see a version of the full model with CP but not height.

mcgregorian1 commented 3 years ago

Oh, sorry, I think that last statement was confusing.... what I meant to say was that we really want to see a version of the full model with CP but not height.

gotcha

teixeirak commented 3 years ago

I think that what we'll find is that CP comes out significant in far fewer models than height. That at least allows us to say that height is a better predictor.

mcgregorian1 commented 3 years ago

Could you please also run a version of the full models with CP instead of height? I'd at least like to see it, and quite possibly present as an SI table.

Here is the result (Saved as top_models_dAIC_reform_CP.csv). It is as you expected.

This was done by switching out height with crown position. I tested all possible combinations of the following model: "resist.value ~ position_all*TWI.ln+height.ln.m+position_all+TWI.ln+PLA_dry_percent+mean_TLP_Mpa+year+(1|sp/tree)"
In other words, I put position in place of height in the interaction and had height appear only once.

teixeirak commented 3 years ago

hold on, what we want is two models: "resist.value ~ height.ln.m*TWI.ln+PLA_dry_percent+mean_TLP_Mpa+year+(1|sp/tree)" "resist.value ~ canopy position + TWI.ln+PLA_dry_percent+mean_TLP_Mpa+year+(1|sp/tree)"

mcgregorian1 commented 3 years ago

So, @mcgregorian1 , please go ahead and re-run everything, including tables S4-S5, with CP out.

So this is an interesting outcome. I barely started running the normal Rt without crown position, and even in the base models (Table 4), there's no candidate variable. All dAICs are <0, meaning the null model (without the traits) is better.
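
For readers following the sign convention here, a minimal Python sketch (not the R code used in this analysis). It assumes dAIC is computed as AIC(null) minus AIC(null + trait), so a negative value means the null model has the lower AIC. The AIC values and trait names are hypothetical and only illustrate the arithmetic.

```python
def delta_aic(aic_null: float, aic_candidate: float) -> float:
    """dAIC = AIC(null) - AIC(candidate); positive values favor the candidate."""
    return aic_null - aic_candidate


# Hypothetical AIC values, for illustration only (not results from this analysis).
aic_null = 1000.0
candidates = {"trait_A": 1001.3, "trait_B": 1000.4}

daic = {name: delta_aic(aic_null, aic) for name, aic in candidates.items()}

# Traits whose addition lowers AIC relative to the null model.
improves = [name for name, d in daic.items() if d > 0]

for name, d in daic.items():
    print(f"{name}: dAIC = {d:.1f}")
print("candidate variables:", improves)
```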

teixeirak commented 3 years ago

Hmmm... that seems to contradict the results in the last submitted version. Only difference is that now we have TWI in the model.

And is this the latest code? Why do we suddenly have "1964.1966" back?

mcgregorian1 commented 3 years ago

I can confirm that something has gone wrong with all of this. I have to look more into it

mcgregorian1 commented 3 years ago

I'm really confused. Maybe because I'm trying this on my Mac and some of the R packages were updated? But suddenly I'm getting that none of the traits come out as improving the model. I ran this code on Sunday and it was perfectly fine.

teixeirak commented 3 years ago

What happens if you re-run the same code you've been using before this change?

mcgregorian1 commented 3 years ago

What happens if you re-run the same code you've been using before this change?

See #107

Otherwise, I won't move forward until we know about #105

mcgregorian1 commented 3 years ago

hold on, what we want is two models: "resist.value ~ height.ln.m*TWI.ln+PLA_dry_percent+mean_TLP_Mpa+year+(1|sp/tree)" "resist.value ~ canopy position + TWI.ln+PLA_dry_percent+mean_TLP_Mpa+year+(1|sp/tree)"

I did this two ways, because I still wasn't fully sure.

In general I'm not quite sure what to make of the results. You can see from the first way (direct comparison) that height is stronger for all years combined and 1966, while crown position is better for 77/99. That seems straightforward.
However, the other comparison complicates things for me. For example, the placement of position in the best models doesn't match where it falls in # 1. I think part of that is because position_all takes the place of height when height isn't in the model, whereas height stands for itself when position isn't in the model.
- Worth noting that height is in the best models (<2dAIC) 8 times, while position is in its best models 11 times.

What are your thoughts? Do you want any of these as specific tables?

first way: direct comparison between the two models

Note position beats out height for 1977 and 1999.

second way: all combinations

And here is the comparison between all combinations of the first model variables (base; this is already saved with the _CPout files) and all combinations of the second model variables.

First, with height interaction:

And second with crown position:

teixeirak commented 3 years ago

Sorry I wasn't able to respond on this last night. I think what we want is a table with coefficients for all the top models for each year, but never allowing both height and CP together. So we'd have something parallel to the current S6, but with height and canopy position together disallowed. Does that make sense?
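
The "never both together" rule can be sketched as a filter on the candidate set. This is a minimal Python illustration, not the R code used in this analysis. It reuses the variable names from the formulas quoted in this thread and, as a simplifying assumption, ignores the interaction terms; the function name is hypothetical.

```python
from itertools import combinations

RESPONSE = "resist.value"
RANDOM = "(1|sp/tree)"
# Fixed-effect terms taken from the formulas in this thread (interactions omitted).
TERMS = ["height.ln.m", "position_all", "TWI.ln", "PLA_dry_percent", "mean_TLP_Mpa", "year"]
# These two terms must never appear in the same model.
EXCLUSIVE = {"height.ln.m", "position_all"}


def candidate_formulas(terms, exclusive):
    """Return every subset of terms as a formula string, skipping subsets with all exclusive terms."""
    formulas = []
    for k in range(len(terms) + 1):
        for subset in combinations(terms, k):
            if exclusive.issubset(subset):
                continue  # height and crown position together: disallowed
            rhs = "+".join(list(subset) + [RANDOM])
            formulas.append(f"{RESPONSE} ~ {rhs}")
    return formulas


formulas = candidate_formulas(TERMS, EXCLUSIVE)
print(len(formulas), "candidate models")
print(formulas[0])
```

With six terms there are 64 subsets, 16 of which contain both height and crown position, leaving 48 candidates.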

It would probably also make sense to add a table (like the first) comparing set models, but slightly modified to remove the height*TWI interaction (so that models are exactly parallel). I'll come back to this.

teixeirak commented 3 years ago

@mcgregorian1, I manually created what I think should be the final S6 and S7 by deleting the rows with both height and CP. I checked that results would match those posted above, but please confirm, and we should have that done in the code. I just did the temporary manual fix so that I could have this to look at while I work on related revisions.

mcgregorian1 commented 3 years ago

Ok sounds good. I can fix this in the code later for both Rt and arimaratio.

teixeirak commented 3 years ago

Thanks. Note that the previous top Rt model for 1999 included both CP and H, which means that there may be some additional models making the AIC cutoff. (I adjusted AIC manually.) Table 1 and full results can be finalized (#109) when we check this.
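
Why dropping a top model can admit new ones: a minimal Python sketch with hypothetical AIC values (not results from this analysis, and not the R code used here). It assumes dAIC is measured from the best model in the set and that the cutoff is dAIC < 2, as mentioned earlier in this thread. The model labels are placeholders.

```python
def top_models(aic_by_model, cutoff=2.0):
    """Return models within `cutoff` AIC units of the best model, with their dAIC."""
    best = min(aic_by_model.values())
    return {m: round(a - best, 2) for m, a in aic_by_model.items() if a - best < cutoff}


# Hypothetical AIC values, for illustration only.
aic = {"height+CP": 100.0, "height": 101.5, "CP": 102.5, "height+TWI": 103.2}

before = top_models(aic)

# Drop every model containing both height and CP, then recompute against the new best.
allowed = {m: a for m, a in aic.items() if not ("height" in m and "CP" in m)}
after = top_models(allowed)

print("before:", before)
print("after: ", after)
```

In this made-up example the set grows from two models to three once the joint model is removed, which is why the cutoff should be re-applied in code rather than by deleting rows.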

teixeirak commented 3 years ago

(didn't mean to close those)

mcgregorian1 commented 3 years ago

@teixeirak is this issue still valid about including models that have height or position but not both? Based on #112 it seems no? We should thus only keep the models that don't include crown position at all?

teixeirak commented 3 years ago

Correct. Closing this.

SCBI-ForestGEO / McGregor_climate-sensitivity-variation

Collinearity of Crown position and Height #106