consider sapwood depth as a predictor

teixeirak commented 5 years ago

@mcgregorian1 , sapwood depth is a potentially interesting predictor. We have tree-level data on all cored trees here (sapwood file). Here's a figure based on the data :

We'd want to calculate the ratio of sapwood area to total wood. We could use either (a) this ratio at the time of coring or (b) some extrapolation as to what it would have been when the tree was smaller. I don't think we'd want to just predict based on DBH, though, as that would throw out differences among trees. My memory on this is fading, but I know that Jenny McGarvey found a significant negative correlation between elevation and the proportion sapwood area of Carya (?-I think). Such intraspecific differences may help to explain the elevation trend.

mcgregorian1 commented 5 years ago

Ok, that makes sense. But if I don't use dbh as the predictor (i.e. calculating the regression equations from that graph), then I'm not sure how to estimate the ratio, given that our measurement for height is already using dbh

mcgregorian1 commented 5 years ago

@teixeirak

Trying to calculate bark thickness equation (relating diam w/o bark to bark_thickness), I'm getting essentially perfect straight lines (especially since each sp has one bark thickness). Are these viable to use to calculate the old dbh?

teixeirak commented 5 years ago

Can you please identify the (piece of) script that does this?

teixeirak commented 5 years ago

The thickest bark thickness you're getting is ~4.5mm, which is too small. Something is wrong.

teixeirak commented 5 years ago

Here are the steps:

use the file SCBI_bark_depth.csv. Just disregard table S3 in my FE paper, better to work with original data.
calculate diameter.without.bark.mm as DBH.mm.2008-2*bark.depth.mm
ln-transform both diameter.without.bark.mm (dependent) and bark.depth.mm (independent)
fit a linear model per species.
use this model to predict ln[bark.depth.mm] from ln[diameter.without.bark.mm]
take the exponent of ln[bark.depth.mm] to get bark.depth.mm.
check that calculated values look reasonable based on raw data.

Use the same process to predict bark.depth.mm from DBH.mm.2008. You should get equations similar to table S3 in my FE paper (but don't worry if they're not identical-- just go with your own equations for consistency).

mcgregorian1 commented 5 years ago

Hi @teixeirak

Thanks for putting this up and sorry for being confused on this/wrapping my head around it.

In creating a regression equation for each species, I'm finding I'm getting worse results than you got in your paper. When I take stock of the difference between my predicted values and actual values, I get a range of -6.831566 to 5.128121 (this is for all species).

What do you consider to be reasonable for calculated values?

mcgregorian1 commented 5 years ago

This is a boxplot of the ranges between predicted and absolute values:

teixeirak commented 5 years ago

This is pretty variable data, and isn't going to have a huge impact on the result, so don't worry too much about it. If we wanted great predictions, we'd need more data, but for our purpose, this will be fine.

mcgregorian1 commented 5 years ago

Ok sounds good. I just wanted to double check before moving forward

mcgregorian1 commented 5 years ago

Just to confirm, I'm using this bark thickness to calculate old dbh, which in turn calculates old height

teixeirak commented 5 years ago

Correct.

teixeirak commented 5 years ago

You'll also need it for sapwood area ratio.

mcgregorian1 commented 5 years ago

@teixeirak

I've found one fram that I had in subcanopy because of the dbh, but I looked at it again because I was getting NaNs, and the dbh was for the second stem, not the actual cored first stem. I double-checked AJ's data for crown positions, and she couldn't find the first stem (presumably it fell soon after it was cored).

Based on its dbh, I'd put it in canopy, but regarding whether it was specifically C or D I have no idea. What do you recommend? I can easily take it out of the analyses.

mcgregorian1 commented 5 years ago

I'm also giving fagr a bark thickness of 0 for all years since it's negligible (based on what we were talking about). Is this ok?

mcgregorian1 commented 5 years ago

In addition, I'm confirming I'm using the ln[Y] = ln[Y0] + z*ln[DBH (mm)] equation to get sapwood area (from your paper)

teixeirak commented 5 years ago

I'm also giving fagr a bark thickness of 0 for all years since it's negligible (based on what we were talking about). Is this ok?

yes

teixeirak commented 5 years ago

I've found one fram that I had in subcanopy because of the dbh, but I looked at it again because I was getting NaNs, and the dbh was for the second stem, not the actual cored first stem. I double-checked AJ's data for crown positions, and she couldn't find the first stem (presumably it fell soon after it was cored).

Based on its dbh, I'd put it in canopy, but regarding whether it was specifically C or D I have no idea. What do you recommend? I can easily take it out of the analyses.

remove

teixeirak commented 5 years ago

In addition, I'm confirming I'm using the ln[Y] = ln[Y0] + z*ln[DBH (mm)] equation to get sapwood area (from your paper)

correct. Please check against the file in Dryad to make sure that your calculations look about right.

mcgregorian1 commented 5 years ago

In addition, I'm confirming I'm using the ln[Y] = ln[Y0] + z*ln[DBH (mm)] equation to get sapwood area (from your paper)

correct. Please check against the file in Dryad to make sure that your calculations look about right.

My calculations are consistently under the dryad file calculations (on a per-sp basis), by roughly at least 5 cm. I've used the equation, though, so I'm unsure why I'm consistently underestimating.

mcgregorian1 commented 5 years ago

Now that dbh has been re-calculated, I decided to have it in so see how it affects the full best model.

Removing dbh_ln gives us this:

Removing dbh and converting sap_ratio to sap_ratio_ln gives us this:

Based on this, I'm assuming we want to change around the base model for the table. What model do you think is best? I don't see a reason to do log transformation of sap_ratio except for keeping consistency with doing the same for height

teixeirak commented 5 years ago

I don't think that there's any theoretical reason/ need to use ln(sapwood ratio).

Do you understand why canopy position suddenly drops so drastically in importance?

Could you please make a plot of sap_ratio vs DBH?

teixeirak commented 5 years ago

My calculations are consistently under the dryad file calculations (on a per-sp basis), by roughly at least 5 cm. I've used the equation, though, so I'm unsure why I'm consistently underestimating.

I'm not sure what's going on. There are a few things that could be going on:

For trees that were cored, the file on Dryad uses actual measurements (as opposed to the equation's predictions) to get sapwood depth, and from that sapwood area. This means that you shouldn't expect all values to match. However, there shouldn't be a systematic underestimate.
Problem may be that a bias correction needs to be applied. I'll have to dig into this.

I also think it would be helpful for me to dig up the original data.

My data management system was not as good back then! Wish we had all this on GitHub.

mcgregorian1 commented 5 years ago

Ok, that's what I thought for sap_ratio.

The only thing I can think of is that when I was going back through and calculating dbh, I was using the power fitting equation, whereas now I'm not using it. This could mean that my dbh calculation wasn't properly scaled (as far as I could tell after going through everything step by step this morning). Potentially now that the dbh is more accurate, position_all isn't as important.

As for a plot of sap_ratio to dbh, it is here:

mcgregorian1 commented 5 years ago

My calculations are consistently under the dryad file calculations (on a per-sp basis), by roughly at least 5 cm. I've used the equation, though, so I'm unsure why I'm consistently underestimating.

I'm not sure what's going on. There are a few things that could be going on:

For trees that were cored, the file on Dryad uses actual measurements (as opposed to the equation's predictions) to get sapwood depth, and from that sapwood area. This means that you shouldn't expect all values to match. However, there shouldn't be a systematic underestimate.

Problem may be that a bias correction needs to be applied. I'll have to dig into this.

I also think it would be helpful for me to dig up the original data.

My data management system was not as good back then! Wish we had all this on GitHub.

The systematic underestimate is from me checking about 5 individuals from different species and finding equivalent individuals (based on dbh and sp) in the dryad data, then comparing their sapwood area. That may not be the most robust but the dryad data uses different trees than what I have in my sample set.

teixeirak commented 5 years ago

The only thing I can think of is that when I was going back through and calculating dbh, I was using the power fitting equation, whereas now I'm not using it. This could mean that my dbh calculation wasn't properly scaled (as far as I could tell after going through everything step by step this morning). Potentially now that the dbh is more accurate, position_all isn't as important.

Actually, its better to use a power fit equation than the log-log fit and back-transformation. It does tend to underestimate high-end values (which may also be what you're seeing with sapwood).

teixeirak commented 5 years ago

Those sapwood area ratios look awfully small. We expect values between 0 and 1, but probably noting below 0.1 and with some species getting close to 1. Check your units. Do you have sapwood area in cm^2 and total area in mm^2?

mcgregorian1 commented 5 years ago

Those sapwood area ratios look awfully small. We expect values between 0 and 1, but probably noting below 0.1 and with some species getting close to 1. Check your units. Do you have sapwood area in cm^2 and total area in mm^2?

Yes you are completely right. That's my bad. I had the area without bark in mm (from the dbh).

mcgregorian1 commented 5 years ago

The only thing I can think of is that when I was going back through and calculating dbh, I was using the power fitting equation, whereas now I'm not using it. This could mean that my dbh calculation wasn't properly scaled (as far as I could tell after going through everything step by step this morning). Potentially now that the dbh is more accurate, position_all isn't as important.

Actually, its better to use a power fit equation than the log-log fit and back-transformation. It does tend to underestimate high-end values (which may also be what you're seeing with sapwood).

Ah ok. The steps I used to calculate dbh this time around are documented here (and are in the code as well). Is this technically not correct then?

#1. Calculate diameter_nobark for 2008 = DBH.mm.2008-2*bark.depth.mm
#2. ln-transform both diam_nobark_2008 (x) and bark.depth.mm (y)
#3. Fit a linear model, and use model to predict ln(bark.depth.mm)
#4. Take exponent of bark.depth.mm and make sure predicted values look good.
#5. Get mean bark thickness per species in 2008.
## The equation for calculating old dbh, using 1999 as an example, is
## dbh1999 = dbh2008 - 2(ring.width2013 - ring.width1999) - 2(bark.depth2008) + 2(bark.depth1999)

## using the dataset from calculating the regression equations, we can get mean bark thickness per species in 2008.

#6.Thus, the only value we're missing is bark depth in 1999.
## This is ok, because we can calculate from the regression equation per each species (all we need is diam_nobark_1999).Calculate diam_nobark_1999 using
## diam_nobark_1999 = dbh2008 - 2*(bark.depth2008) - 2*(sum(ring.width1999:ring.width2008))

#7. Calculate bark thickness using regression equation per appropriate sp
## ln(bark.depth.1999) = intercept + ln(diam_nobark)*constant
## bark.depth.1999 = exp(ln(bark.depth.1999))

#8. Add to solution from #6 to get full dbh1999
## dbh1999 = diam_nobark_1999 + 2*bark.depth.1999

mcgregorian1 commented 5 years ago

Using the raw data I've calculated regression equations that resemble below. The fit seems to be pretty good (noting the bark thickness I subtracted from dbh is from 2008 despite the sapwood depth for the raw data was measured in 2010).

From there I've replotted the graph from earlier, and I think it looks more like it should

teixeirak commented 5 years ago

I agree, that looks better. But what's the species with sapwood ratio of 0?

mcgregorian1 commented 5 years ago

Apparently it's a mix of 120 fram, quru, and quve individuals. None of them are at 0, but rather 0.0004 or something like that.

mcgregorian1 commented 5 years ago

Right, my bad on all of this. I had accidentally put the z element as the r^2 value. This is now correct. I can say with confidence that that one anomaly line represents the fram individuals.

mcgregorian1 commented 5 years ago

Now with the correct sapwood ratio calculated, this is what I'm seeing for all models:

For comparison, the coefficients of the full model (number 64) are here:

teixeirak commented 5 years ago

I can say with confidence that that one anomaly line represents the fram individuals.

Hmmm... FRAM sapwood area was the most uncertain of all. We might revisit this one.

NEVERMIND! I was being fast and didn't pay enough attention to the acronym. FAGR is the one that was most uncertain. FRAM is fine. I checked the raw data, and that line makes sense.

teixeirak commented 5 years ago

FYI, original sapwood data are now posted here.

teixeirak commented 5 years ago

Looking back at my original files, my sapwood area ratios look somewhat different. This may be an order of operations issue. I'll have to retrieve my original scripts and figure out what's going on. All_species_ratio

mcgregorian1 commented 5 years ago

I can say with confidence that that one anomaly line represents the fram individuals.

Hmmm... FRAM sapwood area was the most uncertain of all. We might revisit this one.

NEVERMIND! I was being fast and didn't pay enough attention to the acronym. FAGR is the one that was most uncertain. FRAM is fine. I checked the raw data, and that line makes sense.

Ok, that's good to hear. I put 0 bark thickness for FAGR anyways, so we know that if anything should be amended, it should be that

teixeirak commented 5 years ago

Looking back at my code, the figure above represents basal area including bark, whereas yours excludes bark. I think its better to exclude bark.

However, what was your order of operations? In the plot above, I took the ratio before fitting regressions, and that's preferable to taking the ratio of two predicted values. This may explain why FRAM is an outlier in your plots but not mine.

mcgregorian1 commented 5 years ago

Ah ok.

Below are my steps for calculating the ratio. Are you saying you took the ratio from the raw sapwood area data and then fit the the ratio in a regression equation?

#get radius.w/o.bark mm and convert to cm for ratio further down
#area without bark = pi*(radius.w/o.bark)^2 (cm^2)

#calculate sapwood area
##sapwood area = tree area (minus bark) - heartwood area

##subtract bark thickness from dbh
##NOTE bark thickness is from 2008, even tho sap data collected 2010

#heartwood radius = 0.5*dbh – sapwood depth (mm)
#Heartwood area = pi*(heartwood radius)^2 (mm^2)

#Sapwood area = pi*((0.5*dbh)^2) – heartwood area

#calculate the regression equations and solve for log sapwood area
#take the exponent of the log sapwood area, then calculate ratio

#ratio = sapwood area:area without bark

mcgregorian1 commented 5 years ago

I've plotted the sapwood ratio against dbh for the plants in the original sapwood dataset you found and I get the same graph as you did.

I believe the reason for mine looking so different might be because of the cored trees we have. The trees we have in the cored dataset (that I'm using) have <50% overlap with the trees in the sap dataset, thus the main way for me to get that original sapwood area is from a fit regression equation.

Originally I was using a regression equation to solve for sap area from DBH. From your order of operations, did you use DBH to solve directly for sap ratio? If I do that, the data is much more scattered, and the r2 values go from >60% to <35% in all cases. (Apologies for the jumbled equations and points)

teixeirak commented 5 years ago

Here's my code for calculating sapwood area ratio:

I then fit power functions using nonlinear regression.

Despite lower R2, I see this as the preferred order of operations, so let's go with that.

When you have a chance, please run the statistics using that method of calculation, and we can then determine our best/ base model.

mcgregorian1 commented 5 years ago

Ok. I double-checked and I do have that order of operations. I've made sure the units are consistent, and I get a regression equations looking like this.

I've also updated the tables here (results_individual_years1 and full_models_dAIC1). I'll now address your most recent comment in #23.

* I'm keeping the "1" in the file names for now just because it's been almost a month since I updated these. I'll clear out the old ones soon.

mcgregorian1 commented 5 years ago

From #23

I was looking back at the data and was realizing the dbh (and thus the height and sap ratio) hadn't merged properly and were all 1 value per tree, instead of different values for each year.

After fixing this, sap ratio is significant for all years except for 1966. I've uploaded this to github (full_mod_all)

teixeirak commented 5 years ago

Having given this more thought, I don't think we'll want to include predicted sapwood ratio in the main model (for disentangling size effects), and possibly not at all. This is because its simply a function of DBH (and species), which means we're just testing a slightly different iteration of DBH (note that the same problem applies to height, but there's a tighter relationship there). There are countless aspects of tree form and function that scale with DBH and could affect drought sensitivity, so predicted sapwood ratio is not all that meaningful. The more meaningful test will be using the measured sapwood ratio values on the subset of trees for which we have data.

mcgregorian1 commented 5 years ago

Ok, that makes sense. I'll leave the code in case we want to come back to it.

mcgregorian1 commented 5 years ago

I'm going to close this issue since we decided to not include sap_ratio

SCBI-ForestGEO / McGregor_climate-sensitivity-variation

consider sapwood depth as a predictor #21