OHI-Science / ohi-science.github.io

Ocean Health Index - website
ohi-science.org

Lag for temporal reference pt and trend (number of data points) #8

Open jennifergriffiths opened 8 years ago

jennifergriffiths commented 8 years ago

This issue comes from working with the BHI project ECO calculation, but I think it is generally applicable. It concerns lag years, how we think about them, and what that means for the number of data points.

In general terms, our model is: current value / reference point, where the reference point is the value from 5 years ago.

In our code we say the lag = 5. Therefore, if the current year = 2015, the reference year = 2010. For current year 2014, the reference year = 2009, etc.
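As a minimal sketch of the lag arithmetic described above (the function names and the values in `values_by_year` are hypothetical, made up for illustration, and not taken from the BHI or OHI code):

```python
# Sketch of a temporal reference point with a fixed lag.
# All names and numbers here are illustrative assumptions.
LAG = 5  # lag in years, as stated in the issue


def reference_year(current_year, lag=LAG):
    """Return the year whose value serves as the reference point."""
    return current_year - lag


def status(values_by_year, current_year, lag=LAG):
    """Current value divided by the value `lag` years earlier."""
    ref_year = reference_year(current_year, lag)
    return values_by_year[current_year] / values_by_year[ref_year]


# Hypothetical data, only for the years needed below.
values_by_year = {2009: 90.0, 2010: 100.0, 2014: 108.0, 2015: 110.0}

for year in (2014, 2015):
    print(year, reference_year(year), status(values_by_year, year))
```

With lag = 5, the current year and the reference year are five years apart, which is a separate question from how many data points the trend uses.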

For the trend calculation, should there then be 6 data points? For the trend over the "previous 5 years," are we still saying current year (2015) minus 5? That would mean we have six data points for the status, from the year 2010 to 2015.

See the manual on trend calculation (6.3.4 Trend Calculation): where it describes the linear model, it has the comment "# select the most recent 5 years of data", but the code results in 6 data points.

Melsteroni commented 8 years ago

Trend is typically calculated using 5 years of data points (e.g., 2010, 2011, 2012, 2013, 2014). To establish trend years, we use the most recent year minus 4.
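The off-by-one between "most recent year minus 4" and "most recent year minus 5" can be illustrated with a short sketch. The helper name `trend_years` is hypothetical and not part of the OHI toolbox:

```python
# Sketch: which years fall in the trend window.
# `trend_years` is an illustrative helper, not an OHI function.


def trend_years(most_recent_year, n_points=5):
    """Return the n_points consecutive years ending at most_recent_year."""
    first_year = most_recent_year - (n_points - 1)  # 5 points -> minus 4
    return list(range(first_year, most_recent_year + 1))


five_points = trend_years(2014)           # 2010 through 2014
six_points = list(range(2014 - 5, 2015))  # "minus 5" gives 2009 through 2014

print(len(five_points), five_points)
print(len(six_points), six_points)
```

Subtracting 5 from the most recent year spans five one-year intervals, which yields six data points; subtracting 4 yields the five data points described here.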

Thanks for pointing out the discrepancy in the manual. We will update this section!

It is good to be consistent with trend calculations (unless there is a good reason to deviate from the standard protocol). However, because the trend estimates the change in status per year, using 5 vs. 6 years of data will not typically have a large effect (it is essentially similar to using a sample size of 5 vs. 6 to estimate an average).
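To make the 5 vs. 6 comparison concrete, here is a small sketch that fits an ordinary least-squares slope to the same series using the last 5 and the last 6 years. The status values are invented for illustration, and the pure-Python `ols_slope` helper is an assumption standing in for whatever linear model the actual code uses:

```python
# Sketch: slope (change in status per year) from 5 vs. 6 data points.
# The status values are hypothetical; ols_slope is an illustrative helper.


def ols_slope(years, values):
    """Ordinary least-squares slope of values regressed on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


status_by_year = {
    2010: 0.70, 2011: 0.72, 2012: 0.71,
    2013: 0.74, 2014: 0.75, 2015: 0.77,
}


def slope_for_last(n_points, most_recent_year=2015):
    """Slope using the n_points consecutive years ending at most_recent_year."""
    years = list(range(most_recent_year - n_points + 1, most_recent_year + 1))
    return ols_slope(years, [status_by_year[y] for y in years])


slope_5 = slope_for_last(5)  # years 2011 through 2015
slope_6 = slope_for_last(6)  # years 2010 through 2015
print(slope_5, slope_6, abs(slope_5 - slope_6))
```

For this made-up series the two slopes differ by less than 0.001 per year, consistent with the point that the choice rarely matters much; real data with a sharp change in the extra year could differ more.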

jennifergriffiths commented 8 years ago

Thanks!

Quick follow-up. Okay, so we have 5 data points for the trend. For a temporal reference point, when we write "the reference point is the value five years ago", would we still do: ref_point_year = year - 5?

Quick note: the NP code in the BHI functions.r is also pretty explicit about having 6 data points (for 5 one-year time intervals) for trend. I'm not sure of the origin of this code, but it might be worth looking to see where else that occurs. I'll look at the BHI code as I go.

Melsteroni commented 8 years ago

I agree: if "the reference point is the value five years ago", then it would be ref_point_year = year - 5 (the trend years and the reference point year aren't related, so they do not need to be the same).

Yes, the logic to use 5 time "intervals" (i.e., 6 data points) to calculate trend crept in at one point, but it doesn't make sense because regression models are calculated using data points and not intervals. We have eradicated this issue in the global analyses, but it sounds like there are lingering cases. Please let us know as you find them!

Thanks!
