Closed: sarah-friesen closed this issue 5 years ago
For the heatmap, it makes sense to use the averages over the entire time series, because the aim is to show how each year varies relative to every other year. For a t-test comparing the current year's fork lengths to the time-series fork length average, however, it doesn't make sense to include the current year in the time-series average, because the two samples being compared would not be independent of each other. I'm not convinced I had this clearly in my head when conducting some of the significance tests, so this needs to be reviewed.
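As a rough illustration of the independence point, here is a minimal Python sketch that drops the current year from the baseline before computing a test statistic. Everything in it is an assumption made for illustration: the fork length values are invented, the helper names `split_baseline` and `welch_t` are hypothetical, and pooling individual observations from the other years into one baseline sample (with Welch's unequal-variance t statistic) is only one possible way to set up the comparison, not necessarily what the analysis here does.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical fork lengths (mm) keyed by survey year; invented values, not real data.
fork_lengths = {
    2015: [98.0, 102.0, 105.0, 110.0],
    2016: [101.0, 99.0, 108.0, 104.0],
    2017: [95.0, 103.0, 100.0, 107.0],
    2018: [112.0, 115.0, 109.0, 118.0],
}
current_year = 2018


def split_baseline(data, current):
    """Return (current-year sample, pooled sample from every other year)."""
    current_sample = list(data[current])
    # Excluding the current year keeps the two samples free of shared observations.
    baseline = [x for year, xs in data.items() if year != current for x in xs]
    return current_sample, baseline


def welch_t(a, b):
    """Welch's t statistic and its approximate degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


current_sample, baseline = split_baseline(fork_lengths, current_year)
t_stat, df = welch_t(current_sample, baseline)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

The sketch stops at the t statistic and degrees of freedom; getting a p-value would need a t distribution from a statistics library, which is left out to keep the example dependency-free.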
The temporal restriction is only applied to `current_year` because in 2018 we conducted a number of additional seines later in the season than we ever had before. Those observations needed to be removed to ensure we're comparing the same period of time in all years.
In that case, I am going to put the same temporal restriction on the time series data as on the current year for consistency.
I have added the 2019 temperature data to the time series averages for all comparisons, except when running the t-test.
Right now, many of the time series average calculations include the current year, and those averages are then compared to the current year's averages. It seems strange to me to include the current year in the time series average, but I am not familiar with time series data, so perhaps this is common practice?
In addition, the time series averages are currently calculated without any temporal restriction, while the current year averages are calculated after restricting the data to 32 < ydat < 213. If it makes sense to apply this temporal restriction, why isn't it applied to the whole time series?
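To make the question concrete, here is a small Python sketch of applying one and the same window to every year before averaging. It assumes `ydat` is the day of year and uses the bounds quoted above; the records, the `in_window` and `yearly_means` helpers, and the temperature values are all hypothetical and only there to show the idea.

```python
from statistics import mean

# Window quoted in the discussion: 32 < ydat < 213 (ydat assumed to be day of year).
YDAT_MIN, YDAT_MAX = 32, 213

# Hypothetical records as (year, ydat, value); invented values, not real data.
records = [
    (2017, 120, 9.5),
    (2017, 200, 12.0),
    (2018, 150, 10.5),
    (2018, 230, 14.0),  # late-season seine, falls outside the window
    (2019, 30, 7.0),    # too early, falls outside the window
    (2019, 180, 11.5),
]


def in_window(ydat):
    """True when the observation falls strictly inside the shared window."""
    return YDAT_MIN < ydat < YDAT_MAX


def yearly_means(rows):
    """Group values by year and return the mean for each year."""
    by_year = {}
    for year, _, value in rows:
        by_year.setdefault(year, []).append(value)
    return {year: mean(values) for year, values in by_year.items()}


# Filter once, up front, so every year is restricted in the same way.
restricted = [row for row in records if in_window(row[1])]
means = yearly_means(restricted)
print(means)
```

Filtering before any grouping means the current year and the rest of the time series cannot end up covering different parts of the season.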