Open Kah5 opened 1 year ago
This vignette may be useful for implementing LFO, but we will have to refit the model more directly because we are not using rstanarm or brms.
Another issue is that (I think) we should be doing the one-step-ahead predictions for each tree, at each held-out time point. I have updated the loop in the code to do this, but then we run into the problem that not all trees have increments extending to the same year. For example, some have a last year of 1975 while others have a last year of 1997. This means we can only do the LFO for some trees. This should be fine, but I still need to figure out the indexing for this.
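One possible way to handle the indexing, as a rough sketch rather than what the loop currently does: for each held-out year, keep only the trees whose increment series actually reaches that year. The function name `lfo_pairs`, the tree ids, and the 1987 start of the held-out window are all hypothetical and only there for illustration.

```python
def lfo_pairs(last_year, first_holdout, final_year):
    """Map each held-out year to the trees that have an increment in that year.

    last_year: dict of tree id -> last year with a measured increment
    (hypothetical structure; the real data may be stored differently).
    """
    pairs = {}
    for year in range(first_holdout, final_year + 1):
        # A tree is eligible only if its series extends at least to this year.
        trees = sorted(t for t, ly in last_year.items() if ly >= year)
        if trees:
            pairs[year] = trees
    return pairs


# Toy example: three trees with different final years.
last_year = {"tree_a": 1975, "tree_b": 1997, "tree_c": 2001}
pairs = lfo_pairs(last_year, first_holdout=1987, final_year=2001)
```

Here `tree_a` never enters the LFO because its series ends before the held-out window, `tree_b` drops out after 1997, and `tree_c` is used for every held-out year.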
We have not tackled one-step-ahead validation, but I have included out-of-sample diameter validation (prediction with posterior estimates, and including mortality), as well as in-sample and held-out increment validation.
Ideally, we want to do a leave-future-out (LFO) cross-validation for our time series data, as in Auger-Méthé et al. (https://esajournals.onlinelibrary.wiley.com/authored-by/ContribAuthorRaw/Auger%E2%80%90M%C3%A9th%C3%A9/Marie). But with LFO, or approximate LFO, we would need to refit the model with only a subset of the past increment data, for example 1966-1986 instead of the full range from 1966-2001. This means that all DBH measurements would be omitted as well, and the Stan model won't run or be constrained by DBH.
So, I propose trying the LFO model refitting, but keeping the DBH measurements. This still involves refitting the models, which is non-trivial and computationally intensive in our case. I can do it for the smaller models, but the full regional model takes a while.
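To make the proposal concrete, here is a minimal sketch of how the data for one refit could be split: increments after a cutoff year are held out, while every DBH record is kept. The record layout (dicts with `tree`, `year`, `inc`, `dbh` keys) and the function name `lfo_training_data` are assumptions for illustration, not the format the Stan model actually takes.

```python
def lfo_training_data(increments, dbh, cutoff_year):
    """Split increments at cutoff_year while keeping all DBH measurements.

    increments, dbh: lists of dicts with at least a "year" key
    (hypothetical layout for this sketch).
    """
    train_inc = [r for r in increments if r["year"] <= cutoff_year]
    heldout_inc = [r for r in increments if r["year"] > cutoff_year]
    # DBH is deliberately NOT filtered, so the model stays constrained by it.
    train = {"increments": train_inc, "dbh": list(dbh)}
    return train, heldout_inc


# Toy records for a single tree.
increments = [
    {"tree": "tree_b", "year": 1985, "inc": 0.21},
    {"tree": "tree_b", "year": 1986, "inc": 0.18},
    {"tree": "tree_b", "year": 1987, "inc": 0.25},
]
dbh = [{"tree": "tree_b", "year": 1995, "dbh": 31.4}]

train, heldout = lfo_training_data(increments, dbh, cutoff_year=1986)
```

Note that keeping DBH measurements taken after the cutoff means the refit still sees some information from the held-out period, so this is a compromise rather than a strict LFO.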
An alternative is to not refit the model, but walk the estimates from 1986-2001 forward with the posterior values of the model, then use the new diameter estimates to predict the next time point. This isn't truly a "leave future out" cross validation, but it gives us an idea of how biased the forecasts could be.
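A rough sketch of what I mean by walking the estimates forward: for each posterior draw, predict the increment from the current diameter, add it to get the next diameter, and repeat through the last year. The linear increment model `b0 + b1 * diameter` is purely a placeholder assumption, not our actual process model, and the draws below are made-up numbers. Diameter and increment are assumed to be in the same units.

```python
def walk_forward(d_start, draws, start_year, end_year):
    """Propagate a diameter forward one year at a time for each posterior draw.

    draws: list of (b0, b1) tuples; the increment model b0 + b1 * diameter
    is a hypothetical stand-in for the fitted model.
    """
    paths = []
    for b0, b1 in draws:
        d = d_start
        path = {}
        for year in range(start_year + 1, end_year + 1):
            inc = max(0.0, b0 + b1 * d)  # increments cannot be negative
            d = d + inc                  # updated diameter feeds the next step
            path[year] = d
        paths.append(path)
    return paths


def mean_bias(paths, year, observed):
    """Mean predicted diameter minus the observed diameter in a given year."""
    preds = [p[year] for p in paths]
    return sum(preds) / len(preds) - observed


# Made-up draws with no size dependence, starting from a 1986 diameter of 30.
draws = [(0.2, 0.0), (0.4, 0.0)]
paths = walk_forward(30.0, draws, start_year=1986, end_year=2001)
bias_2001 = mean_bias(paths, 2001, observed=34.0)
```

Comparing the walked-forward diameters with observed DBH in later years, as `mean_bias` does, gives a simple per-year measure of forecast bias without refitting anything.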