Conte-Ecology / conteStreamTemperature

Package for cleaning and analyzing daily stream temperature data
MIT License

Prediction Function Not Working Properly #32

Closed djhocking closed 8 years ago

djhocking commented 9 years ago

I was getting bad predictions but wasn't sure if it was in the data prep, model, or prediction function. I did some testing and found that it's in the prediction function somewhere but I can't figure out where. It could be in the organizing of the coefficients, especially the conditional (site, huc, year-specific) coefficients. The predictions fall on the 1:1 line but with way more scatter than there should be.

I wasn't organized enough in my git usage so it's hard to track the problem, but I have a large git commit history so I'm going to create a backup branch (backup20150811) and a test branch (iss32) and see if I can revert to code that was working and find the problem that way. Unfortunately, it's even possible that it's in the coefficient output organization in the conteStreamTemperature_northeast repo.

djhocking commented 9 years ago

It turns out that the issue was with adding the autoregressive structure into the predictions. I wasn't using the deployment starts correctly for the full time series; I was only using the first deployment at a given site.
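A minimal sketch of the fix, in Python with invented names (`rho`, the list-based data layout) since the actual package is R: the AR1 residual carry-over has to reset at the start of every deployment at a site, not just the first one.

```python
def ar1_predict(trend, observed, deploy_start, rho=0.8):
    """Add an AR1 residual correction to daily trend predictions for one site.

    trend, observed: daily values (None in `observed` = no logger reading).
    deploy_start: booleans, True on the first day of each deployment.
    Illustrative only; names and structure are not from the package.
    """
    preds = []
    resid = 0.0
    for mu, obs, start in zip(trend, observed, deploy_start):
        if start:
            resid = 0.0  # the bug: this reset only happened for a site's first deployment
        preds.append(mu + rho * resid)
        # carry a residual forward only where an observation exists
        resid = (obs - mu) if obs is not None else 0.0
    return preds
```

With the reset in place, the first prediction of every deployment falls back to the trend instead of inheriting a stale residual from a different deployment.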

Interestingly and unfortunately, the predicted trend before accounting for autocorrelation in the residuals has a RMSE = 1.6 C. This suggests that where data isn't available there's a lot of uncertainty in the accuracy of our predictions (similar to other regional models). I don't think this was showing up before because my validation data has temperature observations so when I ran the predictions it took this into account. The real validation (for predicting to unmonitored days or sites) should be using the trend before the temporal autocorrelation correction (i.e. there should be no residuals to correct for).
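The validation point above can be made concrete with a toy RMSE comparison (all numbers invented for illustration): wherever observations exist, the AR1-corrected predictions track them closely, so their RMSE understates the error you would see at an unmonitored site or day; the honest number there is the trend-only RMSE.

```python
import math

def rmse(obs, pred):
    """Root mean squared error over paired values (None = missing observation)."""
    pairs = [(o, p) for o, p in zip(obs, pred) if o is not None]
    return math.sqrt(sum((o - p) ** 2 for o, p in pairs) / len(pairs))

obs       = [14.0, 15.0, 17.0, 16.0]
trend     = [15.5, 16.0, 15.0, 17.5]   # trend alone: larger errors
corrected = [14.2, 15.1, 16.8, 16.1]   # trend + AR1, leans on yesterday's residual
```

Here `rmse(obs, corrected)` is far smaller than `rmse(obs, trend)`, which is exactly why validating with the corrected predictions flatters the model.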

The question remains, why doesn't our model consistently do better at predicting temperatures for days without data and why did it seem to do better in the past? Obviously it did better in the past because we were comparing apples and oranges by including the observations and autocorrelation in the validation predictions. But I think it's more than that since we had better RMSE in earlier models with the AR1 term. My guess is that we were using data more concentrated in MA and CT with less variability. The data we were using previously had also all be checked by hand more carefully and we were likely avoiding using much data with influence from impoundments. Basically, I think we had a great central-New England model for minimally affected streams and now we have a more general model that works across a large region and huge variety of conditions. The current model requires only interpolation and not extrapolation and therefore probably better represents our uncertainty when predicting in space and time.

It's also possible that I'm not capturing the synchronized period well enough in parts of ME.

My only real surprise is that the random site and huc effects don't do a better job when there is data. It looks like some sites are consistently over or under predicted even when they have a lot of data but then the AR1 term corrects for that. I'm not sure why that is.

The question is what to do about this. One option is to do nothing and accept that this is a very good regional model. Another is to filter out and avoid predicting at sites near dams/impoundments until we can handle those effects better. A third option is to see if we can improve the synchronized season across the region. A fourth option is to restrict analysis and predictions to smaller streams. Currently I predict to streams < 200 km^2. That is pretty big and far enough downstream that there are likely many upstream dams. For comparison, West Brook where we survey is ~27 km^2. I drove out to a site in Greenfield on the Green River yesterday after work (thanks Ana for finding rivers of different sizes!) to see a 200 km^2 stream section. It was ~22 m wide and moving fast enough (and muddy enough) that I couldn't see the bottom. Maybe 2 m deep in the center. We could potentially run the same model for bigger streams in the future (including data from gaged sites) since the physical properties are going to be different for these streams. Excluding bigger streams and sites with loggers near impoundments could help so we're not trying to model so many temperatures over 25 C, where evaporative cooling could be an issue as well.
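The drainage-area cutoff and the impoundment filter could be combined into a simple pre-prediction screen; a sketch, with field names invented for illustration:

```python
def filter_prediction_sites(sites, max_drainage_km2=200.0, exclude_impounded=True):
    """Keep sites below the drainage-area cutoff and, optionally,
    drop sites with loggers near known dams/impoundments.

    sites: list of dicts with 'drainage_km2' and 'near_impoundment' keys
    (hypothetical layout, not the package's actual data structure).
    """
    return [
        s for s in sites
        if s["drainage_km2"] < max_drainage_km2
        and not (exclude_impounded and s.get("near_impoundment", False))
    ]

# Example: a West Brook-sized site passes; a 250 km^2 site and a
# dam-adjacent site are screened out.
sites = [
    {"id": 1, "drainage_km2": 27.0, "near_impoundment": False},
    {"id": 2, "drainage_km2": 250.0, "near_impoundment": False},
    {"id": 3, "drainage_km2": 50.0, "near_impoundment": True},
]
kept = filter_prediction_sites(sites)
```

Tightening `max_drainage_km2` (or swapping in a different covariate for stream size) is then a one-line change when testing the fourth option.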

Here is a plot of the observed vs. expected for the modeled data with the trend in red and the AR1 corrected in blue.

[plot: observed vs. expected, trend (red) and AR1-corrected (blue)]

djhocking commented 8 years ago

It looks like the biggest problem is bad data getting through QAQC.