Closed by Aariq 1 year ago
With solar radiation, I think a theoretical model that calculates values based on latitude and day-of-year for a clear day (no clouds) would be the best first step. For QA/QC of station solar radiation values, our biggest concerns are dust/mud/bird crap/etc. partially obscuring the sensor and artificially lowering values on clear days, as well as sensor drift between calibration visits (typically every quarter). Being able to compare clear-day values between the theoretical model and station measurements would be valuable. Perhaps plotting a time series of both on the 'AZMet QA/QC dashboard' would be good?
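For illustration, here is a minimal sketch of one possible clear-day model. It assumes the FAO-56 formulas for daily extraterrestrial radiation and the simple elevation-based clear-sky approximation, which nobody in this thread has chosen. The function names and the example latitude, day, and elevation are hypothetical, not AZMet code or station metadata.

```python
import math

GSC = 0.0820  # solar constant in MJ m^-2 min^-1, as used in FAO-56


def extraterrestrial_radiation(lat_deg, doy):
    """Daily extraterrestrial radiation Ra (MJ m^-2 day^-1) from latitude and day-of-year."""
    phi = math.radians(lat_deg)
    # Inverse relative Earth-Sun distance and solar declination
    dr = 1 + 0.033 * math.cos(2 * math.pi * doy / 365)
    delta = 0.409 * math.sin(2 * math.pi * doy / 365 - 1.39)
    # Sunset hour angle; the clamp keeps acos defined at polar latitudes
    x = -math.tan(phi) * math.tan(delta)
    ws = math.acos(max(-1.0, min(1.0, x)))
    return (24 * 60 / math.pi) * GSC * dr * (
        ws * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(ws)
    )


def clear_sky_radiation(lat_deg, doy, elev_m=0.0):
    """Approximate clear-day total Rso = (0.75 + 2e-5 * elevation) * Ra."""
    return (0.75 + 2e-5 * elev_m) * extraterrestrial_radiation(lat_deg, doy)


# Illustrative values only (not real station metadata)
rso = clear_sky_radiation(lat_deg=32.0, doy=172, elev_m=700.0)
print(f"Clear-day estimate: {rso:.1f} MJ m^-2 day^-1")
```

On a known clear day, the ratio of the measured daily total to this estimate could be plotted over time. A slow decline in that ratio would hint at a dirty sensor or drift, though the threshold for flagging would need to be tuned against real station data.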
"Seasonal naive" is equivalent to ARIMA(0,0,0)(0,1,0)[365] (i.e. an ARIMA with no auto-regressive or moving-average terms, only a seasonal difference at lag 365). Seasonal naive seems to be as good as or better than what the auto ARIMA selects for temperature and solar radiation. Auto ARIMA also suggests something slightly different for different stations, so maybe seasonal naive is a good one-size-fits-all model to start with. Although, I'm surprised that an auto-regressive component doesn't seem to improve the model (if it was hot yesterday, it should be hot today, right?).
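To make the equivalence concrete: an ARIMA(0,0,0)(0,1,0)[m] model with no constant says y_t - y_(t-m) = e_t, so its point forecast for any day is the value observed one season earlier. A minimal sketch in plain Python follows; the function names are hypothetical, and the toy series uses a short period rather than 365.

```python
def seasonal_naive_forecast(history, period, horizon):
    """Forecast by repeating the last full season of observations."""
    if len(history) < period:
        raise ValueError("need at least one full season of history")
    last_season = history[-period:]
    return [last_season[h % period] for h in range(horizon)]


def seasonal_naive_residuals(history, period):
    """In-sample residuals: each value minus the value one season earlier."""
    return [history[t] - history[t - period] for t in range(period, len(history))]


# Toy series with period 3 (the thread uses daily data with period 365)
series = [1, 2, 3, 4, 5, 6]
print(seasonal_naive_forecast(series, period=3, horizon=4))
print(seasonal_naive_residuals(series, period=3))
```

The residuals here are exactly the seasonally differenced series, which is what the diagnostics for this model are computed on.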
Exploring using GAMs instead of ARIMA:
The benefits here would be:
1) ability to use stations as a random effect so they could inform each other
2) ARIMA assumes normally distributed residuals, which isn't the case with some of the variables. GAMs would allow for using different distributions such as scaled t for leptokurtic residuals and even mixture distributions like zero-adjusted gamma (for precip) in the gamlss package (see the vignette)
3) it might even be faster??
Plenty of options with GAMs to fit the data better, but seems like GAMs are bad at forecasting.
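As a rough illustration of what a zero-adjusted gamma means for precip: a point mass at zero (dry days) mixed with a gamma density for positive amounts. The sketch below writes out that log-density using only the standard library. The shape/scale parameterization and the function name are assumptions for illustration and may differ from how gamlss parameterizes the distribution.

```python
import math


def zero_adjusted_gamma_logpdf(y, p_zero, shape, scale):
    """Log-density of a mixture: P(Y = 0) = p_zero, otherwise Gamma(shape, scale)."""
    if y < 0:
        raise ValueError("y must be non-negative")
    if not 0.0 < p_zero < 1.0:
        raise ValueError("p_zero must be strictly between 0 and 1")
    if y == 0:
        return math.log(p_zero)  # dry day: all mass comes from the point at zero
    log_gamma_density = (
        (shape - 1) * math.log(y)
        - y / scale
        - math.lgamma(shape)
        - shape * math.log(scale)
    )
    return math.log1p(-p_zero) + log_gamma_density  # wet day: (1 - p_zero) * gamma


# Illustrative parameter values only, not fitted to any station
print(zero_adjusted_gamma_logpdf(0.0, p_zero=0.8, shape=0.7, scale=5.0))
print(zero_adjusted_gamma_logpdf(2.5, p_zero=0.8, shape=0.7, scale=5.0))
```

A normal-residual model cannot represent the spike at zero at all, which is the motivation for considering a mixture like this.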
Model diagnostics for the current "seasonal naive" model for solar radiation show autocorrelation and non-normal (leptokurtic) residuals. See if this can be improved, maybe by figuring out how to use ARIMA models with fable.
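The two problems named here can be quantified with a couple of summary numbers. This is a minimal sketch, assuming residuals are available as a plain list of floats; the function names are hypothetical and this is not the diagnostic output fable produces. A lag-1 autocorrelation far from 0 suggests leftover serial dependence, and a positive excess kurtosis suggests heavier-than-normal tails.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def excess_kurtosis(x):
    """Population excess kurtosis: about 0 for normal data, > 0 when leptokurtic."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2 - 3.0


# Made-up residuals with one large outlier, for illustration only
residuals = [0.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(autocorrelation(residuals, lag=1))
print(excess_kurtosis(residuals))
```

Comparing these numbers before and after adding non-seasonal AR or MA terms would be one way to check whether a richer ARIMA actually helps.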