Improve model-based validations

uace-azmet / azmet-forecast-qa

Developing QA/QC routines for AZMet


Improve model-based validations #15

Closed Aariq closed 1 year ago

Aariq commented 1 year ago

Model diagnostics for the current "seasonal naive" model for solar radiation show autocorrelated and non-normal (leptokurtic) residuals. See whether this can be improved, perhaps by figuring out how to use ARIMA models with fable.
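For illustration only, the two symptoms mentioned above (autocorrelation and heavy tails in the residuals) could be checked with something like the following. This is a minimal standard-library Python sketch, not the fable diagnostics actually used in the repo; the function names and the idea of passing a plain list of daily values are assumptions.

```python
from statistics import fmean


def seasonal_naive_residuals(y, period):
    """Residuals of a seasonal naive model: y[t] minus the value one period earlier."""
    return [y[t] - y[t - period] for t in range(period, len(y))]


def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation; values far from 0 suggest leftover structure."""
    m = fmean(x)
    d = [v - m for v in x]
    denom = sum(v * v for v in d)
    return sum(a * b for a, b in zip(d[1:], d[:-1])) / denom


def excess_kurtosis(x):
    """Sample excess kurtosis; values well above 0 indicate leptokurtic residuals."""
    m = fmean(x)
    d = [v - m for v in x]
    m2 = fmean([v ** 2 for v in d])
    m4 = fmean([v ** 4 for v in d])
    return m4 / m2 ** 2 - 3.0
```

With daily data, `period` would be 365, matching the `[365]` seasonal period discussed in this thread.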

jeremylweiss commented 1 year ago

With solar radiation, I think a theoretical model that calculates values based on latitude and day-of-year for a clear day (no clouds) would be the best first step. For QA/QC of station solar radiation values, our biggest concerns are dust/mud/bird crap/etc. partially obscuring the sensor and artificially lowering values on clear days, as well as sensor drift between calibration visits (typically every quarter). Being able to compare clear-day values between the theoretical model and station measurements would be valuable. Perhaps plotting a time series of that comparison on the 'AZMet QA/QC dashboard' would be good?
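One possible starting point for such a theoretical model, sketched in Python: the FAO-56 formulas for daily extraterrestrial radiation and a simple elevation-adjusted clear-sky estimate. The choice of FAO-56 is an assumption made here for illustration; the comment above does not name a specific model, and the function names and the `clear_sky_ratio` helper are hypothetical.

```python
import math

SOLAR_CONSTANT = 0.0820  # MJ m^-2 min^-1, as used in FAO-56


def extraterrestrial_radiation(latitude_deg, day_of_year):
    """Daily extraterrestrial radiation (MJ m^-2 day^-1) from latitude and day of year.

    Valid for latitudes where the sun rises and sets every day (not polar regions).
    """
    phi = math.radians(latitude_deg)
    angle = 2.0 * math.pi * day_of_year / 365.0
    dr = 1.0 + 0.033 * math.cos(angle)           # inverse relative Earth-Sun distance
    delta = 0.409 * math.sin(angle - 1.39)       # solar declination (rad)
    ws = math.acos(-math.tan(phi) * math.tan(delta))  # sunset hour angle (rad)
    return (24.0 * 60.0 / math.pi) * SOLAR_CONSTANT * dr * (
        ws * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(ws)
    )


def clear_sky_radiation(latitude_deg, day_of_year, elevation_m=0.0):
    """Clear-sky daily solar radiation (MJ m^-2 day^-1), FAO-56 elevation adjustment."""
    ra = extraterrestrial_radiation(latitude_deg, day_of_year)
    return (0.75 + 2e-5 * elevation_m) * ra


def clear_sky_ratio(measured, latitude_deg, day_of_year, elevation_m=0.0):
    """Hypothetical QA helper: measured daily total divided by the clear-sky estimate."""
    return measured / clear_sky_radiation(latitude_deg, day_of_year, elevation_m)
```

On days known to be clear, a ratio that sits persistently below 1 or drifts between calibration visits might point to an obscured or drifting sensor. Any threshold for flagging would need to be tuned against real station data.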

Aariq commented 1 year ago

"Seasonal naive" is equivalent to ARIMA(0,0,0)(0,1,0)[365] (i.e., an ARIMA with no autoregressive or moving-average terms, only a single seasonal difference). Seasonal naive seems to be as good as or better than what the auto ARIMA selects for temperature and solar radiation. Auto ARIMA also suggests something slightly different for different stations, so maybe seasonal naive is a good one-size-fits-all model to start with. Although, I'm surprised that an autoregressive component doesn't seem to improve the model (if it was hot yesterday, it should be hot today, right?).
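To spell out the equivalence in backshift notation (with $B y_t = y_{t-1}$ and $\varepsilon_t$ the error term; the symbols are introduced here for illustration and are not from the thread):

```latex
% ARIMA(0,0,0)(0,1,0)[365]: one seasonal difference, no AR or MA terms
(1 - B^{365})\, y_t = \varepsilon_t
\quad\Longleftrightarrow\quad
y_t = y_{t-365} + \varepsilon_t ,
\qquad
\hat{y}_{T+h} = y_{T+h-365} \quad (1 \le h \le 365).

% Adding a non-seasonal AR(1) term gives ARIMA(1,0,0)(0,1,0)[365]
(1 - \phi_1 B)(1 - B^{365})\, y_t = \varepsilon_t .
```

The second form is the kind of model one might expect to capture the "hot yesterday, hot today" effect, since $\phi_1$ acts on the seasonally differenced series.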

Aariq commented 1 year ago

Exploring using GAMs instead of ARIMA:

The benefits here would be:

1) The ability to use stations as a random effect so they could inform each other.
2) ARIMA assumes normally distributed residuals, which isn't the case for some of the variables. GAMs would allow different distributions, such as scaled t for leptokurtic residuals and even mixture distributions like zero-adjusted gamma (for precip) in the gamlss package (see its vignette).
3) It might even be faster??
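To make the zero-adjusted gamma idea concrete, here is a small Python sketch of its log-density: a point mass at zero (dry days) mixed with a gamma distribution for positive amounts. The `mu`/`sigma`/`nu` parameterization (mean, a dispersion parameter with shape `1/sigma**2`, and probability of zero) is meant to mirror how gamlss's ZAGA family is commonly described, but treat that correspondence as an assumption to check against the package documentation.

```python
import math


def zaga_logpdf(y, mu, sigma, nu):
    """Log-density of a zero-adjusted gamma distribution.

    nu    : probability that y is exactly 0 (e.g. a day with no precipitation)
    mu    : mean of the gamma part (y > 0)
    sigma : dispersion of the gamma part; shape = 1 / sigma**2, scale = mu * sigma**2
    """
    if y < 0:
        raise ValueError("y must be non-negative")
    if y == 0:
        return math.log(nu)
    shape = 1.0 / sigma ** 2
    scale = mu * sigma ** 2
    log_gamma_pdf = (
        (shape - 1.0) * math.log(y)
        - y / scale
        - shape * math.log(scale)
        - math.lgamma(shape)
    )
    return math.log1p(-nu) + log_gamma_pdf
```

Summing `zaga_logpdf` over observations gives a log-likelihood that handles exact zeros directly, which a normal-residual model cannot do.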

Aariq commented 1 year ago

There are plenty of options with GAMs to fit the data better, but it seems like GAMs are bad at forecasting.