USGS-R / mda.lakes

Wisconsin Lake Modeling Aggregation

2016-05-17 RMSE and trend fit relationship #91

Open lawinslow opened 8 years ago

lawinslow commented 8 years ago

Curious how important it is to hit a trend in getting a good RMSE (root mean square error). RMSE is commonly used to calibrate models, but we are often interested in their ability to re-create trends, not just the annual cycle.

Start with a simple sine wave. Random variation in the observations is normally distributed; here, start with an SD of zero. The data are scaled to roughly match water temperature ranges (between 0 and 20).

trend = 0.04 # in units per year, kinda like wtemp
freq = 26 # sample freq (samples per year)
year = (1:(30*freq))/freq # 30 years of sample times, in years
yobs = 10*sin(year*2*pi) + 10 + trend*year + rnorm(length(year), sd=0) # sd=0 means no observation noise yet

Then, we create a "model" that is also a sine wave with no noise, adjusted a bit to match the mean of the data, with no attempt to match the trend.

ymod = 10*sin(year*2*pi) + 10 + (trend*max(year)/2)

Over 30 years, it looks like this:

[Figure: sd0.0]

The title gives the calculated RMSE, the total change from start to end in the observations, and the observation noise standard deviation. Interestingly, this model would be "perfect" if the trend weren't present (RMSE would be zero), but the mismatch in trend adds ~0.3 to RMSE, a number we might still consider a pretty good RMSE.
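To make the ~0.3 figure concrete, here is a minimal sketch that recomputes it from the definitions above. The `rmse()` helper and the closed-form check are my own additions for illustration, not part of the original script: with a trend-free model centered on the mean, the residual is just a straight line over the 30-year span, so its RMSE should be close to `trend * span / sqrt(12)`.

```r
# Noise-free setup, same as above
trend <- 0.04
freq  <- 26
year  <- (1:(30 * freq)) / freq
yobs  <- 10 * sin(year * 2 * pi) + 10 + trend * year
ymod  <- 10 * sin(year * 2 * pi) + 10 + trend * max(year) / 2

# Hypothetical helper, defined here for illustration only
rmse <- function(obs, mod) sqrt(mean((obs - mod)^2))

rmse_sim <- rmse(yobs, ymod)

# Continuous approximation: a linear residual over a span T
# has standard deviation T / sqrt(12) times the slope
rmse_approx <- trend * max(year) / sqrt(12)

# Total change in the observations from first to last sample
total_change <- trend * (max(year) - min(year))

print(c(rmse_sim = rmse_sim, rmse_approx = rmse_approx, total_change = total_change))
```

Both RMSE values come out at roughly 0.35, against a total observed change of about 1.2 over the record.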

We have to add a fair bit of noise to the "observations" to get RMSE near the overall observed change.

[Figures: sd0.6, sd1.2]
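A rough sketch of that noise sweep, assuming the noise SD values 0, 0.6, and 1.2 from the figure names. The seed and the `rmse()` helper are arbitrary choices of mine. Since the noise is independent of the trend mismatch, the RMSE should land near `sqrt(mismatch^2 + sd^2)`, which is included as a sanity check.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

trend <- 0.04
freq  <- 26
year  <- (1:(30 * freq)) / freq
ymod  <- 10 * sin(year * 2 * pi) + 10 + trend * max(year) / 2

# Hypothetical helper, defined here for illustration only
rmse <- function(obs, mod) sqrt(mean((obs - mod)^2))

noise_sd <- c(0, 0.6, 1.2)

# RMSE of the trend-free model against observations at each noise level
rmse_by_sd <- sapply(noise_sd, function(s) {
  yobs <- 10 * sin(year * 2 * pi) + 10 + trend * year +
    rnorm(length(year), sd = s)
  rmse(yobs, ymod)
})

# Approximate expectation: trend mismatch and noise add in quadrature
expected <- sqrt((trend * max(year))^2 / 12 + noise_sd^2)

print(data.frame(noise_sd, rmse_by_sd, expected))
```

Only at a noise SD around 1.2 does the RMSE approach the total observed change of about 1.2.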

Now, with SD==1.2, if we change the model to hit the trend correctly, we don't shift the RMSE much at all.

[Figure: hittrend sd1.2]
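The comparison can be sketched like this. The names `ymod_flat` and `ymod_trend`, the seed, and the `rmse()` helper are mine and purely illustrative. The same noisy observations are scored against the model without the trend and the model that matches it.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

trend <- 0.04
freq  <- 26
year  <- (1:(30 * freq)) / freq

# Observations with noise SD of 1.2
yobs <- 10 * sin(year * 2 * pi) + 10 + trend * year +
  rnorm(length(year), sd = 1.2)

# Model without the trend (mean-adjusted) and model that hits the trend
ymod_flat  <- 10 * sin(year * 2 * pi) + 10 + trend * max(year) / 2
ymod_trend <- 10 * sin(year * 2 * pi) + 10 + trend * year

# Hypothetical helper, defined here for illustration only
rmse <- function(obs, mod) sqrt(mean((obs - mod)^2))

rmse_flat  <- rmse(yobs, ymod_flat)
rmse_trend <- rmse(yobs, ymod_trend)

print(c(rmse_flat = rmse_flat, rmse_trend = rmse_trend,
        improvement = rmse_flat - rmse_trend))
```

In expectation the two are about `sqrt(0.35^2 + 1.2^2)` and 1.2, so matching the trend exactly buys only around 0.05 of RMSE at this noise level.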

Interesting. RMSE is not a great way to understand our trend fits.
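One possible way to look at the trend fit directly, offered only as a sketch and not something the analysis above actually does: regress the residuals (observation minus model) on time with `lm()` and read off the slope. A model that misses the trend leaves a residual slope near the true trend, while one that hits it leaves a slope near zero, even though the two RMSE values are close. The variable names and seed are my own.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

trend <- 0.04
freq  <- 26
year  <- (1:(30 * freq)) / freq

yobs <- 10 * sin(year * 2 * pi) + 10 + trend * year +
  rnorm(length(year), sd = 1.2)

ymod_flat  <- 10 * sin(year * 2 * pi) + 10 + trend * max(year) / 2
ymod_trend <- 10 * sin(year * 2 * pi) + 10 + trend * year

# Slope of the residuals over time, in units per year
resid_slope <- function(obs, mod) {
  res <- obs - mod
  unname(coef(lm(res ~ year))["year"])
}

slope_flat  <- resid_slope(yobs, ymod_flat)
slope_trend <- resid_slope(yobs, ymod_trend)

print(c(slope_flat = slope_flat, slope_trend = slope_trend))
```

The two slopes differ by exactly the missed trend (0.04 per year), because the two models differ only by a straight line in `year`.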

jsta commented 7 years ago

I wondered if the magnitude of the data was masking an error improvement. My work-through of this is at https://gist.github.com/jsta/8643e4dc59dd0706dcf781b7f6776876