2015-06-30 Remove wind trend from NLDAS

lawinslow commented 9 years ago

How does RMSE and bias change when wind trend is removed from NLDAS data?

We first noticed this in captains log #64

Why are the black dots (NLDAS) increasing towards the end of the timeseries? Can we correct for that?

lawinslow commented 9 years ago

What do the data look like?

The full timeseries of NLDAS wind data aggregated across a bunch (100) of our lake sites.

It looks to be a step change, not a trend. The average annual values across our 1979-2012 range look like this

The change occurs between 2001 and 2002. Hard to visually see a difference in distribution

But we can see it when we fit the data to a weibull distribution.

> fitdist(subset(sp, year>2001)$WindSpeed, 'weibull')
Fitting of the distribution ' weibull ' by maximum likelihood 
Parameters:
      estimate Std. Error
shape 3.005518 0.03401985
scale 5.830320 0.03101341

> fitdist(subset(sp, year<2001)$WindSpeed, 'weibull')
Fitting of the distribution ' weibull ' by maximum likelihood 
Parameters:
      estimate Std. Error
shape 3.012133  0.0247867
scale 5.372733  0.0210710

You can see the difference in the scale factor.

How do we handle this

I propose we scale the wind data after 2001 using a multiplication factor. (Proposed value, 0.921, which is the quotient of the before and after 2001 scale factors).

I'm going to see how RMSE changes if we do this. Trip report later.

lawinslow commented 9 years ago

As point of interest, here is the weibull paramters for scaled wind speed from NTL-LTER airport met station.

library(fitdistrplus)
library(LakeMetabolizer)
> fitdist(head(wind.scale.base(wnd$wnd, 3), 30000), 'weibull')
Fitting of the distribution ' weibull ' by maximum likelihood 
Parameters:
      estimate  Std. Error
shape 1.636043 0.007673722
scale 3.690860 0.013661206

Lower than we get with NLDAS

jordansread commented 9 years ago

wow. I wonder if the NLDAS folks know about that step change. I will see if there is a forum or something.

lawinslow commented 9 years ago

Full cal run with NLDAS. Through 2012 (note, comparisons with downscaled climate models will differ as they are only compared through 1999 because of model time coverage)

Sim	RMSE	Bias
Standard NLDAS	2.584166	0.06114929
Step Fix NLDAS	2.553134	0.1457446

jordansread commented 9 years ago

If you are only scaling after 2001 and this is only pre-1999, will the numbers differ?

lawinslow commented 9 years ago

No. I'm running this through 2012.

lawinslow commented 9 years ago

Ok, the increase in bias suggests that perhaps the winds after 2001 are actually more characteristic than those before 2001 (though decrease RMSE suggests otherwise??).

Let's see what happens if we flip this correction and adjust the winds before 2001.

  bfr_2001 = nldas$time < as.POSIXct('2001-12-31')
  nldas$WindSpeed[bfr_2001] = nldas$WindSpeed[bfr_2001] * 1.0857

jordansread commented 9 years ago

wild. You running the inverse now?

lawinslow commented 9 years ago

Yup

lawinslow commented 9 years ago

Sim	RMSE	Bias
Standard NLDAS	2.584166	0.06114929
Step Fix NLDAS	2.553134	0.1457446
Pre-2001 step fix	2.612	-0.00242

Hmmm

jordansread commented 9 years ago

smaller bias, larger error. moving target

jordansread commented 9 years ago

@lawinslow just curious - when you do bias, do you get a different picture if you only look at the best 95% of the residuals? Just wondering if extreme outliers have much of an impact on the bias numbers.

lawinslow commented 9 years ago

Not really. For that last one, if you drop the outer 10%, RMSE goes to 2.37043

resids = na.omit(all_cal$Observed_wTemp - all_cal$Modeled_wTemp)
med = median(resids)
high = med + 2*(quantile(resids, 19/20, na.rm=TRUE) - med) 
low = med + 2* (quantile(resids, 1/20, na.rm=TRUE) - med)
inner_resids = resids[resids< high & resids > low] 
sqrt(mean(inner_resids^2, na.rm=TRUE))

jordansread commented 9 years ago

and bias isn't affected?

lawinslow commented 9 years ago

A little, but not much. -0.01318243 What's your thought process on this?

jordansread commented 9 years ago

I guess I was thinking that similar to how we dealt with the modeling errors before, we should toss out the really bad fits, so they don't steer us off. But, it doesn't look like there is enough of a change to make this something that should be applied.

That said, I assume the distribution fitting is pretty robust, but does it result in similar mean(wind^3) for the season? Since the power is the cube, those tails are really important, which is probably what you were seeing in the plot above with mean(wind) for the season. Do you feel confident that the shift is appropriate for the data?

lawinslow commented 9 years ago

Good question. I'll take a look.

lawinslow commented 9 years ago

Using this correction

after_2001 = nldas$time > as.POSIXct('2001-12-31')
nldas$WindSpeed[after_2001] = nldas$WindSpeed[after_2001] * 0.921

and then calculating annual averages of wind work (wind^3). Red is raw NLDAS. Blue is fixed.

Well behaved distribution correction. At least at the annual level.

jordansread commented 9 years ago

nice! curious how the sims turn out w/ this.

lawinslow commented 9 years ago

Based on NLDAS 1979-2012 sims

ME Trends Before wind correction

nldas grid sens me

ME Trends after correction

nldas windfix grid sens me

SP Trends before wind correction

nldas grid sens sp

SP Trends after correction

nldas windfix grid sens sp

SP Empirical Trends (since 1984)

grid yday empir sens sp

jordansread commented 9 years ago

@lawinslow my question earlier was about the bias - I thought the table above was from a different correction, which didn't do the scaling according to average wind work. did you have the bias and rmse numbers for that?

lawinslow commented 9 years ago

@jread-usgs How do you do the scaling according to average wind work?

jordansread commented 9 years ago

this: @lawinslow

lawinslow commented 9 years ago

Ok, I thought you meant there was some way to scale based on wind^3 (like (0.921*wind^3)^1/3 or something). Do you see the RMSE and bias values here?

jordansread commented 9 years ago

yes I see those, but I am asking if those apply to the 0.921 correction or the other corrections that seem to distribution based.

jordansread commented 9 years ago

I will look at this w/ NARR for WI

lawinslow commented 9 years ago

There's only one style of bias correction I've used (the multiplicative one, applied to either pre-2001 or post-2001 data). The distribution work was used to objectively calculate the 0.921 coefficient.

jordansread commented 9 years ago

ok, makes sense now.

USGS-R / mda.lakes