Open MrWindAndSolar opened 5 years ago
Thanks Justin. Let's go with proposal 1 for now. We can pursue one of the other options in the future if we or stakeholders determine that it's worthwhile to do better.
I opened #190 to close this following proposal 1, but I suggest slightly different limits:
@cwhanse did you follow a reference for the temperature and wind speed limits?
@wholmgren no, just put values in that I thought were reasonable bounds for the continental US.
I'm going to merge #190 but leave this issue open since it has good ideas for future improvements that someone, someday might be interested in implementing.
I was reading Livera et al (see below) for other reasons and noticed they used
I don't object. Nice to have a citation also.
After reading Cliff's PVSC paper, I had some ideas for improving the ranges that are being used to QC data that is being ingested into the system.
Current known limits
Temperature: System currently flags <-10C or >50C. -10C is well within the range of typical US temperatures. +50C is well into the tails.
Wind speed: Anything >60 m/s is flagged. This is very high as a sustained wind speed (category 4 hurricane).
Proposal 1
Use the following limits that represent what is typical across the entire US:
Temperature: -35C<T<+45C
Wind speed: One really needs to know what the averaging period is. The gust factor (difference between a sustained wind and instantaneous gust) is about 30% between 3 second gusts and 1 minute averages! Assuming 1 minute averages: Wsp<35 m/s
Other variables: Let me know if ranges are needed and I can provide.
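A minimal sketch of how the fixed limits in Proposal 1 could be applied. The function and constant names are hypothetical and not taken from the project's code. The lower wind speed bound of 0 m/s is an added assumption, since the proposal only gives an upper limit.

```python
# Proposal 1 limits as written above. All names here are hypothetical.
TEMP_LIMITS_C = (-35.0, 45.0)   # -35C < T < +45C
WIND_SPEED_MAX_MS = 35.0        # Wsp < 35 m/s, assuming 1 minute averages


def flag_temperature(temp_c):
    """Return True if the air temperature (C) falls outside the limits."""
    low, high = TEMP_LIMITS_C
    # NaN compares False to everything, so missing values are flagged too.
    return not (low < temp_c < high)


def flag_wind_speed(wsp_ms):
    """Return True if the wind speed (m/s) falls outside the limits.

    The lower bound of 0 is an assumption; the proposal gives only
    the upper bound.
    """
    return not (0.0 <= wsp_ms < WIND_SPEED_MAX_MS)
```

With these limits, a reading of -10C passes, while it would have been flagged under the current -10C lower bound once it dropped any further.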
Proposal 2
Seasonal limits:
Temperature:
Summer: -5C<T<45C
Spring and Fall: -20C<T<40C
Winter: -40C<T<30C
No seasonality needed for wind speed
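The seasonal limits could be looked up by month, roughly as sketched here. The proposal does not say how months map to seasons, so the mapping below (Northern Hemisphere meteorological seasons, with winter as December through February) is an assumption, and all names are hypothetical.

```python
# Seasonal temperature limits (C) from Proposal 2.
SEASONAL_TEMP_LIMITS_C = {
    "winter": (-40.0, 30.0),
    "spring": (-20.0, 40.0),
    "summer": (-5.0, 45.0),
    "fall": (-20.0, 40.0),
}


def season_for_month(month):
    """Map a month number (1-12) to a season name.

    Assumption: meteorological seasons for the Northern Hemisphere.
    """
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    if month in (9, 10, 11):
        return "fall"
    raise ValueError(f"month must be in 1..12, got {month!r}")


def flag_temperature_seasonal(temp_c, month):
    """Return True if temp_c is outside the limits for the month's season."""
    low, high = SEASONAL_TEMP_LIMITS_C[season_for_month(month)]
    return not (low < temp_c < high)
```

For example, -10C would be flagged in July but not in January, and 35C would be flagged in January but not in April.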
Proposal 3
Download the NOAA Local Climatological Data (LCD) from a grid of stations across the US (there are thousands so a reasonable grid can be assembled).
Determine the extremes for each field of interest at each station based on a 30 year record.
Store these in a database by lat-lon.
Write a simple routine to find the closest station(s) to a location that is being QC'ed.
For example, for temperature:
If T>extreme_max-abs(0.1*extreme_max) or T<extreme_min+abs(0.1*extreme_min) then flag as suspect
If T>extreme_max+abs(0.1*extreme_max) or T<extreme_min-abs(0.1*extreme_min) then flag as bad
The ranges and sensitivities used will vary by field. I don't recommend using this method for wind speed as it varies geographically in a way that won't work well. Fixed wind speed values likely will work better.
While this method requires more upfront work, it is beneficial in that a) fields with strong seasonality like temperature can be considered by month or season in a statistically valid way, b) the impact of geography on each field can be taken into account. For example, temperature data in Hawaii that is below 5C WILL get flagged.
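A rough sketch of the Proposal 3 lookup and the suspect/bad test for temperature. The station records below are made-up placeholders, not actual NOAA LCD values. The function names, the in-memory list standing in for the database, and the use of great-circle (haversine) distance to pick the closest station are all assumptions.

```python
import math

# Hypothetical station records with placeholder extremes (C).
STATIONS = [
    {"name": "station_a", "lat": 21.3, "lon": -157.9,
     "extreme_min": 11.0, "extreme_max": 35.0},
    {"name": "station_b", "lat": 44.9, "lon": -93.2,
     "extreme_min": -35.0, "extreme_max": 40.0},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat-lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_station(lat, lon, stations):
    """Return the station record closest to (lat, lon)."""
    return min(stations,
               key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))


def classify_temperature(temp_c, extreme_min, extreme_max, frac=0.1):
    """Return 'bad', 'suspect', or 'ok' using the 10% margins above."""
    max_margin = abs(frac * extreme_max)
    min_margin = abs(frac * extreme_min)
    # Check the wider 'bad' band first, since it implies 'suspect'.
    if temp_c > extreme_max + max_margin or temp_c < extreme_min - min_margin:
        return "bad"
    if temp_c > extreme_max - max_margin or temp_c < extreme_min + min_margin:
        return "suspect"
    return "ok"


def qc_temperature(temp_c, lat, lon, stations):
    """Classify temp_c against the extremes of the nearest station."""
    station = nearest_station(lat, lon, stations)
    return classify_temperature(
        temp_c, station["extreme_min"], station["extreme_max"])
```

With these placeholder records, a 4C reading near station_a is classified as bad, while the same reading near station_b is ok. Because the margins scale with the extremes themselves, they shrink toward zero when an extreme is close to 0C, which may be worth keeping in mind when choosing sensitivities per field.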
Proposal 4
Limits by region. I can provide additional input if you'd like to go this route.