SolarArbiter / solarforecastarbiter-core

Core data gathering, validation, processing, and reporting package for the Solar Forecast Arbiter
https://solarforecastarbiter-core.readthedocs.io
MIT License

check_ghi_clearsky too sensitive for subhourly data #123

Open wholmgren opened 5 years ago

wholmgren commented 5 years ago

validator.check_ghi_clearsky flags GHI values that are greater than 1.1 times the clear sky expectation. Short over-irradiance events larger than this are normal under partly cloudy skies, which makes the flag unhelpful for detecting truly problematic data. A few options:

  1. Just raise the limit.
  2. Make a function or table that sets the limit based on the interval length.

We should probably do the same thing for the POA function.

Also the return section of the docstring should probably read something like "True if ghi is greater than kt_max times the clear sky value."
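For illustration only, here is a minimal sketch of a check whose docstring follows that wording. The name `exceeds_clearsky` and the list-based interface are hypothetical and are not the package's actual API; only the default `kt_max=1.1` comes from the description above.

```python
def exceeds_clearsky(ghi, ghi_clearsky, kt_max=1.1):
    """Compare measured GHI against a scaled clear-sky expectation.

    Hypothetical helper for illustration; not the package's function.

    Returns
    -------
    list of bool
        True if ghi is greater than kt_max times the clear sky value.
    """
    return [g > kt_max * c for g, c in zip(ghi, ghi_clearsky)]


# Made-up values in W/m^2, chosen only to exercise both outcomes.
ghi = [500.0, 850.0, 1000.0]
ghi_clearsky = [800.0, 800.0, 800.0]
flags = exceeds_clearsky(ghi, ghi_clearsky)
```

With these made-up inputs only the last value exceeds 1.1 times the clear-sky value, so only it is flagged.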

cwhanse commented 5 years ago

There are some papers that present statistics on over-irradiance events, both magnitude and duration. I don't think there's a nicely presented figure that relates these two quantities (magnitude as a function of duration), but something reasonable could be extracted from the papers, I believe.

Before I start chasing that down, I want to be sure that we agree on the intent of this check function: is it to check for reasonable measured irradiance values, or is it to flag irradiance values that should be excluded from comparison with a forecast? To me, it's a stretch to expect a forecast to predict over-irradiance. But I see the point about labeling reasonable measurements.

wholmgren commented 5 years ago

Good question. I thought it was to perform a check for reasonableness. Persistence predicts over-irradiance, so I think it's fair game. But I'm open to discussion. What's the most useful way to look at this function in the context of #196? A check for reasonableness for intervals longer than 15 minutes? Something else?

cwhanse commented 5 years ago

Overirradiance rarely lasts more than 2 min (e.g. Figures 7 and 8 here), and it is also very localized (within a km). The current function's check for reasonableness is suitable for intervals of 5 min, and maybe a little strict for shorter intervals. We could relax the threshold to kmax=1.2. We could implement kmax as a function of time interval.
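One possible shape for a limit that depends on the time interval is sketched below. The function name `kt_max_for_interval`, the 5 minute cutoff, and the pairing of 1.2 with shorter intervals are assumptions taken loosely from the numbers mentioned in this comment; they are placeholders, not values derived from the literature.

```python
def kt_max_for_interval(interval_minutes, cutoff_minutes=5,
                        short_limit=1.2, long_limit=1.1):
    """Return a clear-sky index limit for a given interval length.

    Hypothetical sketch: intervals shorter than cutoff_minutes get the
    looser short_limit, all others get long_limit. The defaults are
    placeholders and would need to be replaced by vetted values.
    """
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    if interval_minutes < cutoff_minutes:
        return short_limit
    return long_limit


# Limits for 1, 5, and 60 minute data under the placeholder defaults.
limits = [kt_max_for_interval(m) for m in (1, 5, 60)]
```

A table extracted from published over-irradiance statistics could later replace the single cutoff with several steps without changing how callers use the function.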

wholmgren commented 5 years ago

On second thought, we already have check_irradiance_limits_QCRad and associated functions to check for physical limits, so maybe we should just leave this function alone.

cwhanse commented 5 years ago

check_irradiance_limits_QCRad will flag GHI > extraterrestrial, so it will flag over-irradiance. Maybe this issue is more about workflow (when to use check_ghi_clearsky) than what check_ghi_clearsky does.

wholmgren commented 4 years ago

Did we decide how to resolve this issue on a call? I don't remember and don't see it in any notes.

It seems to me that the function (perhaps combined with other calling functions) should check for physical reasonableness for data with any interval_length between 1 and 60 minutes. That could require a solution similar to #124.

Or we can close this issue and make a new one about a new flag to assess physical reasonableness that is more strict than single-component QCRad.

cwhanse commented 4 years ago

I don't think we have resolved it. The idea of a limit based on interval_length is a good one. My offer to pull together a table from the literature is still open.