pvlib / pvlib-python

A set of documented functions for simulating the performance of photovoltaic energy systems.
https://pvlib-python.readthedocs.io
BSD 3-Clause "New" or "Revised" License

pvlib.clearsky.detect_clearsky ValueError: index can't contain negative values #1209

Closed kurt-rhee closed 3 years ago

kurt-rhee commented 3 years ago

Describe the bug Hello, I am running pvlib.clearsky.detect_clearsky on measured data from a meteorological station together with the output of pvlib.clearsky.simplified_solis, and the call raises ValueError: index can't contain negative values.

To Reproduce Steps to reproduce the behavior:

meas_ghi.head()
TIMESTAMP
2016-06-22 04:00:00-08:00      6.352319
2016-06-22 05:00:00-08:00    133.270304
2016-06-22 06:00:00-08:00    323.471665
2016-06-22 07:00:00-08:00    510.011417
2016-06-22 08:00:00-08:00    699.448683

Freq: H, Name: GHI_SR20_TC_Avg, dtype: float64
c_sky_ghi.head()
TIMESTAMP
2016-06-22 04:00:00-08:00      0.000000
2016-06-22 05:00:00-08:00     50.632885
2016-06-22 06:00:00-08:00    238.667713
2016-06-22 07:00:00-08:00    446.367962
2016-06-22 08:00:00-08:00    643.812969

Freq: H, Name: ghi, dtype: float64
c_sky_ghi.head().index
DatetimeIndex(['2016-06-22 04:00:00-08:00', '2016-06-22 05:00:00-08:00',
               '2016-06-22 06:00:00-08:00', '2016-06-22 07:00:00-08:00',
               '2016-06-22 08:00:00-08:00'],
              dtype='datetime64[ns, Etc/GMT+8]', name='TIMESTAMP', freq='H')

detect_c_sky = pvlib.clearsky.detect_clearsky(
    measured=meas_ghi.head(),
    clearsky=c_sky_ghi.head(),
    times=c_sky_ghi.head().index,
    window_length=3,
    mean_diff=50,
    max_diff=50,
    lower_line_length=5,
    upper_line_length=10,
    var_diff=0.005,
    slope_dev=8,
    max_iterations=1,
    return_components=False
    )

Expected behavior I would expect not to get a ValueError; I don't believe any of my indices contain negative values.


Versions:

kandersolar commented 3 years ago

Hi @kurt-rhee, the window_length value you're using is the problem. The docstring describes it as "Length of sliding time window in minutes. Must be greater than 2 periods", so this call asks for a 3-minute window on hourly data, which probably isn't what you want.
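To make the arithmetic concrete, here is a minimal sketch of how a window length in minutes could translate into a number of samples. It assumes the sample count is obtained by truncating window_length divided by the sampling interval; the helper name samples_in_window is hypothetical and is not part of pvlib.

```python
from datetime import timedelta


def samples_in_window(window_length_minutes, sample_interval):
    """Hypothetical helper: whole samples that fit in one window.

    Assumes the count is the truncated ratio of the window length
    (minutes) to the sampling interval (minutes).
    """
    interval_minutes = sample_interval.total_seconds() / 60
    return int(window_length_minutes / interval_minutes)


hourly = timedelta(hours=1)

# 3-minute window on hourly data: no complete sample fits.
print(samples_in_window(3, hourly))  # 0

# 180-minute window on hourly data: three samples, i.e. more than 2 periods.
print(samples_in_window(180, hourly))  # 3

# 10-minute window on 1-minute data.
print(samples_in_window(10, timedelta(minutes=1)))  # 10
```

Under that assumption, hourly data would need a window_length above 120 minutes to cover more than 2 periods.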

Maybe we should add a check on the output of _get_sample_intervals and raise a more helpful error if samples_per_window == 0.
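A rough sketch of what such a check might look like, written as a standalone function rather than as a patch to pvlib. The name check_window_length, the error wording, and the way the sample count is computed are all assumptions made for illustration.

```python
from datetime import timedelta


def check_window_length(window_length_minutes, sample_interval):
    """Hypothetical validation: fail early if the window holds no samples.

    Returns the number of whole samples per window, assuming it is the
    truncated ratio of window length to sampling interval (both in minutes).
    """
    interval_minutes = sample_interval.total_seconds() / 60
    samples_per_window = int(window_length_minutes / interval_minutes)
    if samples_per_window == 0:
        raise ValueError(
            f"window_length={window_length_minutes} minutes is shorter than "
            f"the data interval of {interval_minutes:g} minutes; "
            "window_length is given in minutes and must span more than "
            "2 periods"
        )
    return samples_per_window


try:
    check_window_length(3, timedelta(hours=1))
except ValueError as err:
    print(f"ValueError: {err}")
```

Failing at this point would name the parameter that needs changing, instead of surfacing later as an indexing error.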

cwhanse commented 3 years ago

@kurt-rhee the detect_clearsky algorithm was developed for data at a frequency of about 1 minute. Besides the window_length issue, I doubt the default thresholds will give satisfactory results for hourly averaged data.

If the data are point-in-time values on each timestamp (rather than averaged over each interval), this paper may help with choosing appropriate thresholds.

> Maybe we should add a check on the output of _get_sample_intervals and raise a more helpful error if samples_per_window == 0.

+1

kurt-rhee commented 3 years ago

Understood, thank you so much for your quick response. That is very helpful.