pvlib / pvlib-python

A set of documented functions for simulating the performance of photovoltaic energy systems.
https://pvlib-python.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Improve error handling in clearsky._interpolate_turbidity #645

Closed: cwhanse closed this issue 2 weeks ago

cwhanse commented 5 years ago

Relevant to #403

When the latitude and longitude arguments are not floats but, e.g., a singleton numpy.array, the user gets an error from numpy.interp. Error detection and messaging could be improved.

Describe the solution you'd like

Add a try/except block around _interpolate_turbidity, with an error message suggesting that the user check the types of latitude and longitude.

wholmgren commented 5 years ago

The lookup_linke_turbidity function also fails for lat/lon 1-D array input if interp_turbidity=False:

        if interp_turbidity:
            linke_turbidity = _interpolate_turbidity(lts, time)
        else:
            months = time.month - 1
>           linke_turbidity = pd.Series(lts[months], index=time)
E           IndexError: index 5 is out of bounds for axis 0 with size 1

So if we want to handle non-scalar input then it's not enough to handle it around _interpolate_turbidity.
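To see where the size-1 axis in that traceback may come from, here is a minimal sketch that uses a plain numpy array as a stand-in for the LinkeTurbidity table. The real table is read through PyTables, so this only mirrors numpy indexing semantics; the table shape and index values are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the turbidity table: (n_lat, n_lon, 12 months).
table = np.zeros((4, 8, 12))

# Scalar indices select a single 12-month vector.
lts_scalar = table[1, 2, :]
print(lts_scalar.shape)   # (12,)

# 1-D, length-1 index arrays keep a leading axis of size 1.
lts_1d = table[np.array([1]), np.array([2]), :]
print(lts_1d.shape)       # (1, 12)

# Zero-based month numbers (June) for three timestamps.
months = np.array([5, 5, 5])
print(lts_scalar[months].shape)   # (3,)

try:
    lts_1d[months]
except IndexError as err:
    print(err)   # index 5 is out of bounds for axis 0 with size 1
```

With scalar indices the month lookup runs along the 12-element axis. With 1-D indices it runs along the leading size-1 axis instead, which is consistent with the IndexError shown above.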

It is important to be clear that scalar np.array(latitude) is ok but 1-D np.array([latitude]) is not. Demonstrated here:

@requires_tables
@pytest.mark.parametrize('lat,lon,interp_turbidity', [
    (32.125, -110.875, True),
    (32.125, -110.875, False),
    (np.array(32.125), np.array(-110.875), True),
    (np.array(32.125), np.array(-110.875), False),
    pytest.param(np.array([32.125]), np.array([-110.875]), True,
                 marks=pytest.mark.xfail(strict=True)),
    pytest.param(np.array([32.125]), np.array([-110.875]), False,
                 marks=pytest.mark.xfail(strict=True)),
])
def test_lookup_linke_turbidity(lat, lon, interp_turbidity):
    times = pd.date_range(start='2014-06-24', end='2014-06-25',
                          freq='12h', tz='America/Phoenix')
    if interp_turbidity:
        # expect same value on 2014-06-24 0000 and 1200, and
        # diff value on 2014-06-25
        expected = [3.11803278689, 3.11803278689, 3.13114754098]
    else:
        expected = [3., 3., 3.]
    expected = pd.Series(expected, index=times)
    out = clearsky.lookup_linke_turbidity(times, lat, lon,
                                          interp_turbidity=interp_turbidity)
    assert_series_equal(expected, out)

$ pytest pvlib/test/test_clearsky.py::test_lookup_linke_turbidity --pdb
platform darwin -- Python 3.7.2, pytest-4.2.0, py-1.7.0, pluggy-0.8.1
rootdir: /Users/holmgren/git_repos/pvlib-python, inifile:
plugins: xdist-1.22.5, mock-1.10.0, forked-0.2, cov-2.6.1
collected 6 items

pvlib/test/test_clearsky.py ....xx

One option could be checking here:

    lt_h5_file = tables.open_file(filepath)
    try:
        if np.array(latitude_index).ndim > 0 or np.array(longitude_index).ndim > 0:
            raise IndexError
        lts = lt_h5_file.root.LinkeTurbidity[latitude_index,
                                             longitude_index, :]
    except IndexError:
        raise IndexError('Latitude must be scalar between 90 and -90, '
                         'longitude between -180 and 180.')
    finally:
        lt_h5_file.close()
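The same ndim check could also be pulled out into a small standalone helper. The following is only a sketch: require_scalar is a hypothetical name, not part of pvlib, and raising TypeError rather than IndexError is an assumption about what message would be most useful to the user.

```python
import numpy as np


def require_scalar(value, name):
    """Return value as a float, or raise if it is not scalar-like.

    Hypothetical helper for illustration; not part of pvlib.
    """
    # np.ndim is 0 for Python floats and for 0-d arrays such as
    # np.array(32.125), and 1 for np.array([32.125]).
    if np.ndim(value) != 0:
        raise TypeError(
            f"{name} must be a scalar, got an array with shape "
            f"{np.shape(value)}")
    return float(value)


print(require_scalar(32.125, "latitude"))               # 32.125
print(require_scalar(np.array(-110.875), "longitude"))  # -110.875

try:
    require_scalar(np.array([32.125]), "latitude")
except TypeError as err:
    print(err)   # latitude must be a scalar, got an array with shape (1,)
```

Checking up front like this keeps the scalar and 0-d cases working while turning the 1-D case into an explicit error, instead of relying on a downstream failure.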

wholmgren commented 5 years ago

I'm also ok with closing and moving on.

cwhanse commented 2 weeks ago

Advancing the minimum numpy version appears to have addressed the behavior that raised the issue (different results for float, numpy singleton, and numpy 1D array). This code now runs without error:

import numpy as np
import pandas as pd
from pvlib import clearsky

test_data = [
    (32.125, -110.875, True),
    (32.125, -110.875, False),
    (np.array(32.125), np.array(-110.875), True),
    (np.array(32.125), np.array(-110.875), False),
    (np.array([32.125]), np.array([-110.875]), True),
    (np.array([32.125]), np.array([-110.875]), False),
    ]

times = pd.date_range(start='2014-06-24', end='2014-06-25',
                      freq='12h', tz='America/Phoenix')
for data in test_data:
    lat, lon, interp_turbidity = data

    if interp_turbidity:
        # expect same value on 2014-06-24 0000 and 1200, and
        # diff value on 2014-06-25
        expected = [3.11803278689, 3.11803278689, 3.13114754098]
    else:
        expected = [3., 3., 3.]
    expected = pd.Series(expected, index=times)
    out = clearsky.lookup_linke_turbidity(times, lat, lon,
                                          interp_turbidity=interp_turbidity)

Closing with the assumption that it's no longer an issue.