Closed kandersolar closed 1 year ago
It seems the error happens in the selection of the data in the variable data_9am_12pm
:
https://github.com/pvlib/pvlib-python/blob/5b059143cd16f1d3670a7b4dcc931f1f356a497e/pvlib/tests/iotools/test_ecmwf_macc.py#L70-L73
Which has the following output:
tcwv aod550 precipitable_water
2012-11-01 06:00:00 22.755631 0.223713 2.275563
2012-11-01 09:00:00 19.378843 0.183731 1.937884
2012-11-01 12:00:00 16.191863 0.150101 1.619186
2012-11-01 15:00:00 13.319903 0.130809 1.33199
Only the two middle values (9 am and 12 pm) should have been selected. However, somehow the values before and after are also included, and thus the comparison error happens (as 4 values are compared to 2 values).
I've tracked it down to this change in cftime 1.6.0: https://github.com/Unidata/cftime/pull/273
The difference, as @AdamRJensen notes, is that the behavior of select='before'
and select='after'
have changed; in cftime<1.6.0
they return the specified times if they exist, and the closest preceding/following times otherwise, while in cftime>=1.6.0
they always return the closest preceding/following times. Here are the relevant lines:
https://github.com/pvlib/pvlib-python/blob/55e05776877416fb2bcc43eeee4119b8b0504942/pvlib/iotools/ecmwf_macc.py#L296-L299
I'm not sure what is best for us to do. Switching to select='nearest'
might make sense. Feedback from someone that actually uses iotools.read_ecmwf_macc
would be good!
I doubt anyone really uses this function... it doesn't really seem so from the search results. I could be convinced to remove it.
@mikofski are you still using this function? What was the use case? I imagine that MERRA-2 or ERA5 would be better sources of data these days.
MACC Reanalysis also seems rather outdated based on this link:
The Monitoring Atmospheric Composition and Climate (MACC) Reanalysis is a global reanalysis data set of atmospheric composition (AC), made available by the Copernicus Atmosphere Monitoring Service (CAMS). The dataset covers the period 2003 to 2012.
I am with removing it. I used it to retrieve AOD and precipitable water from the CAMS McClear model, but I agree I think it has been archived. Those dates are correct, so it is over 10 years old by now. (Although I think our Linke turbidity data from SoDa is also that old or older.)
Describe the bug @mdeceglie and I observe
pvlib/tests/iotools/test_ecmwf_macc.py::test_read_ecmwf_macc
to fail locally withValueError: operands could not be broadcast together with shapes (4,) (2,)
.To Reproduce Steps to reproduce the behavior:
conda create -n ecmwf39 python=3.9
conda activate ecmwf39
pip install pvlib[all]
pytest pvlib\tests\iotools\test_ecmwf_macc.py
pytest output
```python pytest pvlib\tests\iotools\test_ecmwf_macc.py ================================================= test session starts ================================================= platform win32 -- Python 3.9.15, pytest-7.2.0, pluggy-1.0.0 rootdir: C:\Users\KANDERSO\projects\pvlib-python plugins: cov-4.0.0, mock-3.10.0, remotedata-0.3.3, rerunfailures-10.3, timeout-2.1.0, requests-mock-1.10.0 collected 3 items pvlib\tests\iotools\test_ecmwf_macc.py ..F [100%] ====================================================== FAILURES ======================================================= ________________________________________________ test_read_ecmwf_macc _________________________________________________ expected_test_data = WindowsPath('C:/Users/KANDERSO/projects/pvlib-python/pvlib/data/aod550_tcwv_20121101_test.nc') @requires_netCDF4 def test_read_ecmwf_macc(expected_test_data): """Test reading ECMWF_MACC data from netCDF4 file.""" data = ecmwf_macc.read_ecmwf_macc( expected_test_data, 38, -122) expected_times = [ 1351738800, 1351749600, 1351760400, 1351771200, 1351782000, 1351792800, 1351803600, 1351814400] assert np.allclose(data.index.view(np.int64) // 1000000000, expected_times) expected_aod = np.array([ 0.39531226, 0.22371339, 0.18373083, 0.15010143, 0.130809, 0.11172834, 0.09741255, 0.0921606]) expected_tcwv = np.array([ 26.56172238, 22.75563109, 19.37884304, 16.19186269, 13.31990346, 11.65635338, 10.94879802, 10.55725756]) assert np.allclose(data.aod550.values, expected_aod) assert np.allclose(data.tcwv.values, expected_tcwv) assert np.allclose(data.precipitable_water.values, expected_tcwv / 10.0) datetimes = (datetime.datetime(2012, 11, 1, 9, 0, 0), datetime.datetime(2012, 11, 1, 12, 0, 0)) data_9am_12pm = ecmwf_macc.read_ecmwf_macc( expected_test_data, 38, -122, datetimes) > assert np.allclose(data_9am_12pm.aod550.values, expected_aod[2:4]) pvlib\tests\iotools\test_ecmwf_macc.py:74: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ <__array_function__ internals>:180: in allclose ??? ..\..\Software\Anaconda3\envs\ecmwf39\lib\site-packages\numpy\core\numeric.py:2265: in allclose res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan)) <__array_function__ internals>:180: in isclose ??? ..\..\Software\Anaconda3\envs\ecmwf39\lib\site-packages\numpy\core\numeric.py:2375: in isclose return within_tol(x, y, atol, rtol) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ x = array([0.22371339, 0.18373083, 0.15010143, 0.130809 ]), y = array([0.18373083, 0.15010143]), atol = 1e-08 rtol = 1e-05 def within_tol(x, y, atol, rtol): with errstate(invalid='ignore'): > return less_equal(abs(x-y), atol + rtol * abs(y)) E ValueError: operands could not be broadcast together with shapes (4,) (2,) ..\..\Software\Anaconda3\envs\ecmwf39\lib\site-packages\numpy\core\numeric.py:2356: ValueError =============================================== short test summary info =============================================== FAILED pvlib/tests/iotools/test_ecmwf_macc.py::test_read_ecmwf_macc - ValueError: operands could not be broadcast together with shapes (4,) (2,) ============================================= 1 failed, 2 passed in 0.73s ============================================= ```Expected behavior I expect tests to pass. I also expect local test results to be consistent with CI test results.
Versions:
pvlib.__version__
: 0.9.3pandas.__version__
: 1.5.2Additional context The test failure is with
netCDF4==1.6.2
from PyPI. Reinstalling it in the environment created above but from conda-forge (pip uninstall netCDF4; conda install -c conda-forge netCDF4
) results in the test being skipped. At least in my Windows environment, this is because of some incompatibility betweenh5py
andnetCDF4
causingimport h5py, netCDF4
to raiseImportError
butimport netCDF4
to be fine.