monocongo / climate_indices

Climate indices for drought monitoring
https://monocongo.github.io/climate_indices/

IndexError encountered when computing SPI #527

Open jessefriend opened 1 year ago

jessefriend commented 1 year ago

Describe the bug

I encountered an "IndexError: index 159 is out of bounds for axis 1 with size 159" when attempting to use the SPI computation function from the climate-indices Python package.

To Reproduce

Steps to reproduce the behavior:

  1. Installed the climate-indices package in a conda environment.

  2. Ran the following command: spi --periodicity monthly --scales 1 2 3 6 9 12 24 36 48 --calibration_start_year 1981 --calibration_end_year 2023 --netcdf_precip /path/to/my/netcdf/precip_data.nc --var_name_precip precip --output_file_base /path/to/my/output/CHIRPS --multiprocessing all --save_params /path/to/my/output/CHIRPS_fitting.nc --overwrite

  3. Encountered the following error: IndexError: index 159 is out of bounds for axis 1 with size 159

The full traceback is as follows:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/climate_indices/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/user/miniconda3/envs/climate_indices/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/user/miniconda3/envs/climate_indices/lib/python3.7/site-packages/climate_indices/__spi__.py", line 1212, in _apply_to_subarray_gamma
    periodicity=args["periodicity"],
IndexError: index 159 is out of bounds for axis 1 with size 159
"""
  File "climate_indices/__spi__.py", line 1502, in main
    _compute_write_index(kwrgs)
  File "climate_indices/__spi__.py", line 700, in _compute_write_index
    args=args,
  File "climate_indices/__spi__.py", line 1007, in _parallel_fitting
    pool.map(_apply_to_subarray_gamma, chunk_params)
  File "multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "multiprocessing/pool.py", line 657, in get
    raise self._value
IndexError: index 159 is out of bounds for axis 1 with size 159

Expected behavior

I expected the climate-indices package to compute the SPI without any issues.

Additional Context I was following this guideline on computing SPI using CHIRPS data.

monocongo commented 1 year ago

Hi @jessefriend thanks for this error report.

Could you please re-install the climate_indices package from this development branch and see whether that fixes your issue? https://github.com/monocongo/climate_indices/tree/issue_522_pyproject_poetry

Also, if you can post a link to the dataset used for /path/to/my/netcdf/precip_data.nc in the command listed above, I can hopefully use it to reproduce the error.

jessefriend commented 1 year ago

Hi @monocongo, thanks for the fast feedback.

I tried running it again with the development branch and still ran into the same issue.

Here is a link to the dataset on WeTransfer: https://we.tl/t-3LBTVdu77D

Here is a description of it:

<class 'netCDF4._netCDF4.Variable'>
float32 time(time)
    units: days since 1981-1-1 00:00:00
    standard_name: time
    calendar: gregorian
    axis: T
unlimited dimensions: time
current shape = (509,)
filling on, default _FillValue of 9.969209968386869e+36 used

<class 'netCDF4._netCDF4.Variable'>
float32 lon(lon)
    units: degrees_east
    standard_name: longitude
    long_name: longitude
    axis: X
unlimited dimensions:
current shape = (159,)
filling on, default _FillValue of 9.969209968386869e+36 used

<class 'netCDF4._netCDF4.Variable'>
float32 lat(lat)
    units: degrees_north
    standard_name: latitude
    long_name: latitude
    axis: Y
unlimited dimensions:
current shape = (186,)
filling on, default _FillValue of 9.969209968386869e+36 used

<class 'netCDF4._netCDF4.Variable'>
int32 crs()
    long_name: Lon/Lat Coords in WGS84
    grid_mapping_name: latitude_longitude
    longitude_of_prime_meridian: 0.0
    semi_major_axis: 6378137.0
    inverse_flattening: 298.257223563
unlimited dimensions:
current shape = ()
filling on, default _FillValue of -2147483647 used

<class 'netCDF4._netCDF4.Variable'>
float32 precip(time, lat, lon)
    _FillValue: -9999.0
    units: mm
    standard_name: convective precipitation rate
    long_name: Climate Hazards group InfraRed Precipitation with Stations
    time_step: dekad
    missing_value: -9999.0
    geospatial_lat_min: -4.675
    geospatial_lat_max: 4.62
    geospatial_lon_min: 33.89
    geospatial_lon_max: 41.85
    grid_mapping: crs
unlimited dimensions: time
current shape = (509, 186, 159)
filling on

iferencik commented 1 year ago

Hello @monocongo ,

I have looked into the same issue (I am a colleague of @jessefriend) and I realized that the code expects the climatology dataset (CHIRPS) to have these dimensions

   expected_dims_3d_climate = {"lat", "lon", "time"}

here https://github.com/monocongo/climate_indices/blob/a7ee9efc547b70a48809a0e421cbb54f485ff3f8/src/climate_indices/__spi__.py#L263

next, the code

https://github.com/monocongo/climate_indices/blob/a7ee9efc547b70a48809a0e421cbb54f485ff3f8/src/climate_indices/__spi__.py#L276-L280

did not catch this difference

a =  {'lat', 'lon', 'time'}
b =  {'time', 'lon', 'lat'}
a ==  b
True

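while something like this, run on the ordered dimension tuples rather than on sets, would

a = ('lat', 'lon', 'time')
b = ('time', 'lat', 'lon')
for x, y in zip(a, b):
    assert x == y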

hope this helps
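The difference between the two checks can be sketched in plain Python. This is only an illustration, not the package's code: the function names and the `EXPECTED_DIMS_3D` constant are hypothetical, and the expected order (lat, lon, time) is assumed from the discussion in this thread.

```python
from typing import Sequence

# Assumed expected order, taken from the discussion in this thread.
EXPECTED_DIMS_3D = ("lat", "lon", "time")


def dims_match_as_sets(actual: Sequence[str], expected: Sequence[str]) -> bool:
    """Order-insensitive check: only verifies that the same names are present."""
    return set(actual) == set(expected)


def dims_match_in_order(actual: Sequence[str], expected: Sequence[str]) -> bool:
    """Order-sensitive check: same names in the same positions."""
    return tuple(actual) == tuple(expected)


# Dimension order reported for the precip variable in the dataset above.
chirps_dims = ("time", "lat", "lon")

print(dims_match_as_sets(chirps_dims, EXPECTED_DIMS_3D))   # True, mismatch goes unnoticed
print(dims_match_in_order(chirps_dims, EXPECTED_DIMS_3D))  # False, mismatch is caught
```

Comparing tuples directly also catches a length mismatch, which a `zip` loop would silently truncate.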

jessefriend commented 1 year ago

I can confirm the issue went away once I reordered the data dimensions within the netCDF file to (lat, lon, time).
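For anyone hitting the same error, here is a rough sketch of that reordering on a plain NumPy array. The helper `reorder_dims` is hypothetical and not part of climate_indices, and the small zero-filled array only stands in for the real (509, 186, 159) precip data. With xarray, `Dataset.transpose("lat", "lon", "time")` should achieve the same thing before writing the file back out.

```python
import numpy as np


def reorder_dims(values, dims, target_dims):
    """Return `values` with its axes permuted from `dims` order to `target_dims` order.

    Hypothetical helper for illustration; `dims` names the current axes of `values`.
    """
    dims = tuple(dims)
    target_dims = tuple(target_dims)
    if len(dims) != values.ndim:
        raise ValueError("dims must name every axis of values")
    if sorted(dims) != sorted(target_dims):
        raise ValueError(f"dimension names differ: {dims} vs {target_dims}")
    # For each target position, look up which current axis holds that name.
    axes = [dims.index(name) for name in target_dims]
    return np.transpose(values, axes)


# Small stand-in array laid out as (time, lat, lon), like the dataset above.
precip = np.zeros((5, 4, 3), dtype=np.float32)
reordered = reorder_dims(precip, ("time", "lat", "lon"), ("lat", "lon", "time"))
print(reordered.shape)  # (4, 3, 5)
```

`np.transpose` returns a view, so the result still has to be written to a new netCDF variable with the matching dimension order for the command-line tool to pick it up.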

monocongo commented 1 year ago

@iferencik You have found a sleeper bug, thank you! We're comparing the two as sets, which have no order, but the order is important, as in this case. I will leave this issue open for now as a reminder to fix this.

@jessefriend Thanks for your fast follow-up to confirm that this is fixed for you now. Getting the data cleaned and ready for processing is tricky, and I had lots of issues trying to handle that for users by including some wrangling in the processing scripts, but this proved to be problematic since it used NCO and that package is not well-supported on Windows.