monocongo / climate_indices

Climate indices for drought monitoring
https://monocongo.github.io/climate_indices/

ValueError: the first argument to .drop_sel must be a dictionary #495

Open mjh1366 opened 1 year ago

mjh1366 commented 1 year ago

I'm using the PyPI version of climate-indices with one of the input datasets shipped with the package: nclimgrid_lowres_prcp.nc

I'm getting this error: ValueError: the first argument to .drop_sel must be a dictionary

Describe the bug

    raise ValueError(f"the first argument to .{func_name} must be a dictionary")
    ValueError: the first argument to .drop_sel must be a dictionary

To Reproduce

Steps to reproduce the behavior:

1. conda create -n indices_env
2. activate indices_env
3. pip install climate-indices
4. pip install -r requirements.txt
5. pip-compile -o requirements.txt pyproject.toml
6. conda install -c conda-forge nco
7. pip install netcdf4
8. pip install h5netcdf
9. (env1) E:\climate\climate_indices-master\input>process_climate_indices --index spi --periodicity monthly --netcdf_precip ./nclimgrid_lowres_prcp.nc --var_name_precip prcp --output_file_base ../output/nclimgrid_lowres --scales 6 12 --calibration_start_year 1951 --calibration_end_year 2010 --multiprocessing all
    2023-01-10  19:53:33 INFO Start time:    2023-01-10 19:53:33.396280
    2023-01-10  19:53:33 INFO Computing 6-month SPI/Pearson
    2023-01-10  19:53:33 ERROR Failed to complete
    Traceback (most recent call last):
    File "C:\Program Files\Python310\lib\site-packages\climate_indices\__main__.py", line 1708, in main
    _compute_write_index(kwrgs)
    File "C:\Program Files\Python310\lib\site-packages\climate_indices\__main__.py", line 809, in _compute_write_index
    _drop_data_into_shared_arrays_grid(dataset,
    File "C:\Program Files\Python310\lib\site-packages\climate_indices\__main__.py", line 637, in _drop_data_into_shared_arrays_grid
    dataset = dataset.drop_sel(var_name)
    File "C:\Program Files\Python310\lib\site-packages\xarray\core\dataset.py", line 4486, in drop_sel
    labels = either_dict_or_kwargs(labels, labels_kwargs, "drop_sel")
    File "C:\Program Files\Python310\lib\site-packages\xarray\core\utils.py", line 286, in either_dict_or_kwargs
    raise ValueError(f"the first argument to .{func_name} must be a dictionary")
    ValueError: the first argument to .drop_sel must be a dictionary
    Traceback (most recent call last):
    File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
    File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
    File "C:\Program Files\Python310\Scripts\process_climate_indices.exe\__main__.py", line 7, in <module>
    File "C:\Program Files\Python310\lib\site-packages\climate_indices\__main__.py", line 1708, in main
    _compute_write_index(kwrgs)
    File "C:\Program Files\Python310\lib\site-packages\climate_indices\__main__.py", line 809, in _compute_write_index
    _drop_data_into_shared_arrays_grid(dataset,
    File "C:\Program Files\Python310\lib\site-packages\climate_indices\__main__.py", line 637, in _drop_data_into_shared_arrays_grid
    dataset = dataset.drop_sel(var_name)
    File "C:\Program Files\Python310\lib\site-packages\xarray\core\dataset.py", line 4486, in drop_sel
    labels = either_dict_or_kwargs(labels, labels_kwargs, "drop_sel")
    File "C:\Program Files\Python310\lib\site-packages\xarray\core\utils.py", line 286, in either_dict_or_kwargs
    raise ValueError(f"the first argument to .{func_name} must be a dictionary")
    ValueError: the first argument to .drop_sel must be a dictionary

monocongo commented 1 year ago

Thanks for the reminder about this @mjh1366

This was fixed in #493 but the version on PyPI is out-of-date, so we will rectify this by resolving #489.

In the meantime, you can instead install this package from the master branch which should be OK again.

monocongo commented 1 year ago

A new version is available now via PyPI, please advise if you still have this or any other issues with the latest version there. Thanks!

mjh1366 commented 1 year ago

Thanks for your consideration. Now, if I use the input datasets shipped with the climate-indices package, there is no problem:

(env3) E:\climate2\climate_indices-master>process_climate_indices --index spi --periodicity monthly --netcdf_precip ./nclimgrid_lowres_prcp1.nc --var_name_precip prcp --output_file_base ./nclimgrid_lowres --scales 6 12 --calibration_start_year 1982 --calibration_end_year 1997 --multiprocessing all
2023-01-13  19:00:45 INFO Start time:    2023-01-13 19:00:45.683248
2023-01-13  19:00:46 INFO Computing 6-month SPI/Pearson
2023-01-13  19:01:56 INFO Computing 6-month SPI/Gamma
2023-01-13  19:03:25 INFO Computing 12-month SPI/Pearson
2023-01-13  19:04:49 INFO Computing 12-month SPI/Gamma
2023-01-13  19:05:17 INFO End time:      2023-01-13 19:05:17.267028
2023-01-13  19:05:17 INFO Elapsed time:  0:04:31.583780

But if I use my own input dataset, the same problem occurs:

(env3) E:\climate2\climate_indices-master>process_climate_indices --index spi --periodicity monthly --netcdf_precip ./nclimgrid_lowres_prcp.nc --var_name_precip precip --output_file_base ./nclimgrid_lowres --scales 6 12 --calibration_start_year 1982 --calibration_end_year 1997 --multiprocessing all
2023-01-13  18:58:22 INFO Start time:    2023-01-13 18:58:22.469086
2023-01-13  18:58:22 INFO Computing 6-month SPI/Pearson
2023-01-13  18:58:22 ERROR Failed to complete
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\site-packages\climate_indices\__main__.py", line 1715, in main
    _compute_write_index(kwrgs)
  File "C:\Program Files\Python310\lib\site-packages\climate_indices\__main__.py", line 752, in _compute_write_index
    dataset = dataset.drop_sel(var)
  File "C:\Program Files\Python310\lib\site-packages\xarray\core\dataset.py", line 4486, in drop_sel
    labels = either_dict_or_kwargs(labels, labels_kwargs, "drop_sel")
  File "C:\Program Files\Python310\lib\site-packages\xarray\core\utils.py", line 286, in either_dict_or_kwargs
    raise ValueError(f"the first argument to .{func_name} must be a dictionary")
ValueError: the first argument to .drop_sel must be a dictionary
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Program Files\Python310\Scripts\process_climate_indices.exe\__main__.py", line 7, in <module>
  File "C:\Program Files\Python310\lib\site-packages\climate_indices\__main__.py", line 1715, in main
    _compute_write_index(kwrgs)
  File "C:\Program Files\Python310\lib\site-packages\climate_indices\__main__.py", line 752, in _compute_write_index
    dataset = dataset.drop_sel(var)
  File "C:\Program Files\Python310\lib\site-packages\xarray\core\dataset.py", line 4486, in drop_sel
    labels = either_dict_or_kwargs(labels, labels_kwargs, "drop_sel")
  File "C:\Program Files\Python310\lib\site-packages\xarray\core\utils.py", line 286, in either_dict_or_kwargs
    raise ValueError(f"the first argument to .{func_name} must be a dictionary")
ValueError: the first argument to .drop_sel must be a dictionary

I've uploaded my input dataset here: New WinRAR ZIP archive.zip

Monthly precipitation data from CORDEX SouthAsia, model remoo2009
File format: NetCDF
Cell size (x, y) = 0.5, 0.5
GCS_WGS_1984
Parameter name = precip
Grid coordinates:
lon: 43.25 to 63.75 by 0.5 degrees_east
lat: 24.25 to 40.75 by 0.5 degrees_north

monocongo commented 1 year ago

This results from the file being written by CDO which defaults to using the time dimension as the outermost for the data variables. This package's processing script expects variables with dims = (lat, lon, time) instead. This used to be performed in the script itself but NCO is buggy on Windows and many users reported issues, so we recently removed that from this script. We need to document this better to make users aware that they need to re-orient the data variables to (lat, lon, time) otherwise this will happen. Also BTW, the file you are using does not have units for the precip variable, which also causes an error.

This removal of NCO support should warrant a new release, since it breaks things for some users. Thanks for bringing this to my attention, @mjh1366
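
For readers who would rather avoid NCO, the re-orientation and the missing units attribute described above can also be handled from Python with xarray. This is a minimal sketch under stated assumptions, not part of the package: the dataset is synthetic (a stand-in for a file written with time as the outermost dimension), the variable name `precip` is taken from this thread, and the units string is only an example value.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a CDO-written file: time is the outermost dimension.
ds = xr.Dataset(
    {"precip": (("time", "lat", "lon"), np.random.rand(24, 3, 4))},
    coords={
        "time": np.arange(24),
        "lat": [24.25, 24.75, 25.25],
        "lon": [43.25, 43.75, 44.25, 44.75],
    },
)

# Reorder the dimensions to (lat, lon, time), the layout the script expects.
ds = ds.transpose("lat", "lon", "time")

# Add the units attribute that was missing (example value).
ds["precip"].attrs["units"] = "millimeter"

print(ds["precip"].dims)
```

With a real file, the dataset would come from `xr.open_dataset(...)` and be written back out with `ds.to_netcdf(...)` after the two changes.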

monocongo commented 1 year ago

BTW I misspoke above, the order is off in that file but that wasn't the bug. xarray.Dataset.drop_sel() takes a list and instead the code drops a single variable at a time. I have a fix ready for PR soon, stay tuned...
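
For reference, in xarray `Dataset.drop_sel` expects a mapping from dimension names to index labels, which is what the error message is pointing at, while `Dataset.drop_vars` removes variables by name. The sketch below reproduces the error on a small synthetic dataset; it is not the package's actual code, and the variable names are made up for illustration.

```python
import numpy as np
import xarray as xr

# Synthetic dataset with two data variables (names are illustrative).
ds = xr.Dataset(
    {
        "prcp": (("lat", "lon", "time"), np.zeros((2, 2, 3))),
        "other": (("lat", "lon", "time"), np.ones((2, 2, 3))),
    },
    coords={"lat": [10.0, 20.0], "lon": [30.0, 40.0], "time": [0, 1, 2]},
)

# Passing a bare variable name to drop_sel raises the ValueError from this thread.
try:
    ds.drop_sel("other")
    raised = False
except ValueError as err:
    raised = True
    print(err)

# drop_sel is meant for index labels along a dimension, given as a mapping.
without_first_step = ds.drop_sel({"time": [0]})

# drop_vars removes a variable by name.
only_prcp = ds.drop_vars("other")
print(list(only_prcp.data_vars))
```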

mjh1366 commented 1 year ago

Thank you so much. I can prepare my data for this package: I use NCO to re-orient the data variable to (lat, lon, time) and to define units for the precip variable.

ncpdq -O -a lat,lon,time nclimgrid_lowres_prcp.nc nclimgrid_lowres_prcp2.nc
ncwa -a bnds nclimgrid_lowres_prcp2.nc nclimgrid_lowres_prcp3.nc
ncks -C -x -v time_bnds nclimgrid_lowres_prcp3.nc nclimgrid_lowres_prcp4.nc
ncatted -O -a name,precip,c,c,precip nclimgrid_lowres_prcp4.nc nclimgrid_lowres_prcp5.nc
ncatted -O -a units,precip,c,c,millimeter nclimgrid_lowres_prcp5.nc nclimgrid_lowres_prcp6.nc

Thanks for your consideration, @monocongo

Mahsabzg commented 4 months ago

I received the same error. I checked the structure of my data and it looks correct. I would appreciate it if you could help me.

ncdump -h data.nc
dimensions:
    lat = 240 ;
    lon = 280 ;
    time = 3652 ;
variables:
    double lat(lat) ;
        lat:standard_name = "grid_latitude" ;
        lat:long_name = "latitude in rotated pole grid" ;
        lat:units = "degrees" ;
        lat:axis = "Y" ;
    double lon(lon) ;
        lon:standard_name = "grid_longitude" ;
        lon:long_name = "longitude in rotated pole grid" ;
        lon:units = "degrees" ;
        lon:axis = "X" ;
    float precipitation(lat, lon, time) ;
        precipitation:long_name = "precipitation amount" ;
        precipitation:units = "mm" ;
        precipitation:table = 1 ;
        precipitation:grid_mapping = "rotated_pole" ;
        precipitation:_FillValue = -9999.f ;
        precipitation:missing_value = -9999.f ;
        precipitation:cell_methods = "height: mean" ;
    char rotated_pole ;
        rotated_pole:grid_mapping_name = "rotated_latitude_longitude" ;
        rotated_pole:grid_north_pole_latitude = 49.5 ;
        rotated_pole:grid_north_pole_longitude = -186. ;
    double time(time) ;
        time:standard_name = "time" ;
        time:long_name = "Time variable" ;
        time:units = "days since 2010-01-01 06:00:00" ;
        time:calendar = "proleptic_gregorian" ;
        time:axis = "T" ;
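
The same structural check can be done from Python. The helper below is a hypothetical sketch (the function name and the list of checks are assumptions, not part of climate-indices); it only tests the two conditions mentioned earlier in this thread, namely the (lat, lon, time) dimension order and the presence of a units attribute. The dataset is a small synthetic one that mirrors the header above with reduced sizes.

```python
import numpy as np
import xarray as xr

# Dimension order described earlier in this thread.
EXPECTED_DIMS = ("lat", "lon", "time")


def layout_problems(ds, var_name):
    """Return a list of human-readable problems; empty if none were found."""
    if var_name not in ds.data_vars:
        return [f"variable {var_name!r} not found"]
    problems = []
    var = ds[var_name]
    if var.dims != EXPECTED_DIMS:
        problems.append(f"dims are {var.dims}, expected {EXPECTED_DIMS}")
    if "units" not in var.attrs:
        problems.append("missing 'units' attribute")
    return problems


# Synthetic dataset mirroring the header above (sizes reduced).
ds = xr.Dataset(
    {
        "precipitation": (
            ("lat", "lon", "time"),
            np.zeros((2, 3, 4), dtype="float32"),
            {"units": "mm"},
        )
    }
)

print(layout_problems(ds, "precipitation"))
```

An empty list only means these two conditions hold; it does not rule out other causes of the error.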

monocongo commented 4 months ago

Please post a link to your data, @Mahsabzg. I'm overwhelmed at my day job right now, so I have no time to work on this in the near term, but if someone else has the time then they'll need a copy of your data, just like I will, to diagnose and fix the issue. Please make sure you're using the latest code installed from PyPI or the master branch. PR #505 may have fixed this error.

Mahsabzg commented 4 months ago

Thank you for your immediate answer. I installed it with:

conda create -n indices_env
activate indices_env
pip install climate-indices
conda install -c conda-forge netcdf4 h5netcdf

I've uploaded my input dataset: first_row.zip

Mahsabzg commented 4 months ago

I've also uploaded my original data and the steps I used to prepare it:
data.zip

cdo mergetime *.nc merge.nc
ncwa -a height
ncks -C -x -v lat
ncks -C -x -v lon
cdo chname,rlat,lat
cdo chname,rlon,lon
ncatted -a units,precipitation,modify,c,'mm'

## bash file 
ncpdq -a lat,lon,time -O 
ncks -A -v lat 
ncks -A -v lon 
ncks -A -v time 
ncks -A -v rotated_pole 
ncks -A -v precipitation 

ncks -O --fix_rec_dmn lat

ncdump -h

keyao22 commented 2 weeks ago

When I applied this to a global scale, the latitude range excluded -90 and 90 degrees.

monocongo commented 2 weeks ago

@keyao22 How the math works for some of the indices precludes the usage of the top/bottom limits, i.e. 90/-90 degrees throws the calculations into infinity etc. Another example is that the Thornthwaite PET can't be computed above or below 60 degrees lat.
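
As an illustration of why the poles are a problem, here is a sketch using the standard sunset hour angle formula, arccos(-tan(latitude) * tan(declination)), which daylight-hours calculations commonly rely on. Whether this exact expression is what the package evaluates is an assumption; the function name and the sample latitudes are made up for illustration. The point is that tan(latitude) grows without bound as latitude approaches 90 degrees.

```python
import math


def sunset_hour_angle(lat_deg, declination_deg):
    """Sunset hour angle in radians: arccos(-tan(lat) * tan(declination))."""
    x = -math.tan(math.radians(lat_deg)) * math.tan(math.radians(declination_deg))
    # Clamp to the arccos domain; outside it the sun never sets or never rises.
    return math.acos(max(-1.0, min(1.0, x)))


# tan(latitude) blows up as the latitude approaches the pole.
for lat in (0.0, 60.0, 89.0, 89.999):
    print(lat, math.tan(math.radians(lat)))

# At the equator the hour angle is pi/2 (a 12-hour day) for any declination.
print(sunset_hour_angle(0.0, 23.45))
```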