pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

Unable to use Xarray to work on RCM Dataset with xsar and safe_rcm by umr-lops #8771

Closed sparshgarg23 closed 8 months ago

sparshgarg23 commented 8 months ago

What happened?

UMR-LOPS has introduced xsar, a library for working with RCM datasets. When I run the following code:

import xsar
import geoviews as gv
import holoviews as hv
import geoviews.feature as gf
hv.extension('bokeh')
path = xsar.get_test_file('RCM1_OK1050603_PK1050605_1_SC50MB_20200214_115905_HH_HV_Z010')
meta = xsar.RcmMeta(name=path)
meta.dt

I get the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-5-3d49b63ff406> in <cell line: 2>()
      1 #rs2meta = xsar.RadarSat2Meta(name=path)
----> 2 meta = xsar.RcmMeta(name=path)

14 frames
/usr/local/lib/python3.10/dist-packages/xsar/utils.py in wrapper(*args, **kwargs)
     93             startrss = process.memory_info().rss
     94         starttime = time.time()
---> 95         result = f(*args, **kwargs)
     96         endtime = time.time()
     97         if mem_monitor:

/usr/local/lib/python3.10/dist-packages/xsar/rcm_meta.py in __init__(self, name)
     32             self.dt = api.open_rcm(name.split(':')[1])
     33         else:
---> 34             self.dt = api.open_rcm(name)
     35         if not name.startswith('RCM_DS:'):
     36             name = 'RCM_DS:%s:' % name

/usr/local/lib/python3.10/dist-packages/safe_rcm/api.py in open_rcm(url, backend_kwargs, manifest_ignores, **dataset_kwargs)
     95         )
     96 
---> 97     tree = read_product(mapper, "metadata/product.xml")
     98 
     99     calibration_root = "metadata/calibration"

/usr/local/lib/python3.10/dist-packages/safe_rcm/product/reader.py in read_product(mapper, product_path)
    272     }
    273 
--> 274     converted = valmap(
    275         lambda x: execute(**x)(decoded),
    276         layout,

/usr/local/lib/python3.10/dist-packages/toolz/dicttoolz.py in valmap(func, d, factory)
     83     """
     84     rv = factory()
---> 85     rv.update(zip(d.keys(), map(func, d.values())))
     86     return rv
     87 

/usr/local/lib/python3.10/dist-packages/safe_rcm/product/reader.py in <lambda>(x)
    273 
    274     converted = valmap(
--> 275         lambda x: execute(**x)(decoded),
    276         layout,
    277     )

/usr/local/lib/python3.10/dist-packages/toolz/functoolz.py in __call__(self, *args, **kwargs)
    302     def __call__(self, *args, **kwargs):
    303         try:
--> 304             return self._partial(*args, **kwargs)
    305         except TypeError as exc:
    306             if self._should_curry(args, kwargs, exc):

/usr/local/lib/python3.10/dist-packages/safe_rcm/product/reader.py in execute(mapping, f, path)
     29     subset = query(path, mapping)
     30 
---> 31     return compose_left(f, attach_path(path=path))(subset)
     32 
     33 

/usr/local/lib/python3.10/dist-packages/toolz/functoolz.py in __call__(self, *args, **kwargs)
    485 
    486     def __call__(self, *args, **kwargs):
--> 487         ret = self.first(*args, **kwargs)
    488         for f in self.funcs:
    489             ret = f(ret)

/usr/local/lib/python3.10/dist-packages/toolz/functoolz.py in __call__(self, *args, **kwargs)
    487         ret = self.first(*args, **kwargs)
    488         for f in self.funcs:
--> 489             ret = f(ret)
    490         return ret
    491 

/usr/local/lib/python3.10/dist-packages/safe_rcm/product/reader.py in <lambda>(obj)
    126                 ),
    127                 lambda obj: obj.set_index({"stacked": ["pole", "pulse"]}),
--> 128                 lambda obj: obj.unstack("stacked"),
    129             ),
    130         },

/usr/local/lib/python3.10/dist-packages/xarray/util/deprecation_helpers.py in inner(*args, **kwargs)
    113                 return func(*args[:-n_extra_args], **kwargs)
    114 
--> 115             return func(*args, **kwargs)
    116 
    117         return inner

/usr/local/lib/python3.10/dist-packages/xarray/core/dataset.py in unstack(self, dim, fill_value, sparse)
   5576                 )
   5577             else:
-> 5578                 result = result._unstack_once(d, stacked_indexes[d], fill_value, sparse)
   5579         return result
   5580 

/usr/local/lib/python3.10/dist-packages/xarray/core/dataset.py in _unstack_once(self, dim, index_and_vars, fill_value, sparse)
   5395         indexes = {k: v for k, v in self._indexes.items() if k != dim}
   5396 
-> 5397         new_indexes, clean_index = index.unstack()
   5398         indexes.update(new_indexes)
   5399 

/usr/local/lib/python3.10/dist-packages/xarray/core/indexes.py in unstack(self)
   1019 
   1020         if not clean_index.is_unique:
-> 1021             raise ValueError(
   1022                 "Cannot unstack MultiIndex containing duplicates. Make sure entries "
   1023                 f"are unique, e.g., by  calling ``.drop_duplicates('{self.dim}')``, "

ValueError: Cannot unstack MultiIndex containing duplicates. Make sure entries are unique, e.g., by  calling ``.drop_duplicates('stacked')``, before unstacking.

As the last frames of the traceback show, the error is raised inside xarray (xarray/core/dataset.py and xarray/core/indexes.py) when the dataset is unstacked. Any idea why this is happening? The issue doesn't occur with RADARSAT-2 or any other dataset. Is this an xarray problem, or should I raise the issue with umr-lops?
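For reference, the final error can be reproduced without xsar or an RCM product. The following is a minimal sketch, assuming toy `pole` and `pulse` values (hypothetical, not taken from the product file) in which one (pole, pulse) pair occurs twice. It repeats the `set_index` and `unstack` steps visible in the `safe_rcm/product/reader.py` frames of the traceback.

```python
import xarray as xr

# Toy data (hypothetical values, not from the RCM product): the pair
# ("HH", "A") appears twice along the "stacked" dimension.
ds = xr.Dataset(
    {"value": ("stacked", [1.0, 2.0, 3.0])},
    coords={
        "pole": ("stacked", ["HH", "HH", "HV"]),
        "pulse": ("stacked", ["A", "A", "B"]),
    },
)

# Build a MultiIndex from the two coordinates, as the reader does.
stacked = ds.set_index({"stacked": ["pole", "pulse"]})

# Check whether any (pole, pulse) combination is repeated.
has_duplicates = bool(stacked.indexes["stacked"].duplicated().any())

# Unstacking needs exactly one value per (pole, pulse) cell, so it fails here.
try:
    stacked.unstack("stacked")
    error_message = None
except ValueError as err:
    error_message = str(err)

print("duplicates present:", has_duplicates)
print("error:", error_message)
```

If the same check on the real product reports duplicates, the input to `unstack` is what needs attention, not the `unstack` call itself.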

What did you expect to happen?

There should be no error, and I should be able to view the dataset, as shown in the example at https://cyclobs.ifremer.fr/static/sarwing_datarmor/xsar/examples/rcm.html

Minimal Complete Verifiable Example

import xsar
import geoviews as gv
import holoviews as hv
import geoviews.feature as gf
hv.extension('bokeh')
path = xsar.get_test_file('RCM1_OK1050603_PK1050605_1_SC50MB_20200214_115905_HH_HV_Z010')
meta = xsar.RcmMeta(name=path)
meta.dt

Environment

commit: None
python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
python-bits: 64
OS: Linux
OS-release: 6.1.58+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None

xarray: 2023.7.0
pandas: 1.5.3
numpy: 1.25.2
scipy: 1.11.4
netCDF4: None
pydap: None
h5netcdf: 1.3.0
h5py: 3.9.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: 2023.8.1
distributed: 2023.8.1
matplotlib: 3.7.1
cartopy: None
seaborn: 0.13.1
numbagg: None
fsspec: 2023.6.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.7.2
pip: 23.1.2
conda: None
pytest: 7.4.4
mypy: None
IPython: 7.34.0
sphinx: 5.0.2

/usr/local/lib/python3.10/dist-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

welcome[bot] commented 8 months ago

Thanks for opening your first issue here at xarray! Be sure to follow the issue template! If you have an idea for a solution, we would really welcome a Pull Request with proposed changes. See the Contributing Guide for more. It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better. Thank you!

keewis commented 8 months ago

Is this the same error as umr-lops/xarray-safe-rcm#75? If so, let's discuss it there.

sparshgarg23 commented 8 months ago

@keewis Yes, it's the same issue. I was able to trace the error, but I am not sure whether it is caused by xarray or by some other library that UMR-LOPS is using.

mathause commented 8 months ago

Thanks for your report. It's due to a change in xarray but has to be fixed downstream - see https://github.com/umr-lops/xarray-safe-rcm/issues/75#issuecomment-1953510785. I'll close here for now.
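
The linked comment is not reproduced here, so the actual downstream fix may differ. As a hedged sketch of the workaround that the error message itself suggests, the duplicated index entries can be dropped with `drop_duplicates` before unstacking. The `pole` and `pulse` values below are toy data (hypothetical, not from the RCM product).

```python
import xarray as xr

# Toy data (hypothetical values): ("HH", "A") occurs twice.
ds = xr.Dataset(
    {"value": ("stacked", [1.0, 2.0, 3.0])},
    coords={
        "pole": ("stacked", ["HH", "HH", "HV"]),
        "pulse": ("stacked", ["A", "A", "B"]),
    },
)
stacked = ds.set_index({"stacked": ["pole", "pulse"]})

# Keep only the first occurrence of each (pole, pulse) pair
# (keep="first" is the default), then unstack.
deduplicated = stacked.drop_duplicates("stacked")
result = deduplicated.unstack("stacked")

# Combinations that never occurred, such as ("HH", "B"), are filled with NaN.
print(dict(result.sizes))
print(result["value"].values)
```

Note that `drop_duplicates` silently discards the later entries, so this is only appropriate when the repeated rows are known to be redundant. If they carry different values, the reader needs another dimension or key to tell them apart.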