pycroscopy / sidpy

Python utilities for storing, processing, and visualizing spectroscopic and imaging data
https://pycroscopy.github.io/sidpy/
MIT License

Handling dask array with unknown dimensions #174

Open saimani5 opened 1 year ago

saimani5 commented 1 year ago

When the output shape of an operation cannot be determined ahead of time, the result is still a dask array, but its unknown dimensions are reported as nan (not a number). Trying to convert such an array into a sidpy dataset raises an error.

For example:

    import numpy as np
    import sidpy as sid

    dset = sid.Dataset.from_array(np.random.rand(4, 5))
    new_dset = dset[dset < 0.5]  # The shape of new_dset is unknown until we call .compute() on it.

The shape of new_dset is (nan,), so dset.like_data(new_dset) does not work. This matters when modifying __getitem__() to always return a sidpy dataset instead of a dask array.
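
One possible guard, sketched here only as an illustration and not as sidpy's actual fix: check for nan entries in the shape before converting, and resolve them first. The helper names has_unknown_shape and resolve_unknown_shape are hypothetical. The sketch assumes the array exposes dask's compute_chunk_sizes() method, which evaluates the graph to learn the real chunk sizes; whether a sidpy dataset handles that call cleanly is not something this issue confirms.

```python
import math


def has_unknown_shape(shape):
    """Return True if any dimension in `shape` is NaN (dask's marker for 'unknown')."""
    return any(isinstance(n, float) and math.isnan(n) for n in shape)


def resolve_unknown_shape(arr):
    """Return `arr` with a concrete shape, computing chunk sizes only if needed.

    Hypothetical helper: assumes `arr` is a dask array, or at least exposes
    the same `shape` attribute and `compute_chunk_sizes()` method.
    """
    if has_unknown_shape(arr.shape):
        # dask.array.Array.compute_chunk_sizes() triggers a computation,
        # updates the array's chunks in place, and returns the array.
        arr = arr.compute_chunk_sizes()
    return arr
```

With something like this, the conversion step could call resolve_unknown_shape(new_dset) before dset.like_data(...). The cost is that resolving the shape forces a computation, so it gives up some of dask's laziness for boolean-mask indexing.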