[Open] ramav87 opened this issue 3 years ago
A method for slicing sidpy datasets that returns sidpy datasets is probably the first order of business. @saimani5
I have added the slicing ability (it's just __getitem__()), but this breaks many tests, presumably because the original assumption was that slicing returns a dask or numpy array, which this does not. A workaround is to define our own index() function to enable indexing. I will explore it.
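As a rough illustration only (not the actual branch code), an index()-style helper could slice with the existing dask/numpy behavior and then re-wrap the result as a sidpy Dataset. The metadata handling and the assumption that from_array accepts the sliced array are mine, not taken from the branch:

import sidpy as sid

def index(dset, item):
    # Hypothetical helper: slice with the current (dask/numpy) behavior,
    # then re-wrap the result so the caller gets a sidpy.Dataset back.
    sliced = dset[item]                        # lazy dask-style slice
    new_dset = sid.Dataset.from_array(sliced)  # wrap as a sidpy Dataset
    new_dset.metadata = dict(dset.metadata)    # carry metadata over (assumed attribute)
    return new_dset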
I have a working branch (rama_dev) that enables indexing of sidpy datasets. For example:
import numpy as np
import sidpy as sid

input_spectrum = np.ones([3, 1, 3])
dataset = sid.Dataset.from_array(input_spectrum)
my_dset = dataset[0, :]           # slicing now returns a sidpy Dataset
isinstance(my_dset, sid.Dataset)  # True
I had to change some of the tests for this to work. Some of our code will need to change if we want this, because previously, indexing with [i, j] returned the value of the array directly, whereas now the method goes through dask, so you must call .compute() to get the values. I think this is worth the tradeoff but am open to suggestions, @gduscher
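Continuing the example above, a minimal sketch of the new behavior (the exact return type is whatever the branch's __getitem__ produces; .compute() is the standard dask call):

value = dataset[0, 0, 0]  # no longer a plain number; a lazy dask-backed object
value.compute()           # materializes the underlying value via dask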
On that branch, all the tests are passing, for what it's worth.
We require a function that expands and collapses sidpy.Dataset objects. For instance, we need to collapse spatial and/or spectral dimensions when undertaking matrix or tensor factorization, deep learning, etc. It should allow the user to specify the dimensions over which to collapse by name, index, or dimension type. The details of this collapse should be stored in the dataset as a double-underscore attribute (so it is hidden from the user).
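A minimal numpy-based sketch of what such a collapse/expand pair could look like. The function names, the _collapse_info attribute, and the assumption that extra attributes can be attached to a Dataset are all hypothetical; for brevity this only supports selection by integer index, not by name or dimension type:

import numpy as np
import sidpy as sid

def collapse_dimensions(dset, dims_to_collapse):
    # Flatten the chosen dimensions into one leading axis and remember how to undo it.
    dims_to_collapse = sorted(dims_to_collapse)
    other_dims = [d for d in range(dset.ndim) if d not in dims_to_collapse]
    order = dims_to_collapse + other_dims
    original_shape = dset.shape
    # Move the collapsed axes to the front, then merge them into a single axis.
    transposed = np.transpose(np.asarray(dset), order)
    collapsed_shape = (-1,) + tuple(original_shape[d] for d in other_dims)
    collapsed = sid.Dataset.from_array(transposed.reshape(collapsed_shape))
    # Store the bookkeeping on the dataset (hidden-ish, underscore-prefixed; assumed attachable).
    collapsed._collapse_info = {'original_shape': original_shape, 'order': order}
    return collapsed

def expand_dimensions(collapsed):
    # Undo collapse_dimensions using the stored bookkeeping.
    info = collapsed._collapse_info
    shape_in_order = tuple(info['original_shape'][d] for d in info['order'])
    expanded = np.asarray(collapsed).reshape(shape_in_order)
    inverse = np.argsort(info['order'])  # invert the transpose
    return sid.Dataset.from_array(np.transpose(expanded, inverse))

For example, collapsing dimensions [0, 2] of a (3, 4, 5) dataset gives a (15, 4) dataset suitable for matrix factorization, and expand_dimensions restores the original (3, 4, 5) layout.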