wright-group / WrightTools

Tools for loading, processing, and plotting multidimensional spectroscopy data.
http://wright.tools
MIT License

advanced data views? #697

Open ddkohler opened 6 years ago

ddkohler commented 6 years ago

Some crucial functions of the data object focus on selecting/redistributing our data (e.g., split and chop).

These methods have strong parallels with slicing and advanced indexing of the numpy.ndarray class. One feature of basic ndarray slicing is that you get views/references of the underlying data, not new arrays (advanced indexing, by contrast, returns copies). Can/should we do the analogous thing when using some of these methods? For example, we could introduce the function view:

>>> data.axes, data.shape
(<..."d1">, <..."d2"...>, <..."w1"...>, <..."w2"...>), (51, 51, 21, 21)
>>> data.view('w1', 'w2', 'd2', at={"d1": [500, 'fs']})
>>> data.axes, data.shape
(<..."d2"...>, <..."w1"...>, <..."w2"...>), (51, 21, 21)
>>> data.view('all')
>>> data.axes, data.shape
(<..."d1">, <..."d2"...>, <..."w1"...>, <..."w2"...>), (51, 51, 21, 21)

Or something like this. The point is that the data object knows which data we want to see, but doesn't recopy all that data to a new file/object. If we actually wanted to manipulate these subsets of the data, we would copy the data object first (as we would do with subsets of numpy arrays). This would remove the overhead of passing on duplicate axes, variables, channels, and attributes that are highly redundant.
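To make the idea concrete, here is a minimal sketch of how such a view could avoid copying. Everything in it is an assumption for illustration: `ToyData` is a hypothetical stand-in, not the WrightTools `Data` class, its `view` method returns a new lightweight object instead of changing the object in place as the example above does, and unit handling (the `'fs'` in `at`) is left out entirely.

```python
import numpy as np


class ToyData:
    """Hypothetical stand-in for a data object; not the WrightTools Data class."""

    def __init__(self, axes, channel):
        self.axes = dict(axes)  # axis name -> 1D coordinate array, in array order
        self.channel = channel  # N-dimensional signal array

    @property
    def shape(self):
        return self.channel.shape

    def view(self, at):
        """Return a ToyData whose channel is a NumPy view fixed at the given coordinates."""
        index = []
        kept = {}
        for name, points in self.axes.items():
            if name in at:
                # Pick the grid point nearest to the requested coordinate.
                index.append(int(np.argmin(np.abs(points - at[name]))))
            else:
                index.append(slice(None))
                kept[name] = points
        # Only integers and slices are used (basic indexing), so no data is copied.
        return ToyData(kept, self.channel[tuple(index)])


data = ToyData(
    {
        "d1": np.linspace(0, 1000, 51),
        "d2": np.linspace(0, 1000, 51),
        "w1": np.linspace(1200, 1400, 21),
        "w2": np.linspace(1200, 1400, 21),
    },
    np.zeros((51, 51, 21, 21)),
)

sub = data.view(at={"d1": 500})
print(list(sub.axes), sub.shape)
print(np.shares_memory(sub.channel, data.channel))
```

Because `sub.channel` shares memory with `data.channel`, writing to the subset would also modify the parent, which is why manipulating a subset would call for an explicit copy first.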

If a data object works like a numpy array, I tend to feel like there are usability/simplicity benefits. Please give thoughts.
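For reference, the NumPy behavior used as the model here can be checked directly. The snippet below is only an illustration with a zero-filled array of the same shape as the example; it shows that basic slicing yields a view, integer-array (advanced) indexing yields a copy, and an explicit `.copy()` decouples a subset from its parent.

```python
import numpy as np

arr = np.zeros((51, 51, 21, 21))

# Basic indexing: fixing the first axis at one index gives a view.
sliced = arr[10]
print(sliced.shape, np.shares_memory(arr, sliced))

# Writing through the view changes the parent array.
sliced[0, 0, 0] = 1.0
print(arr[10, 0, 0, 0])

# Advanced (integer-array) indexing returns a copy instead.
picked = arr[[10, 20]]
print(picked.shape, np.shares_memory(arr, picked))

# An explicit copy decouples a subset from its parent before editing.
independent = arr[10].copy()
independent[0, 0, 0] = 5.0
print(arr[10, 0, 0, 0])
```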

ksunden commented 6 years ago

This is an interesting idea. I want to link it to #9 (our oldest extant issue).

I'm going to say that this is a relatively low-priority task for right now, but it's good to have some thoughts about it floating around.