Open ddkohler opened 6 years ago
This is an interesting idea. I want to link it to #9 (our oldest extant issue).
I'm going to say that this is a relatively low-priority task for right now, but it's good to have some thoughts floating around about it.
Some crucial functions of the data object focus on selecting/redistributing our data (e.g., `split` and `chop`). These methods have strong parallels with slicing and advanced indexing of the `numpy.ndarray` class. One feature of ndarray slicing/indexing is that you get views/references of the structured data, not new arrays. Can/should we do the analogous thing when using some of these methods? For example, we could introduce a function `view`, or something like it. The point is that the data object knows which data we want to see, but doesn't recopy all that data to a new file/object. If we actually wanted to manipulate these subsets of the data, we would copy the data object first (as we would do with subsets of numpy arrays). This would remove the overhead of passing on duplicate axes, variables, channels, and attributes that are highly redundant.
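To make the proposal concrete, here is a minimal sketch of what a view-like selection could look like. Everything in it is an assumption for illustration: the `Data` and `DataView` classes, the `arrays` attribute, and the `view`/`copy` signatures are hypothetical stand-ins, not the existing data object's API. It only shows the mechanism: the view stores a reference to its parent plus an index, and nothing is duplicated until `copy` is called.

```python
import numpy as np


class Data:
    """Hypothetical stand-in for the data object: named arrays on a shared grid."""

    def __init__(self, **arrays):
        self.arrays = {name: np.asarray(arr) for name, arr in arrays.items()}

    def view(self, index):
        # No array data is copied here; only the parent and the index are stored.
        return DataView(self, index)


class DataView:
    """Hypothetical lightweight selection that refers back to its parent."""

    def __init__(self, parent, index):
        self.parent = parent
        self.index = index

    def __getitem__(self, name):
        # With basic slicing (ints and slices) numpy returns a view, not a copy.
        return self.parent.arrays[name][self.index]

    def copy(self):
        # Materialize an independent Data object before manipulating the subset.
        return Data(
            **{name: arr[self.index].copy() for name, arr in self.parent.arrays.items()}
        )


data = Data(
    signal=np.arange(12.0).reshape(3, 4),
    delay=np.arange(12.0).reshape(3, 4) * 2,
)
sub = data.view(np.s_[1:, :2])

# The selection shares memory with the parent's array.
shares = np.shares_memory(sub["signal"], data.arrays["signal"])

# Copy first, then modify; the parent is left untouched.
owned = sub.copy()
owned.arrays["signal"][...] = 0.0
```

One caveat for the design: numpy only returns views for basic slicing. Advanced indexing (integer or boolean arrays) returns copies, so a `view` built on such an index would need its own bookkeeping to avoid duplicating data.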
If a data object works like a numpy array, I tend to feel like there are usability/simplicity benefits. Please give thoughts.