ezmsg-org / ezmsg

Pure-Python DAG-based high-performance SHM-backed pub-sub and multi-processing pattern
https://ezmsg.readthedocs.io/en/latest/
MIT License

Support a "labels" field in AxisInfo if present. #123

Closed cboulay closed 3 weeks ago

cboulay commented 1 month ago

ezmsg-lsl has a SpaceAxis that I patch into AxisArray, and SpaceAxis has a labels field. This solves an issue that we discussed in #43.

Units that transform the AxisArray in a way that changes the length of the dimension carrying the "labels" field should probably also update the labels, if present! This PR does that for the Units I'm aware of that modify the shape of the array along axes other than "time".
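A minimal sketch of the "if present" part, not the code in this PR. `PlainAxis`, `LabeledAxis`, and `select_on_axis` are hypothetical stand-ins invented for illustration; they are not the actual ezmsg or ezmsg-lsl classes. The idea is to duck-type on a `labels` attribute so axes without labels pass through untouched:

```python
from dataclasses import dataclass, field, replace
from typing import List, Sequence


@dataclass
class PlainAxis:
    """Hypothetical axis-info object with no labels."""
    unit: str = ""


@dataclass
class LabeledAxis(PlainAxis):
    """Hypothetical axis-info object carrying one label per element."""
    labels: List[str] = field(default_factory=list)


def select_on_axis(axis, keep: Sequence[int]):
    """Return axis info matching data that kept only the indices in `keep`.

    If the axis has no `labels` attribute it is returned unchanged.
    """
    labels = getattr(axis, "labels", None)
    if labels is None:
        return axis
    # replace() copies the other fields and preserves the axis subclass.
    return replace(axis, labels=[labels[i] for i in keep])


chans = LabeledAxis(unit="uV", labels=["Fz", "Cz", "Pz"])
picked = select_on_axis(chans, [0, 2])
```

Here `picked.labels` holds only the labels of the channels that survived, while a `PlainAxis` would come back as the same object.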

For affine_transform, it's difficult to know what the new labels should be, so in most cases they are just dropped. However, if the affine_transform has rows or columns of zeros, then we can make a good guess at the new labels.
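One possible heuristic along these lines, sketched with plain numpy. This is an assumption about how such a guess could work, not necessarily what the PR implements: `guess_labels` is a hypothetical helper, and the weights are assumed to have shape (n_in, n_out) and to be applied as `data @ weights`. If every output is fed by exactly one input, the input's label carries over and all-zero rows simply mark dropped inputs. Anything that mixes channels yields `None`, meaning "drop the labels".

```python
from typing import List, Optional

import numpy as np


def guess_labels(weights: np.ndarray, in_labels: List[str]) -> Optional[List[str]]:
    """Guess output labels for out = data @ weights, or None if ambiguous."""
    if weights.ndim != 2 or weights.shape[0] != len(in_labels):
        return None
    nonzero = weights != 0
    # Each output column must draw from exactly one input row.
    if not np.all(nonzero.sum(axis=0) == 1):
        return None
    source_rows = nonzero.argmax(axis=0)
    return [in_labels[i] for i in source_rows]


# Drops the middle channel and rescales the last one: labels are recoverable.
selector = np.array([[1.0, 0.0],
                     [0.0, 0.0],
                     [0.0, 2.0]])
kept = guess_labels(selector, ["C3", "Cz", "C4"])

# A true mixing matrix: no sensible per-channel label exists.
mixer = np.array([[1.0, 1.0],
                  [1.0, -1.0]])
mixed = guess_labels(mixer, ["C3", "C4"])
```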

griffinmilsap commented 1 month ago

I didn't know about an lsl connector; this is really cool! When you're ready, let's add it into the README as another extension :D

AxisArray is VERY inspired by XArray (my absolute favorite data structure for offline analyses) that unfortunately has a little too much going on under the hood to be performant enough for low latency, high-message-rate stream processing. An earlier version of ezmsg-sigproc actually used DataArrays for the messages and that was a lovely developer experience even if it pegged my CPUs processing 8 channels of 250 Hz EEG data -- but I digress.

I feel like the general solution to this particular problem is adding a coords implementation into AxisArray that captures some core set of functionality that we find in XArray's DataArray.coords. I wrote the majority of the AxisArray implementation in an afternoon when I was pressured by an incoming deadline. I spent about an hour thinking about how to get coords in without destroying performance or my guiding mantra that "AxisArray is simply a metadata wrapper around a numpy array". I came to the conclusion at that time that a limited coords functionality in AxisArray would be 1. incredibly useful, and 2. performant if we removed all the integrity checks/safety rails that XArray's implementation has. I simply didn't have the bandwidth (at that time) to even do a first-pass implementation.

In summary, AxisArray needs some TLC for

  1. Better Typehinting (even simple fixes like sel/isel returning subclasses of AxisArray, not just AxisArrays)
  2. Better dev experience/consistency (why is concatenate a staticmethod?)
  3. A lightweight coords implementation
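To make items 1 and 3 concrete, here is a rough sketch under stated assumptions. `LiteArray` and `EEGArray` are hypothetical names, not the real AxisArray API, and `isel` here takes only a list of integer indices. Coords are a plain dict from dim name to a numpy array, sliced alongside the data with no integrity checks, and the `TypeVar` plus `dataclasses.replace` let `isel` return the caller's own subclass:

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Sequence, TypeVar

import numpy as np

T = TypeVar("T", bound="LiteArray")


@dataclass
class LiteArray:
    """Hypothetical metadata wrapper around a numpy array."""
    data: np.ndarray
    dims: List[str]
    coords: Dict[str, np.ndarray] = field(default_factory=dict)

    def isel(self: T, dim: str, indices: Sequence[int]) -> T:
        """Take integer `indices` along `dim`; no validation is performed."""
        ax = self.dims.index(dim)
        idx = list(indices)
        new_coords = dict(self.coords)
        if dim in new_coords:
            new_coords[dim] = new_coords[dim][idx]
        # replace() constructs type(self), so subclasses survive the call.
        return replace(self, data=np.take(self.data, idx, axis=ax), coords=new_coords)


@dataclass
class EEGArray(LiteArray):
    """Hypothetical subclass used to show that isel keeps the subclass."""


msg = EEGArray(
    data=np.arange(12).reshape(4, 3),
    dims=["time", "ch"],
    coords={"ch": np.array(["Fz", "Cz", "Pz"])},
)
sub = msg.isel("ch", [0, 2])
```

Skipping validation keeps the per-message cost close to a single `np.take`, at the price of trusting callers to keep coords and data lengths consistent.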

Coming back to today and this PR: These operations on coords/labels probably eventually belong in AxisArray helper functions/implementation rather than spread throughout the signal processing modules. Collecting common operations on AxisArrays together in one place would make this whole thing a lot easier to maintain, but if this gets you where you're going today, I support a merge after review (feel free to merge it in yourself once my review posts).

cboulay commented 3 weeks ago

Moved to https://github.com/ezmsg-org/ezmsg-sigproc/pull/1