Open jmdelahanty opened 3 months ago
Thanks a lot for this thorough feature request @jmdelahanty!
Given that movement is primarily motivated by the needs of neuroscientists, I think what you are asking for is definitely within scope. As a first goal, we should be able to do the "aggregation" by trials purely for behaviour (motion). We are also thinking about integrating this kind of information with neural time-series, but we will probably tackle that later, and potentially relying on existing frameworks like pynapple.
Before we can address the actual "averaging by trials", we'd have to put some infrastructure in place. Currently, movement has no information about trials/events, so we'd have to come up with a way to define those within our framework. I'd like to avoid re-inventing the wheel, so I'm thinking of closely following how the aforementioned pynapple defines timestamps and intervals. We should also look into NWB and see how they handle temporal events.
So a rough plan for how to move forward on this would be:
1. Research how pynapple, NWB and other relevant projects handle events (timestamps and intervals) and figure out how much of that we can adopt in movement.
2. [...] movement. For example, they could be stored in the attrs dictionary of movement datasets as metadata, or as a completely separate object.
3. [...] napari and/or fastplotlib frontends).
4. [...] movement dataset (i.e. tracking data from a single video) and a series of trials defined in relation to that dataset's time axis, aggregates the data per trial type. Here, xarray's groupby and aggregation functionalities should come in handy.
This whole plan is likely to take time to do properly (on the scale of many months if I'm being realistic), but the first two points will be key for enabling the whole thing (and a bunch of other related functionalities that involve temporal events). So the first task is the "boring" one: research how this is handled elsewhere and produce a write-up to inform our implementation. I wouldn't want to force that on you, but if you have existing knowledge or opinions on this, feel free to share them.
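To make the attrs idea and the xarray groupby step a bit more concrete, here is a minimal sketch. It is not a proposal for the actual movement API: the toy dataset, the "trials" key in attrs, the (start, end, trial_type) tuple layout and the label_trials helper are all hypothetical names made up for illustration.

```python
import numpy as np
import xarray as xr

# Toy stand-in for a tracking dataset: one speed trace sampled at 10 Hz for 60 s.
time = np.arange(0, 60, 0.1)
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"speed": ("time", rng.random(time.size))},
    coords={"time": time},
)

# Hypothetical trial table kept as metadata: (start_s, end_s, trial_type).
ds.attrs["trials"] = [
    (5.0, 10.0, "go"),
    (15.0, 20.0, "nogo"),
    (25.0, 30.0, "go"),
    (35.0, 40.0, "nogo"),
]


def label_trials(ds, trials):
    """Attach a per-sample trial-type label along time ('none' outside trials)."""
    t = ds["time"].values
    labels = np.full(ds.sizes["time"], "none", dtype="U16")
    for start, end, trial_type in trials:
        labels[(t >= start) & (t < end)] = trial_type
    return ds.assign_coords(trial_type=("time", labels))


labelled = label_trials(ds, ds.attrs["trials"])

# Keep only samples that fall inside a trial, then aggregate per trial type.
mask = labelled["trial_type"].values != "none"
in_trial = labelled.isel(time=mask)
mean_speed = in_trial["speed"].groupby("trial_type").mean()
print(mean_speed)
```

Labelling samples with a non-index coordinate keeps the time axis untouched, so the same dataset can still be sliced by time as usual. A nested list in attrs is fine in memory but may not serialise cleanly to formats like netCDF, which is one argument for the "separate object" option.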
PS: I'm quite happy you find NeuroBlueprint useful, tagging @JoeZiminski, who co-authored that, so he can be happy as well 😀
Thanks Jeremy, I am happy to hear that!
Is your feature request related to a problem? Please describe.
Nope! This is just a kind of metric that could be nice to have for experiments where animals are responding to a stereotypical "trial". It would be helpful to have this information stored in a manner that is accessible for viewing on a subject-by-subject basis and then for grouping these trials together for looking at averaged statistics across subjects or within subjects across trials.
This kind of thing is pretty common in a lot of neuroscience (aligning metrics across animals from a given condition at specific time points). It'd be even cooler if it could be aligned with some other kind of time series information like neural recordings.
Describe the solution you'd like
In the short term, having the software align each metric to a trial in an interactive way would be awesome. Once timestamps/trial segmentations are in place, putting them all into a big dataset for performing statistics would hopefully be straightforward. In the long term it would be great for the software to be able to generate averaged plots over whichever subjects were used. Metadata about which subjects were included for each plot would also be cool to have. Following the NeuroBlueprint structure would make things like this easier for people if they wanted to use it across subjects. An even longer term goal would be to have interoperability with other methods of storing structured datasets, like those collected via tools like autopilot or data within the NWB Standard.
Describe alternatives you've considered
Although pretty messy, here's a notebook I've put together for trying to do this type of thing. My data is (mostly) structured to follow the NeuroBlueprint structure at the moment. As mentioned here, movement already supports many of these things. I have a little video attached that shows the kind of thing I think would be nice to have developed by people who actually know things. So in addition to visualizing each trial for a single animal like this, it would be cool to see the averaged values over time for a subject and then across animals for the study.
https://github.com/user-attachments/assets/d86a679f-917a-4b44-9884-74529a164081
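As a rough illustration of the "average over trials within a subject, then across animals" idea described above, here is a minimal NumPy sketch. Everything in it is an assumption made for the example: the sampling rate, the random toy traces, the subject names, the onset times and the align_to_onsets helper are not taken from the notebook or from movement.

```python
import numpy as np


def align_to_onsets(signal, fs, onsets_s, window_s):
    """Cut a fixed-length window after each trial onset.

    Returns an array of shape (n_trials, n_samples_per_window).
    """
    n = int(round(window_s * fs))
    starts = np.round(np.asarray(onsets_s) * fs).astype(int)
    return np.stack([signal[s:s + n] for s in starts])


fs = 10.0  # samples per second (assumed)
rng = np.random.default_rng(1)

# Two toy subjects, each with a 60 s metric trace and its own trial onsets (s).
subjects = {
    "sub-001": (rng.random(600), [5.0, 20.0, 35.0]),
    "sub-002": (rng.random(600), [8.0, 24.0, 40.0, 50.0]),
}

# Within-subject average: mean across trials at each time point after onset.
per_subject = {
    name: align_to_onsets(trace, fs, onsets, window_s=5.0).mean(axis=0)
    for name, (trace, onsets) in subjects.items()
}

# Across-subject average: mean of the per-subject averages, so each animal
# counts equally regardless of how many trials it contributed.
grand_mean = np.mean(list(per_subject.values()), axis=0)
print(grand_mean.shape)
```

Averaging the per-subject means, rather than pooling all trials, is one design choice among several. Pooling would weight animals by their trial counts instead.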
Additional context
Standardizing the input formats is going to be insanely hard, I think, unless things like Autopilot/other well-documented factors are in place, because who knows how people are storing/accessing timestamp information. Not sure how you would handle that.