Closed niksirbi closed 1 year ago
In a discussion with @lochhh, we agreed to first try using the SLEAP data model. The main roadblock for this is SLEAP not supporting recent Python versions. The SLEAP developers suggested using sleap-io, a separate Python package which reimplements their data model and deserialization routines - see this thread.
Just adding some more thoughts on this -- we've gone back and forth a lot on the appropriate data structure for pose data.
SLEAP's object-oriented model is clean and Pythonic (it's basically a bunch of `dataclass`es), and maps well onto common serialization formats like JSON/YAML/HDF5. It also makes it easy to translate to standardized formats like NWB's ndx-pose. It's also flexible in that you can have variable numbers of instances per frame, and have the ability to link together attributes like tracks/identities or skeletons with individual animal instances.
The downside is that it's not always the most efficient depending on the access pattern. When you're doing labeling, random access creation of a single point or instance is necessary since users label one animal at a time. But imagine repeated serialization/deserialization -- if you have a Python object for every point, you're going to be instantiating hundreds of thousands to millions of little objects!
When you're doing complex queries, it's super inefficient. Consider the use case where you want to ask for all the frames in which there are N animals with body part pairs A and B within distance K of each other. This now requires a full iteration over all T frames (where T >> 1e6 oftentimes), and every instance within the frame, resulting in a O(T * N) operation -- assuming the labels are stored sequentially and not hashed by something else (like in multi-video projects).
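For contrast, a query like this collapses to a few vectorised operations when poses live in a dense array rather than per-point objects. The sketch below assumes a hypothetical `(frames, individuals, bodyparts, xy)` array layout; the body-part indices and threshold are made up for illustration:

```python
import numpy as np

# Hypothetical dense pose array: (frames, individuals, bodyparts, xy).
rng = np.random.default_rng(0)
poses = rng.uniform(0, 100, size=(1000, 3, 5, 2))

A, B = 0, 1   # indices of body parts A and B (assumed layout)
K = 10.0      # distance threshold

# Pairwise A-B distance for every animal in every frame, no Python-level loop:
dists = np.linalg.norm(poses[:, :, A, :] - poses[:, :, B, :], axis=-1)

# Frames in which at least one animal has parts A and B within K of each other:
frames = np.flatnonzero((dists < K).any(axis=1))
```

The per-frame iteration over instance objects is replaced by two array operations over the whole recording at once.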
I think the best of both worlds -- and what we'd eventually like to have in `sleap-io` -- would be to have a thin object-oriented access layer backed by a pandas DataFrame that has good support for cythonized or otherwise vectorized operations on the backend. Libraries like sqlalchemy achieve this to some extent, allowing for different access patterns via DAO/ORM/CRUD type patterns. Alternatively, just having different backends optimized for different use cases might be cleaner and reduce the abstraction overhead.
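A minimal sketch of that "thin object layer over a DataFrame" idea, with all class and column names invented for illustration (this is not sleap-io's actual API):

```python
import pandas as pd

class PoseFrame:
    """Object-style view onto one frame of a backing DataFrame (hypothetical)."""

    def __init__(self, df: pd.DataFrame, frame_idx: int):
        self._df = df
        self.frame_idx = frame_idx

    @property
    def points(self) -> pd.DataFrame:
        # No per-point Python objects: just a filtered view into the table,
        # so vectorised operations stay available on the backend.
        return self._df[self._df["frame"] == self.frame_idx]

# Long-format backing store: one row per (frame, individual, bodypart) point.
df = pd.DataFrame({
    "frame":      [0, 0, 1, 1],
    "individual": ["a", "b", "a", "b"],
    "bodypart":   ["nose", "nose", "nose", "nose"],
    "x":          [1.0, 2.0, 1.5, 2.5],
    "y":          [3.0, 4.0, 3.5, 4.5],
})

frame0 = PoseFrame(df, 0)
```

The access objects stay cheap because they hold only a reference and an index, while bulk queries bypass them and hit the DataFrame directly.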
If you're going down the object-oriented model route, consider using a framework like attrs or plain dataclasses for readability and reducing boilerplate. See also these considerations with regards to performance and usability: [1] [2] [3]
In any case, give it a go for your test cases, benchmark it, and feel free to reach out if you need any feedback or have any for us!
Thank you for chiming in on this @talmo. Since this project is still in early development, we are fully open to discussing basic design considerations. We want to choose data structures that will not make our lives difficult down the line.
The SLEAP data model appealed to us precisely because of the flexibility you mentioned (and a desire to not reinvent the wheel), but the performance considerations may indeed become a bottleneck. Not so much for our envisioned alpha product (import, smooth and plot tracks) but definitely for more complex kinematic analyses like the example you mentioned.
I am keen to stay in touch and follow the developments over at `sleap-io`, given that your team has thought about these issues for much longer than we have.
For now, we will likely try adopting the `sleap-io` model as is, and implement changes on the backend as things evolve. If the backend approach you end up with is good enough for our needs, we are happy to adopt it. Otherwise, we'll have to design backends tailored to our needs.
Just out of curiosity, have you given Dask much thought? We have benefited from Dask in other unrelated projects, but haven't yet thought through if/how to apply it to pose data. In case you have considered it and think it's a dead end, let us know.
Also thanks for the `attrs` references, I will read through and reconsider my use of Pydantic.
After some research and internal discussions, we decided to try using `xarray.DataArray` as a backend for pose tracking data. `DataArray` is an N-dimensional generalisation of pandas `Series`. Each `DataArray` holds:

- `values`: a `numpy.ndarray` holding the array's values
- `dims`: names for each axis (e.g., `['frames', 'individuals', 'bodyparts']`)
- `coords`: the levels of each dim (e.g., lists of animal names, bodypart names)
- `attrs`: an `OrderedDict` to hold arbitrary metadata (attributes)

Multiple `DataArray` objects can also be put into an `xarray.Dataset`, aligned along shared dimensions. For example, we could create a `Dataset` corresponding to a collection of videos, with the pose tracks of each video stored in a separate `DataArray` object.
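To make this concrete, here is a small sketch of what pose tracks could look like as a `DataArray`. The dimension names, coordinate labels, and attributes are all hypothetical choices for illustration, not a settled schema:

```python
import numpy as np
import xarray as xr

# Hypothetical pose tracks: 100 frames, 2 individuals, 3 body parts, (x, y).
tracks = xr.DataArray(
    np.random.rand(100, 2, 3, 2),
    dims=["frames", "individuals", "bodyparts", "space"],
    coords={
        "individuals": ["mouse_0", "mouse_1"],
        "bodyparts": ["snout", "centre", "tail_base"],
        "space": ["x", "y"],
    },
    attrs={"fps": 30, "source_file": "example.h5"},
)

# Label-based indexing: x-coordinate of mouse_0's snout across all frames.
snout_x = tracks.sel(individuals="mouse_0", bodyparts="snout", space="x")
```

Selecting by label rather than positional index is exactly the kind of ergonomics we hope to gain over a raw `numpy` array.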
| xarray Pros | xarray Cons |
| --- | --- |
| label-based indexing | not as widely known as `numpy`/`pandas` |
| `numpy`-like vectorisation and broadcasting | will require some learning for devs |
| `pandas`-like aggregation + groupby | |
| Dask integration for parallel computing | |
I'll give it a try and see if we can discover some unknown "cons" before we fully commit to it as a backend.
Would definitely recommend `xarray` over numpy `recarray`. If using this for prediction results only, then this should work great.
If using it for training data, I'd advise checking out some of the discussions in https://github.com/rly/ndx-pose/pull/9 for workflow-specific considerations. Basically, you may not want to over-optimize for timeseries since most annotation for pose is done in single images that are explicitly not consecutive in time.
> Would definitely recommend `xarray` over numpy `recarray`. If using this for prediction results only, then this should work great.
Thanks for the input! Most of the things we want to do will operate on the prediction results only. `movement` is meant for post-SLEAP/DLC analysis, meaning we use already predicted poses as the input.
Define custom classes for representing points (animal body parts) and series of points (animal trajectories) in space. These could be sub-classes of `np.record` and `np.recarray` respectively, to access fields (e.g. `'x'`, `'y'`, `'name'`, `'confidence'`) as attributes. This is the approach SLEAP takes. We could also directly use or subclass the SLEAP objects.