neuroinformatics-unit / movement

Python tools for analysing body movements across space and time
http://movement.neuroinformatics.dev
BSD 3-Clause "New" or "Revised" License

Implement I/O for parquet files #307

Open niksirbi opened 2 weeks ago

niksirbi commented 2 weeks ago

Is your feature request related to a problem? Please describe.

This is mainly to facilitate data exchange with @roaldarbol's animovement R package, which represents data in a tidy dataframe. See the previous discussion on Zulip.

Describe the solution you'd like

This would require implementing two new I/O functions:

The tidy dataframe could be a pandas version of the table that animovement uses as its primary data structure. Once we have that, we can rely on the existing pandas to_parquet method and read_parquet function for file I/O.
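A minimal sketch of what the two conversions could look like, using only numpy and pandas. The function names `to_tidy_df` and `from_tidy_df` are hypothetical, and the column layout (`time`, `individual`, `keypoint`, one column per spatial axis) is an assumption for illustration, not animovement's actual schema. A real movement dataset is an xarray object, so an implementation would more likely start from something like `Dataset.to_dataframe()`; a plain array stands in for it here.

```python
import numpy as np
import pandas as pd


def to_tidy_df(position, time, individuals, keypoints, space=("x", "y")):
    """Flatten a (time, individuals, keypoints, space) array into a tidy table.

    Hypothetical helper: one row per (time, individual, keypoint),
    one column per spatial axis.
    """
    index = pd.MultiIndex.from_product(
        [time, individuals, keypoints, space],
        names=["time", "individual", "keypoint", "space"],
    )
    long = pd.Series(position.reshape(-1), index=index, name="value")
    # Spread the spatial axis into columns so every variable is a column
    tidy = long.unstack("space").reset_index()
    tidy.columns.name = None
    return tidy


def from_tidy_df(tidy, space=("x", "y")):
    """Rebuild the array from a tidy table (assumes a complete grid of rows)."""
    keys = ["time", "individual", "keypoint"]
    tidy = tidy.sort_values(keys)
    coords = [np.sort(tidy[key].unique()) for key in keys]
    shape = tuple(len(c) for c in coords) + (len(space),)
    position = tidy[list(space)].to_numpy().reshape(shape)
    return position, coords


# Toy data: 3 frames, 2 individuals, 2 keypoints, 2 spatial dimensions
time = [0.0, 0.5, 1.0]
individuals = ["id_0", "id_1"]
keypoints = ["nose", "tail"]
position = np.arange(24, dtype=float).reshape(3, 2, 2, 2)

tidy = to_tidy_df(position, time, individuals, keypoints)
restored, coords = from_tidy_df(tidy)
```

With the table in this shape, file I/O would reduce to `tidy.to_parquet("poses.parquet")` and `pd.read_parquet("poses.parquet")`, both of which need a parquet engine such as pyarrow or fastparquet to be installed. Note that `from_tidy_df` returns coordinates in sorted order, so a real implementation would need to decide how to preserve the original ordering of individuals and keypoints.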

Describe alternatives you've considered

We could also consider wrapping the above functions into load_poses.from_animovement_file and to_animovement_file, which would do both things:

This is similar to how we handle DeepLabCut dataframes and files.

Additional context

Being able to convert movement datasets into a 2D "tidy" format makes it possible to save them in formats optimised for tables. Having the dataset in this form (where every variable is a column) also makes it easier to use certain plotting libraries, like seaborn.
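As a small illustration of why the tidy form is convenient, the sketch below builds a toy table by hand (the column names and values are made up for the example) and summarises it per keypoint with a single pandas groupby call.

```python
import pandas as pd

# Toy tidy table: one row per (time, keypoint), one column per variable
tidy = pd.DataFrame(
    {
        "time": [0.0, 0.0, 0.1, 0.1],
        "keypoint": ["nose", "tail", "nose", "tail"],
        "x": [1.0, 5.0, 2.0, 6.0],
        "y": [0.0, 0.0, 1.0, 1.0],
    }
)

# Mean position of each keypoint across all frames
mean_xy = tidy.groupby("keypoint")[["x", "y"]].mean()

# Plotting libraries with a long-form interface accept the same table,
# e.g. seaborn.lineplot(data=tidy, x="time", y="x", hue="keypoint")
```

Because every variable is a column, grouping, filtering, and mapping columns to plot aesthetics all work without reshaping the data first.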