Closed niksirbi closed 3 weeks ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 99.68%. Comparing base (426003c) to head (f99d7d8).
Thanks for the review @sfmig, I like all your suggestions and will implement them here. The scope of this PR will increase to "refactoring load_poses module" and I will edit the PR title and description accordingly.
Issues
0 New issues
0 Accepted issues
Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code
@sfmig I've updated this PR's description and title. I think there is no need to go line-by-line through the diff again, just let me know whether you agree with the changes as I've described them in the updated PR description.
Looks fantastic @niksirbi 🚀 And also great recap of the renaming bits, thanks! Will merge now
Description
What is this PR
Why is this PR needed?
The public functions in the `load_poses.py` module currently assume that users are always loading data from a file (or from a DeepLabCut-style pandas dataframe). However, there are some use-cases where the data are already in Python, in the form of numpy arrays, perhaps imported with custom loaders (this is not hypothetical, a potential user has already asked for it). There is a way to convert such data into a properly-formatted `movement` dataset, but this way is not easy to find and is not documented.

What does this PR do?
Adds a `from_numpy()` function that explicitly accepts `position` (+ optional `confidence`) data in the form of numpy arrays and returns a `movement` dataset. Under the hood it calls the `ValidPosesDataset` validator and the existing `_from_valid_data()` utility.

The addition of this function enabled me to slightly refactor the `load_poses.py` module such that `from_numpy()` is the single point-of-entry into a `movement` dataset - i.e. every other loading function first reads data into numpy arrays before calling the new function. This was already de facto the case, but it's much more explicit now. Moreover, this refactoring also enabled me to get rid of a redundant validation call for `LightningPose` data.

Here's the schematic of the updated `load_poses.py` module. The previous version can be found here.

How has this PR been tested?
I added a simple unit test for the new function. The underlying `ValidPosesDataset` is already extensively tested, and so are all file loaders.

Is this a breaking change?
No.
Does this PR require an update to the documentation?
The API index has been updated accordingly. The new function's docstring also includes example usage.
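To illustrate the "single point-of-entry" idea described above, here is a minimal, self-contained sketch. It is not the code in this PR: `from_numpy_sketch` and `from_xy_text_sketch` are hypothetical names, the axis order (frames, individuals, keypoints, space) is an assumption made for the sketch, and a plain dict stands in for the `movement` dataset.

```python
import numpy as np


def from_numpy_sketch(position, confidence=None, fps=None):
    """Hypothetical single point of entry (a stand-in, not movement's API)."""
    position = np.asarray(position, dtype=float)
    # Assumed axis order for this sketch: (frames, individuals, keypoints, space).
    if position.ndim != 4:
        raise ValueError(f"expected a 4D position array, got {position.ndim}D")
    if position.shape[-1] not in (2, 3):
        raise ValueError("last axis must hold 2 or 3 spatial coordinates")
    if confidence is None:
        # Confidence is optional; fill with NaN so the layout stays uniform.
        confidence = np.full(position.shape[:-1], np.nan)
    else:
        confidence = np.asarray(confidence, dtype=float)
        if confidence.shape != position.shape[:-1]:
            raise ValueError("confidence must match position minus the space axis")
    # A plain dict stands in for the dataset the real function would return.
    return {"position": position, "confidence": confidence, "fps": fps}


def from_xy_text_sketch(text, fps=None):
    """Hypothetical loader: parse 'x,y' rows, then delegate to the entry point."""
    rows = [line.split(",") for line in text.strip().splitlines()]
    xy = np.array([[float(x), float(y)] for x, y in rows])
    # One individual and one keypoint: add the two missing axes.
    position = xy[:, np.newaxis, np.newaxis, :]
    return from_numpy_sketch(position, fps=fps)


# Data already in memory goes straight through the entry point.
ds = from_numpy_sketch(np.zeros((5, 2, 3, 2)), fps=30)
# A loader first reads into arrays, then calls the same entry point.
ds_from_text = from_xy_text_sketch("1.0,2.0\n3.0,4.0")
```

The point of the pattern is that validation lives in one place: any loader that produces arrays gets the same checks for free, which is roughly why a separate validation call in an individual loader becomes redundant.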
Checklist:
EDIT 2024-05-31
Following @sfmig's review, the scope of this PR expanded, resulting in a more thorough refactoring of IO-related modules: `load_poses.py`, `save_poses.py`, and `validators.py`. This mostly involved renaming functions and editing docstrings, to make the whole thing more logical and internally consistent.

These are the names of the updated public functions:

![Screenshot 2024-05-31 at 15 49 18](https://github.com/neuroinformatics-unit/movement/assets/20923448/da81c943-cb52-4045-af6c-d371b215b1cb)
Note that we renamed `from_dlc_df` to `from_dlc_style_df` (and likewise for save), because LightningPose also uses "DeepLabCut-style" dataframes.

We also decided to rename private functions such that it's clear what is being converted to what, e.g. `_ds_from_sleap_labels_file()` instead of `_load_from_sleap_labels_file()`. There is one remaining inconsistency, namely the fact that public functions start with `from_` while private functions start with `_ds_from_` or `_df_from_`. That's because public functions are invoked via the module name (i.e. prefixed with `load_poses.`), and `load_poses.ds_from_file` would be redundant. Perhaps there is scope for renaming `load_poses` to `load_dataset` (and `save_poses` to `save_dataset` accordingly), such that the syntax would be `movement.io.load_dataset.from_file()`. That could make more sense now, because "poses" is a bit ambiguous, while we've fully defined what a "dataset" is. I'll open an issue about that.

Here's the updated diagram for `movement`'s I/O functionalities.