neuroinformatics-unit / movement

Python tools for analysing body movements across space and time
http://movement.neuroinformatics.dev
BSD 3-Clause "New" or "Revised" License

Support video and frames files associated with sample data #171

Closed by niksirbi 1 month ago

niksirbi commented 2 months ago

Description

What is this PR

Why is this PR needed? Having video files, and/or frames extracted from those videos, associated with existing sample pose files will greatly facilitate the development and debugging of GUIs, because it will allow us to plot trajectories over a meaningful background, define ROIs, etc. See all the issues linked in References.

What does this PR do?

It overhauls the sample_data.py module to allow for the fetching of videos and/or frames alongside the fetching of pose files. All the changes were done in conjunction with changes to the data repository on GIN and should be interpreted together.

Changes to the data repository:

Changes to the code repository (this PR):

References

Closes #38. Closes #121 because the syntax is much less awkward now (with fewer redundancies), and I think there is no longer a clear need for rewriting the sample_data.py module into a class.

Facilitates #105, #49, #50, #48, #164.

How has this PR been tested?

Updated existing tests in test_sample_data.py.

Is this a breaking change?

Yes, the API for fetching sample datasets has changed. This PR needs to be merged ahead of any others, because the changes to the GIN data repository have broken CI, and it will remain broken until this is merged.

Does this PR require an update to the documentation?

Yes, I've updated the relevant sections of the docs.

Checklist:

EDIT 2024-05-07

Following @lochhh's suggestion, the metadata.yaml has been reformatted as a dict of dicts, using the pose file names as top-level dict keys:

"SLEAP_three-mice_Aeon_proofread.analysis.h5":
  sha256sum: "82ebd281c406a61536092863bc51d1a5c7c10316275119f7daf01c1ff33eac2a"
  source_software: "SLEAP"
  fps: 50
  species: "mouse"
  number_of_individuals: 3
  shared_by:
    name: "Chang Huan Lo"
    affiliation: "Sainsbury Wellcome Centre, UCL"
  frame:
    file_name: "three-mice_Aeon_frame-5sec.png"
    sha256sum: "889e1bbee6cb23eb6d52820748123579acbd0b2a7265cf72a903dabb7fcc3d1a"
  video:
    file_name: "three-mice_Aeon_video.avi"
    sha256sum: "bc7406442c90467f11a982fd6efd85258ec5ec7748228b245caf0358934f0e7d"
  note: "All labels were proofread (user-defined) and can be considered ground truth. It was exported from the .slp file with the same prefix."

This simplifies the logic inside sample_data.py quite a bit!
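To illustrate why keying the metadata by pose file name helps, here is a minimal sketch, not the actual code in sample_data.py. It assumes the YAML above has already been parsed into a plain Python dict (for example with a YAML loader), and the helper names `associated_files` and `build_registry` are hypothetical, introduced only for this example. It also assumes that a dataset without a frame or video simply omits that key or leaves its `file_name` empty.

```python
# Parsed form of the metadata.yaml entry shown above; fields that are
# not needed for the lookup (species, shared_by, note, ...) are omitted.
METADATA = {
    "SLEAP_three-mice_Aeon_proofread.analysis.h5": {
        "sha256sum": "82ebd281c406a61536092863bc51d1a5c7c10316275119f7daf01c1ff33eac2a",
        "source_software": "SLEAP",
        "fps": 50,
        "frame": {
            "file_name": "three-mice_Aeon_frame-5sec.png",
            "sha256sum": "889e1bbee6cb23eb6d52820748123579acbd0b2a7265cf72a903dabb7fcc3d1a",
        },
        "video": {
            "file_name": "three-mice_Aeon_video.avi",
            "sha256sum": "bc7406442c90467f11a982fd6efd85258ec5ec7748228b245caf0358934f0e7d",
        },
    },
}


def associated_files(metadata, pose_file):
    """Return {kind: file name} for a pose file and its frame/video, if any.

    Hypothetical helper: with pose file names as top-level keys, finding
    the associated files is a single dict lookup.
    """
    entry = metadata[pose_file]
    files = {"poses": pose_file}
    for kind in ("frame", "video"):
        # Assumption: a missing or empty entry means "no such file".
        file_name = (entry.get(kind) or {}).get("file_name")
        if file_name:
            files[kind] = file_name
    return files


def build_registry(metadata):
    """Flatten the metadata into {file name: sha256} for every listed file.

    Hypothetical helper: a flat mapping like this is the kind of input a
    downloader with checksum verification might expect.
    """
    registry = {}
    for pose_file, entry in metadata.items():
        registry[pose_file] = entry["sha256sum"]
        for kind in ("frame", "video"):
            sub = entry.get(kind) or {}
            if sub.get("file_name"):
                registry[sub["file_name"]] = sub["sha256sum"]
    return registry


files = associated_files(METADATA, "SLEAP_three-mice_Aeon_proofread.analysis.h5")
registry = build_registry(METADATA)
```

With a list-based layout, the same lookup would need a search over all entries for the matching pose file; here the file name is the key, so no search is required.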

codecov[bot] commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 99.67%. Comparing base (a30e796) to head (2551b30).

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #171   +/-   ##
=======================================
  Coverage   99.66%   99.67%
=======================================
  Files          10       10
  Lines         605      619   +14
=======================================
+ Hits          603      617   +14
  Misses          2        2
```


niksirbi commented 1 month ago

Thanks a lot @lochhh! I like your suggestion to use the pose file names as keys, it indeed simplifies things a lot. I will give it a try.

sonarcloud[bot] commented 1 month ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

niksirbi commented 1 month ago

> Thanks a lot @lochhh! I like your suggestion to use the pose file names as keys, it indeed simplifies things a lot. I will give it a try.

I've implemented this and it works! I've added this as an "EDIT" to the PR's description.