Make the hdf5 videos store as int8 format

lambdaloop commented 1 year ago

Description

I am working on a way to read SLP files in a web browser, so that we can have a web annotation tool. I noticed that the files almost load with h5wasm, but loading breaks because the video frames are stored with an object dtype.

I edited the HDF5 video storage to store images directly in int8 format. I do this by padding the encoded images with zeros so that they all have the same size. This is backwards compatible with the previous loading code (cv2.imdecode handles trailing zeros gracefully).
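The padding step described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `pad_encoded_frames` is a hypothetical helper, and the byte strings are stand-ins for the buffers an image encoder would produce.

```python
import numpy as np


def pad_encoded_frames(encoded_frames):
    """Pack variable-length encoded frames into one zero-padded 2D int8 array.

    Hypothetical helper for illustration; not taken from sleap/io/video.py.
    """
    # The array width is the length of the longest encoded frame.
    max_size = max((len(buf) for buf in encoded_frames), default=0)
    padded = np.zeros((len(encoded_frames), max_size), dtype="int8")
    for i, buf in enumerate(encoded_frames):
        # Reinterpret the raw bytes as int8 and copy them into the row prefix;
        # the rest of the row stays zero.
        padded[i, : len(buf)] = np.frombuffer(buf, dtype="int8")
    return padded


# Stand-in byte strings of different lengths (real input would be encoded images).
frames = [b"\x89PNG-a", b"\x89PNG-bbbb", b"\x89PNG"]
padded = pad_encoded_frames(frames)
```

Each row then holds one frame's bytes followed by zeros. PNG and JPEG streams carry their own end markers, so a decoder can in principle stop before the padding, which matches the behavior reported here for cv2.imdecode.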

~~It does make the files 2x bigger. I wonder if we could just enable gzip compression on the frames though?~~ Testing showed that file sizes are actually comparable; I was just not properly erasing the old dataset in my tests. So as far as I can tell, there really are no downsides to this change.

The included tests in tests/io already cover this code change.

Types of changes

[x] Bugfix
[ ] New feature
[ ] Refactor / Code style update (no logical changes)
[ ] Build / CI changes
[ ] Documentation Update
[ ] Other (explain)

Does this address any currently open issues?

None that I know.

Outside contributors checklist

[x] Review the guidelines for contributing to this repository
[x] Read and sign the CLA and add yourself to the authors list
[x] Make sure you are making a pull request against the develop branch (not main). Also you should start your branch off develop
[x] Add tests that prove your fix is effective or that your feature works
[x] Add necessary documentation (if appropriate)

Thank you for contributing to SLEAP!

:heart:

Summary by CodeRabbit

New Feature: Enhanced video handling in the SLEAP application. The update ensures consistent data storage and improved handling of video frames in the HDF5 file format. This results in more reliable video data processing and analysis.
Documentation: Added Lili Karashchuk from the Allen Institute for Neural Dynamics to the list of authors. This recognizes her contributions to the project.

Please note, these changes are under the hood and won't affect the user interface or the way you interact with the software. However, they will contribute to the overall performance and reliability of the application.

coderabbitai[bot] commented 1 year ago

Walkthrough

The changes primarily focus on the to_hdf5 function in sleap/io/video.py. The function now pads the encoded frames with zeros so that they can be stored in a fixed-width int8 dataset in the HDF5 file. It creates the dataset with a shape that accommodates the largest encoded frame, then writes each frame into its row using slicing. Additionally, a new author, Lili Karashchuk, has been added to the AUTHORS file.
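A rough sketch of that write pattern with h5py is shown here. It is an illustration under stated assumptions rather than the actual `to_hdf5` body: the dataset name `video`, the in-memory file, and the placeholder byte strings are all made up for the example.

```python
import h5py
import numpy as np

# Placeholder byte strings standing in for encoded frames (assumption).
encoded = [
    b"\xff\xd8frame-0\xff\xd9",
    b"\xff\xd8f1\xff\xd9",
    b"\xff\xd8frame-two\xff\xd9",
]
max_size = max(len(buf) for buf in encoded)

# In-memory HDF5 file so the sketch leaves nothing on disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    # One fixed-width row per frame, wide enough for the largest frame.
    dset = f.create_dataset(
        "video", shape=(len(encoded), max_size), dtype="int8"
    )
    for i, buf in enumerate(encoded):
        # Slice assignment fills the row prefix; the remainder keeps the
        # dataset's default fill value of zero.
        dset[i, : len(buf)] = np.frombuffer(buf, dtype="int8")

    stored = dset[...]  # read everything back as a NumPy array
```

Because every row has the same width and a plain integer dtype, the dataset avoids variable-length or object types, which is the property the PR description says h5wasm needs.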

Changes

| File | Summary |
|---|---|
| `AUTHORS` | Added Lili Karashchuk from the Allen Institute for Neural Dynamics to the list of authors. |
| `sleap/io/video.py` | Updated `to_hdf5` function to pad frames with zeros for `int8` data type consistency in the HDF5 file. The function now creates a dataset with a shape that accommodates the maximum frame size among all frames. The frames are then stored in the dataset using slicing. |

🐇💻

In the land of code, where the rabbits play,

Changes were made, in a clever way.

Frames padded with zeroes, in a neat array,

And a new author added, to brighten the day! 🎉

Tips

### Chat with CodeRabbit Bot (`@coderabbitai`)

- Mention `@coderabbitai` in any *review comment* for bot assistance.
- Note: Review comments are made on code diffs or files, not on the PR overview.

### Pause Incremental Reviews

- Insert `@coderabbitai: ignore` in the PR description to halt the bot's ongoing reviews. Remove the line to resume.
- Additionally, enter `@coderabbitai pause` to halt the bot's ongoing reviews as any issue or review comment. To resume reviews, enter `@coderabbitai resume` as any issue or review comment.

lambdaloop commented 1 year ago

Here is my signed CLA: sleap-cla-signed-lili.pdf

codecov[bot] commented 1 year ago

Codecov Report

Merging #1559 (95aa144) into develop (1e0627a) will increase coverage by 0.00%. The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           develop    #1559   +/-   ##
========================================
  Coverage    73.37%   73.37%           
========================================
  Files          134      134           
  Lines        23976    23979    +3     
========================================
+ Hits         17592    17595    +3     
  Misses        6384     6384

| Files | Coverage Δ | |
|---|---|---|
| sleap/io/video.py | `92.08% <100.00%> (+0.03%)` | :arrow_up: |


talmolab / sleap