Open aaprasad opened 5 months ago
Alternatively, we should stick to `imageio` only and remove the usage of `PIL.Image` and `skimage.io`, because `imageio` seems to support TIFF files. We should also not open files on every `__getitem__` call. Just store the readers in `__init__` and, if necessary, add a `__del__` method to handle file closing.
@talmo thoughts?
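To make the "open once, close on teardown" idea concrete, here is a minimal sketch. `FrameDataset` and the `open_reader` factory are hypothetical names for illustration, not existing dreem or `imageio` API; the factory is assumed to return an object with `get_frame(index)` and `close()` methods.

```python
class FrameDataset:
    """Hypothetical dataset that opens each reader once and reuses it."""

    def __init__(self, video_paths, open_reader):
        # open_reader is an assumed factory: path -> reader object exposing
        # get_frame(index) and close(). Readers are created once, here.
        self.readers = [open_reader(path) for path in video_paths]

    def __getitem__(self, key):
        # key is a (video index, frame index) pair; no file is opened here.
        video_idx, frame_idx = key
        return self.readers[video_idx].get_frame(frame_idx)

    def close(self):
        # Close every reader and drop the references so close() is idempotent.
        for reader in getattr(self, "readers", []):
            reader.close()
        self.readers = []

    def __del__(self):
        # Best-effort cleanup; never raise during garbage collection.
        try:
            self.close()
        except Exception:
            pass
```

One caveat with this design: with multi-process data loading, open file handles may not transfer cleanly to worker processes, so opening lazily on first access in each worker is a common variant.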
Multiple images are now supported as of https://github.com/talmolab/sleap-io/pull/88, so you can do:

    import sleap_io as sio
    video = sio.load_video(["img1.jpg", "img2.jpg", "img3.jpg"])
The supported extensions are manually whitelisted by backend type (and used to route the high-level API to the appropriate backend):
- `MediaVideo`: ("mp4", "avi", "mov", "mj2", "mkv")
- `HDF5Video`: ("h5", "hdf5", "slp")
- `ImageVideo`: ("png", "jpg", "jpeg", "tif", "tiff", "bmp")
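As a rough illustration of how an extension whitelist like this can route a filename to a backend, here is a small sketch. The extension tuples are copied from the list above; `BACKEND_EXTENSIONS` and `guess_backend` are hypothetical names, and this is not sleap-io's actual routing code.

```python
from pathlib import Path

# Extension whitelist per backend, as listed in this thread.
BACKEND_EXTENSIONS = {
    "MediaVideo": ("mp4", "avi", "mov", "mj2", "mkv"),
    "HDF5Video": ("h5", "hdf5", "slp"),
    "ImageVideo": ("png", "jpg", "jpeg", "tif", "tiff", "bmp"),
}


def guess_backend(filename):
    """Return the backend name whose whitelist contains the file extension."""
    if isinstance(filename, (list, tuple)):
        # Assumption: a list of image paths is routed by its first entry.
        if not filename:
            raise ValueError("empty list of filenames")
        filename = filename[0]
    ext = Path(filename).suffix.lower().lstrip(".")
    for backend, extensions in BACKEND_EXTENSIONS.items():
        if ext in extensions:
            return backend
    raise ValueError(f"unsupported extension: {ext!r}")
```

Note that a lookup by extension alone cannot tell a single-image `.tif` from a TIFF stack, which is the ambiguity discussed next.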
`ImageVideo` currently does not support TIFF stacks, and it's a bit tricky since you can have single-image TIFFs as well. I think the `MediaVideo` reader maybe makes more sense for TIFF stacks when using `imageio` as the backend, but it might get hairy when using the other ones (it falls back to `cv2` or `pyav` if available).
If you'd like to implement a TIFF stack video backend (or add it to one of the others), this would be a good addition to `sleap-io`.
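One possible way to tell a single-image TIFF from a stack before choosing a backend is to count the top-level image file directories (IFDs) in the file. The sketch below does this with only the standard library; `count_tiff_pages` and `is_tiff_stack` are hypothetical helpers, not part of sleap-io, and they assume classic TIFF (magic number 42), not BigTIFF.

```python
import struct


def count_tiff_pages(path):
    """Count top-level IFDs (pages) in a classic, non-BigTIFF TIFF file."""
    with open(path, "rb") as f:
        header = f.read(8)
        if len(header) < 8:
            raise ValueError("not a TIFF file: header too short")
        # Bytes 0-1 give the byte order: "II" little-endian, "MM" big-endian.
        if header[:2] == b"II":
            endian = "<"
        elif header[:2] == b"MM":
            endian = ">"
        else:
            raise ValueError("not a TIFF file: bad byte-order mark")
        # Bytes 2-3 hold the magic number, bytes 4-7 the first IFD offset.
        magic, offset = struct.unpack(endian + "HI", header[2:8])
        if magic != 42:
            raise ValueError("unsupported TIFF variant (magic != 42)")
        pages = 0
        seen = set()  # guards against cyclic IFD chains in corrupt files
        while offset != 0 and offset not in seen:
            seen.add(offset)
            f.seek(offset)
            raw = f.read(2)
            if len(raw) < 2:
                break  # truncated file: stop at the last readable IFD
            pages += 1
            (n_entries,) = struct.unpack(endian + "H", raw)
            # Skip the 12-byte entries; the next IFD offset follows them.
            f.seek(offset + 2 + 12 * n_entries)
            raw = f.read(4)
            if len(raw) < 4:
                break
            (offset,) = struct.unpack(endian + "I", raw)
        return pages


def is_tiff_stack(path):
    """True if the file holds more than one page."""
    return count_tiff_pages(path) > 1
```

Some writers store thumbnails or pyramid levels as extra pages, so a page count above one is only a hint that the file is a frame stack.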
Everything else should be supported though.
Also relevant: https://github.com/talmolab/dreem/pull/64#issuecomment-2244185625
If you want to consider delegating video I/O to `sleap-io`, here are some hints:

    import sleap_io as sio

    video = sio.load_video("video.mp4")  # opens video handle and keeps it open
    video = sio.load_video("video.mp4", keep_open=False)  # will close the reader after each call to read a frame
    video = sio.load_video("video.mp4", backend="opencv")  # see note below on backends

    # Explicitly control reader state:
    video.close()  # closes backend but remembers state of the settings (hopefully)
    video.open()  # opens backend reader again, remembering state
    video.is_open  # checks the state if you want to have custom logic (calling open on already open videos is a noop anyway though)
On backend options:
- "opencv": Fastest video reader, but only supports a limited number of codecs and may not be able to read some videos. It requires `opencv-python` to be installed. It is the fastest because it uses the OpenCV C++ library to read videos, but is limited by the version of FFMPEG that was linked into it at build time as well as the OpenCV version used.
- "FFMPEG": Slowest, but most reliable. This is the default backend. It requires `imageio-ffmpeg` and an `ffmpeg` executable on the system path (which can be installed via conda). The `imageio` plugin for FFMPEG reads frames into raw bytes which are communicated to Python through STDOUT on a subprocess pipe, which can be slow. However, it is the most reliable and feature-complete. If you install the conda-forge version of ffmpeg, it will be compiled with support for many codecs, including GPU-accelerated codecs like NVDEC for H264 and others.
- "pyav": Supports most codecs that FFMPEG does, but not as complete or reliable of an implementation in `imageio` as FFMPEG for some video types. It is faster than FFMPEG because it uses the `av` package to read frames directly into numpy arrays in memory without the need for a subprocess pipe. These are Python bindings for the C library libav, which is the same library that FFMPEG uses under the hood.
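If the calling code wants to pick a backend up front based on what is installed, something like the following sketch could work. The backend names come from the list above; the module names (`cv2`, `av`, `imageio_ffmpeg`) are the import names of the corresponding packages. `pick_backend` and its fastest-first ordering are assumptions for illustration and do not reproduce sleap-io's own fallback logic.

```python
import importlib.util

# Backend name -> Python module that must be importable for it to work.
BACKEND_MODULES = {
    "opencv": "cv2",
    "pyav": "av",
    "FFMPEG": "imageio_ffmpeg",
}
# Assumed preference order: fastest first, per the descriptions above.
SPEED_ORDER = ("opencv", "pyav", "FFMPEG")


def module_installed(name):
    """Check whether a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_backend(preferred=None, is_installed=module_installed):
    """Return the preferred backend if usable, else the fastest available."""
    if preferred is not None and preferred not in BACKEND_MODULES:
        raise ValueError(f"unknown backend: {preferred!r}")
    candidates = [preferred] if preferred else []
    candidates += [b for b in SPEED_ORDER if b != preferred]
    for backend in candidates:
        # Note: "FFMPEG" also needs an ffmpeg executable on the system
        # path, which this module-level check does not verify.
        if is_installed(BACKEND_MODULES[backend]):
            return backend
    raise RuntimeError("no video backend is installed")
```

The result can then be passed along as `sio.load_video("video.mp4", backend=pick_backend())`.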
One more factor: there's WIP over in `sleap-nn` on a performant video reader for inference, which we'll upstream to `sleap-io` at some point to be reused here.
Another thought is to switch over to `decord` + kornia since they read directly into tensors: https://github.com/talmolab/dreem/issues/72
Currently we use `imageio` and `skimage.io` for video reading. However, `sleap-io` now supports video reading. We should look into swapping out these backends to rely only on `sleap-io` where possible. We can definitely do this with the `SleapDataset`, but we may need to check whether `sleap-io` supports both multi-image and single-image `.tif` files.