gcanat / video_reader-rs

A library to fastly decode video with ffmpeg and rust
MIT License

Improve Support for Iterative Reading of High-Res Video #16

Open andresenwc opened 1 month ago

andresenwc commented 1 month ago

This package has been useful to me already. Thank you very much for your work!

My particular use case involves iterative processing of high resolution video.

Decord fails for me here because of the apparent memory leak described here, which ultimately led me to explore this library as an alternative.

Unfortunately, this library falls short for me as well, because this sort of reading seems to be quite slow (much slower than Decord). I noted the discussion in #5, and in particular the approach in this comment, which outlines the development needed to support this use case.

I wanted to follow up on any plans to support this use case, and at the very least document / track it in this issue.

gcanat commented 1 month ago

Hello, thank you for your interest in this library. Have you tried using the decode() function instead? You can specify start_frame and end_frame. Something like this:

import video_reader as vr
from tqdm import tqdm

filename = "/path/to/file.mp4"
chunk_size = 500
(n, h, w) = vr.get_shape(filename)

for i in tqdm(range(0, n, chunk_size)):
    end = min(i + chunk_size, n)
    frames = vr.decode(filename, start_frame=i, end_frame=end)
    # do whatever you need with `frames` afterwards

It is still not as fast as it could be, but it should be a big improvement compared to using get_batch().

Also, it is worth noting that the main bottleneck in this use case (i.e. high-res videos) is converting the raw frame into an ndarray. You might be interested in trying the last snippet of code here, described in discussion #5.
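
For frame-by-frame processing, the chunked loop above can be wrapped in a generator so that only one chunk is held in memory at a time. This is a minimal sketch, not part of the library: chunk_ranges and iter_frames are hypothetical helper names, and the decoder is passed in as a callable so the sketch does not depend on any particular video_reader signature. It follows the snippet above in treating end_frame as the exclusive end of the range, which is an assumption.

```python
from typing import Callable, Iterable, Iterator, Tuple


def chunk_ranges(n_frames: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) index pairs that cover [0, n_frames) in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_frames, chunk_size):
        yield start, min(start + chunk_size, n_frames)


def iter_frames(
    decode_chunk: Callable[[int, int], Iterable],
    n_frames: int,
    chunk_size: int = 500,
) -> Iterator:
    """Yield frames one by one, decoding a single chunk at a time.

    decode_chunk(start, end) is a caller-supplied function (hypothetical
    here) that returns the frames for that index range.
    """
    for start, end in chunk_ranges(n_frames, chunk_size):
        for frame in decode_chunk(start, end):
            yield frame
```

With the library, decode_chunk could be something like `lambda s, e: vr.decode(filename, start_frame=s, end_frame=e)`, with n_frames taken from vr.get_shape(filename) as in the snippet above.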

NevermindNilas commented 1 month ago

For simple iterative stuff, something like this would better suit the use case. For now I've decided to stick with simply calling FFmpeg through a subprocess because of the flexibility it offers, though I may switch as well once the bindings are done.
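
As a rough illustration of the subprocess approach, the sketch below pipes raw RGB frames out of the ffmpeg command-line tool and yields them one at a time. It is only a sketch under assumptions: the helper names (build_ffmpeg_cmd, frame_nbytes, iter_raw_frames) are hypothetical, the ffmpeg executable is assumed to be on PATH, and the frame width and height must be known in advance (for example from vr.get_shape()).

```python
import subprocess
from typing import Iterator, List


def build_ffmpeg_cmd(path: str, ffmpeg: str = "ffmpeg") -> List[str]:
    """Command that decodes `path` and writes raw rgb24 frames to stdout."""
    return [
        ffmpeg, "-v", "error",
        "-i", path,
        "-f", "rawvideo",
        "-pix_fmt", "rgb24",
        "pipe:1",
    ]


def frame_nbytes(width: int, height: int, channels: int = 3) -> int:
    """Size in bytes of one packed 8-bit frame."""
    return width * height * channels


def iter_raw_frames(path: str, width: int, height: int) -> Iterator[bytes]:
    """Yield each decoded frame as a bytes object of length width*height*3."""
    nbytes = frame_nbytes(width, height)
    proc = subprocess.Popen(build_ffmpeg_cmd(path), stdout=subprocess.PIPE)
    try:
        while True:
            buf = proc.stdout.read(nbytes)
            if len(buf) < nbytes:  # end of stream (or a truncated last frame)
                break
            yield buf
    finally:
        proc.stdout.close()
        if proc.poll() is None:  # consumer stopped early: stop ffmpeg too
            proc.terminate()
        proc.wait()
```

Each yielded buffer can be viewed as an array without copying, e.g. `np.frombuffer(buf, dtype=np.uint8).reshape(height, width, 3)`.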

gcanat commented 1 month ago

Might be worth mentioning that this ffmpy lib seems to be Windows-only for now.

gcanat commented 2 weeks ago

Hello, I've pushed an update that should improve decoding speed by 2.5x to 3x. Run pip install --upgrade video-reader-rs and try replacing vr.decode() with vr.decode_fast() in the code snippet above. Please note that decode_fast() returns a list of np.ndarray, i.e. a list of frames each with shape (H, W, C), whereas decode() returns a single np.ndarray with shape (N, H, W, C).

Let me know how it works out for you ;-)
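
If downstream code expects the (N, H, W, C) layout that decode() returns, a list of (H, W, C) frames can be stacked with NumPy. The sketch below uses synthetic frames in place of a real decode_fast() call, and frames_to_array is a hypothetical helper name.

```python
import numpy as np


def frames_to_array(frames):
    """Stack a list of (H, W, C) frames into one (N, H, W, C) array."""
    return np.stack(frames, axis=0)


# Synthetic stand-in for the list returned by decode_fast().
h, w, c = 4, 6, 3
fake_frames = [np.zeros((h, w, c), dtype=np.uint8) for _ in range(5)]

batch = frames_to_array(fake_frames)
print(batch.shape)  # (5, 4, 6, 3)
```

Note that np.stack copies the data, so for high-resolution video it may be cheaper to iterate over the list directly when a single array is not required.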

NevermindNilas commented 2 weeks ago

import video_reader as vr
from tqdm import tqdm

filename = r"C:\Users\User\Downloads\file_example_MP4_1920_18MG.mp4"
chunk_size = 500
(n, h, w) = vr.get_shape(filename)

for i in tqdm(range(0, n, chunk_size)):
    end = min(i + chunk_size, n)
    frames = vr.decode_fast(filename, start_frame=i, end_frame=end)

Looks like DLL Load is still an issue.

gcanat commented 2 weeks ago

I don't have a Windows machine, so I cannot test. I would suggest trying to build from source:

  1. Activate a python virtualenv
  2. pip install maturin
  3. git clone https://github.com/gcanat/video_reader-rs.git
  4. Inside the video_reader-rs directory, run: maturin develop -r