unbuffered reads from ffmpeg

hmaarrfk commented 5 years ago

I don't have time to test and to work on this much, but I think that the code to read in data from ffmpeg can be optimized slightly.

The inefficiencies probably stem from the call to read_n_bytes, which reads the data into a string (an immutable type) and then converts it to a numpy buffer (by copying the memory).
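To make the extra copy concrete, here is a minimal sketch contrasting the two read patterns. It is not the imageio code: `read_frame_copy` and `read_frame_into` are hypothetical helper names, an `io.BytesIO` stands in for the ffmpeg pipe, and `readinto` is a different technique from the `np.fromfile` call used in the patch below. It only illustrates where the second copy comes from and how it could be avoided.

```python
import io

import numpy as np


def read_frame_copy(stream, framesize):
    """Two-step pattern: read immutable bytes, then copy into an array."""
    data = stream.read(framesize)  # first copy: pipe -> bytes object
    # np.frombuffer gives a read-only view of the bytes; .copy() makes it
    # writable at the cost of a second full-frame copy.
    return np.frombuffer(data, dtype=np.uint8).copy()


def read_frame_into(stream, out):
    """Fill a preallocated array in place; returns the number of bytes read."""
    view = memoryview(out).cast("B")  # flat, writable byte view of the array
    filled = 0
    while filled < len(view):
        got = stream.readinto(view[filled:])
        if not got:  # 0 (or None) means no more data is available
            break
        filled += got
    return filled


if __name__ == "__main__":
    payload = bytes(range(12))  # stand-in for one tiny 2x2x3 frame
    frame_a = read_frame_copy(io.BytesIO(payload), len(payload))

    frame_b = np.empty(len(payload), dtype=np.uint8)
    read_frame_into(io.BytesIO(payload), frame_b)

    print(np.array_equal(frame_a, frame_b))
```

Whether the in-place variant is measurably faster would depend on frame size and on how the pipe is buffered, so it would need benchmarking on real ffmpeg output.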

For my 1048x1328x3 frames, the change sped up reading in a tight loop from 283 fps to 293 fps.

Marginal gain, but maybe somebody needs it. Maybe this can make a bigger difference depending on the workload/hardware/decoding process.

Here is a rough sketch of the patch. You need to set the input to unbuffered; see the numpy bug report linked below.

Patch skeleton

```diff
diff --git a/imageio/plugins/ffmpeg.py b/imageio/plugins/ffmpeg.py
index 83f9a9f..5d81546 100644
--- a/imageio/plugins/ffmpeg.py
+++ b/imageio/plugins/ffmpeg.py
@@ -471,7 +471,8 @@ class FfmpegFormat(Format):
             # For Windows, set `shell=True` in sp.Popen to prevent popup
             # of a command line window in frozen applications.
             self._proc = sp.Popen(
-                cmd, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE, shell=ISWIN
+                cmd, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE, shell=ISWIN,
+                bufsize=0,
             )
 
             # Create thread that keeps reading from stderr
@@ -611,7 +612,9 @@ class FfmpegFormat(Format):
             w, h = self._meta["size"]
             framesize = self._depth * w * h * self._bytes_per_channel
             assert self._proc is not None
-
+            s = np.fromfile(self._proc.stdout, dtype=np.uint8,
+                            count=framesize)
+            return s, True
             try:
                 # Read framesize bytes
                 if self._frame_catcher:  # pragma: no cover - camera thing
@@ -644,8 +647,9 @@ class FfmpegFormat(Format):
             w, h = self._meta["size"]
             # t0 = time.time()
             s, is_new = self._read_frame_data()
-            result = np.frombuffer(s, dtype=self._dtype).copy()
-            result = result.reshape((h, w, self._depth))
+            result = s.reshape(h, w, self._depth)
+            # result = np.frombuffer(s, dtype=self._dtype).copy()
+            # result = result.reshape((h, w, self._depth))
             # t1 = time.time()
             # print('etime', t1-t0)
```

I'm not too sure of the other performance implications of buffered vs unbuffered reads. Investigating that will take more time than I have. Maybe other parts of the code can benefit from the same kind of change.

https://github.com/numpy/numpy/issues/12309

almarklein commented 5 years ago

Note that after imageio/imageio#425 most changes will apply to imageio_ffmpeg. But I can imagine you need some (numpy) magic in imageio as well, so leaving the issue here for now.

almarklein commented 4 years ago

Transferred this issue from imageio to imageio-ffmpeg. Could this be as simple as adding a bufsize arg to our functions?
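For reference, a minimal sketch of what passing `bufsize` through to `subprocess.Popen` could look like. This is not the imageio-ffmpeg API: `spawn` and `read_exactly` are hypothetical helpers, and a small Python child process stands in for ffmpeg so the snippet runs anywhere. One consequence worth noting: with `bufsize=0` the pipe is unbuffered and, per the `subprocess` documentation, reads can return short, so the reading side has to loop until it has a full frame.

```python
import subprocess
import sys


def spawn(cmd, bufsize=-1):
    """Hypothetical wrapper that forwards bufsize to Popen (-1 is the default)."""
    return subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=bufsize)


def read_exactly(stream, nbytes):
    """Read nbytes from stream, tolerating short reads from an unbuffered pipe."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # EOF before the full frame arrived
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def run_demo(bufsize):
    """Spawn a child that writes 16384 bytes and read them back."""
    # Stand-in for ffmpeg: emits a fixed byte pattern on stdout.
    child = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.buffer.write(bytes(range(256)) * 64)",
    ]
    proc = spawn(child, bufsize=bufsize)
    try:
        return read_exactly(proc.stdout, 256 * 64)
    finally:
        proc.stdout.close()
        proc.wait()


if __name__ == "__main__":
    unbuffered = run_demo(bufsize=0)
    buffered = run_demo(bufsize=-1)
    print(len(unbuffered), unbuffered == buffered)
```

Both settings deliver the same bytes; the open question from the discussion above is how `bufsize=0` interacts with numpy's reading functions and with throughput.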

hmaarrfk commented 4 years ago

I tried a while back. Not too sure why it didn't work.

It may have been something to do with numpy, to be honest. I had a PR in flight with them that I never got the tests to pass for.

hmaarrfk commented 1 year ago

I'm going to close this, as I think that users who require this additional performance from Python should likely be looking at the new pyav plugin.

imageio / imageio-ffmpeg

unbuffered reads from ffmpeg #32