ros-misc-utilities / ffmpeg_image_transport

ROS2 image transport plugin for encoding/decoding with h264 codec
Apache License 2.0

decoder can introduce additional latencies of multiple frames #25

Closed berndpfrommer closed 7 months ago

berndpfrommer commented 7 months ago

The ffmpeg encoder often does not emit the packet with the encoded image until the next frame, or even the frame after that, has been fed into the encoder. This means the compression introduces a lag of 1-2 frames (beyond the lag from the encoding itself). See this issue filed against the flir camera driver.

berndpfrommer commented 7 months ago

There is some documentation here on how to flush the encoder (send a null frame, then drain). This could potentially be used to reduce the latency, but you have to check whether you can continue using the encoder after having flushed it, and whether this works with all supported encoders. Further, I'd expect there is a reason for the 2-frame buffering, i.e. flushing after 0 or 1 frames probably worsens the compression ratio.

buckleytoby commented 7 months ago

This was all done on preset: 'fast'. I set up debug mode for the node and inspected some values when a packet successfully drains. Checking the `frameCnt_` value when `avcodec_receive_packet` starts returning 0: it begins at 3 (4 frames in) and then returns 0 for every subsequent frame. At frame 3, the `pk.pts` value is 0, meaning the first packet received from the encoder corresponds to the first frame.

I tested different presets, taken from this list: link

- Preset `ll`: first drain at `frameCnt_` 3
- Preset `p1`: first drain at `frameCnt_` 3
- Tune (any value 1-4): still first drains at `frameCnt_` 3

I also looked at `max_b_frames` (docs ref). `codecContext->max_b_frames` yielded 0, so I would expect the max delay to be the framerate + 1.

I'm unsure if the issue is a settings one or a code one.

buckleytoby commented 7 months ago

Some additional references, from here:

> "The code in FFmpeg's nvenc.c waits until this number of extra frames have been buffered before emitting any and is intended to support parallel encoding. The problem is that its default value is INT_MAX which later gets reduced down to the number of NVENC surfaces initialised minus one, which itself has an initial value of 4 in this scenario"

here

and from here:

```
-delay <int>   E..V...... Delay frame output by the given amount of frames (from 0 to INT_MAX) (default INT_MAX)
```

I've added `setAVOption("delay", "0");` and am testing it now.

buckleytoby commented 7 months ago

Success! With a delay of 0, the first `frameCnt_` is 0, the same as `pk.pts`.

`ros2 topic delay` data on a 4k image, preset fast, delay of 0 (master branch):

- 60 FPS: avg delay 47 ms (dropping 40% of frames)
- 40 FPS: avg delay 41 ms (dropping 8% of frames)
- 20 FPS: avg delay 27 ms
- 10 FPS: avg delay 27 ms

`ros2 topic delay` data on a 4k image, preset fast, delay of 0 (cuda_conversion branch):

- 60 FPS: avg delay 13 ms (0 dropped frames)
- 40 FPS: avg delay 13 ms (0 dropped frames)
- ...

PR incoming shortly