ros-misc-utilities / ffmpeg_image_transport

ROS2 image transport plugin for encoding/decoding with h264 codec
Apache License 2.0

hevc decoder Could not find ref with POC #26

Closed buckleytoby closed 6 months ago

buckleytoby commented 6 months ago

Using ros2-iron, ubuntu 22.04, hevc_nvenc, branch cuda_conversion, ffmpeg_image_transport encoder

I run rqt_image_view and then, as instructed, set the decoder mapping with: ros2 param set <name_of_your_viewer_node> ffmpeg_image_transport.map.hevc_nvenc hevc

At 20 FPS there are no problems and no artifacts. However, once I increase the frame rate to 45 FPS, I get the following error:

[hevc @ 0x7f9978a8c9c0] Could not find ref with POC 13
[hevc @ 0x7f9978a8c9c0] Could not find ref with POC 0
[hevc @ 0x7f9978a8c9c0] Could not find ref with POC 3
[hevc @ 0x7f9978a8c9c0] Could not find ref with POC 7
[hevc @ 0x7f9978a8c9c0] Could not find ref with POC 11
[hevc @ 0x7f9978a8c9c0] Could not find ref with POC 1

The errors come with streaking artifacts in the video stream. According to this post, this happens when the decoder buffer fills up faster than the decoder can decode packets.

Is this a limitation of the ffmpeg_decoder?

Not sure how accurate this is, but I profiled a different GPU decoding hw frames and it took only ~1.5 ms per packet, so that should be more than enough to handle 45 FPS.
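The arithmetic behind that estimate can be checked with a short sketch. It assumes the ~1.5 ms figure measured on the other GPU carries over and that decoding is the only per-frame cost; the function names are made up for illustration.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def decoder_utilization(decode_ms: float, fps: float) -> float:
    """Fraction of each frame interval spent decoding (1.0 means saturated)."""
    return decode_ms / frame_budget_ms(fps)


# Assumption: 1.5 ms per packet, as measured on a different GPU.
budget = frame_budget_ms(45)          # roughly 22 ms per frame
load = decoder_utilization(1.5, 45)   # well below 1.0
print(f"budget per frame: {budget:.2f} ms, decoder utilization: {load:.3f}")
```

At 45 FPS the decoder would be busy for well under a tenth of each frame interval, so raw decode throughput alone is unlikely to explain the errors.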

berndpfrommer commented 6 months ago

Here's another good link related to this: https://github.com/mpv-player/mpv/issues/3440 I'm not sure whether this is a limitation of the ffmpeg decoder. I don't know of any, and 40fps sounds ridiculously low. Could this error message indicate missing packets in general (whether dropped by the decoder b/c it can't keep up, or otherwise)? I don't trust any of the ROS2 transports, so the first thing I'd do is check that all the packets make it to the decoder. The ROS sequence number in the message should allow you to detect gaps. Of course, you also want to look at the CPU load on the decoder side to rule that out as the problem.

buckleytoby commented 6 months ago

I checked a few things (all on the same pc that's doing the encoding, so no network issues):

  1. I ran ros2 topic echo /leftcam/image_raw/ffmpeg --field pts and confirmed that no packets were missing.
  2. I checked CPU usage while decoding with hevc and with hevc_cuvid, and the CPU is not being overloaded.
  3. I noticed that if I use the default hevc_cuvid decoder, I don't get any errors in rqt_image_plot but there are still artifacts sometimes. The artifacts and errors only appear once I switch to hevc decoding.

More details:
GPU: GeForce RTX 2060 Mobile
NVIDIA-SMI 525.147.05, Driver Version: 525.147.05, CUDA Version: 12.0
40 FPS

berndpfrommer commented 6 months ago

The first check is not too relevant, since we'd have to know whether packets are dropped at the actual consumer, not at the "echo" command. Putting a warning message in the decoder plugin whenever there's a gap in sequence number shouldn't be hard to do, no?
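A minimal sketch of such a gap check, written in Python for illustration rather than as code from the decoder plugin. It assumes each packet carries an integer counter that increases by exactly one per packet (a sequence number, or pts if the encoder happens to number frames that way); the class and method names are hypothetical.

```python
from typing import Optional


class GapDetector:
    """Tracks a per-packet counter and reports how many packets were skipped."""

    def __init__(self) -> None:
        self.last: Optional[int] = None  # counter of the previous packet
        self.missing = 0                 # total packets missed so far

    def update(self, counter: int) -> int:
        """Record one packet; return the number of packets missing before it."""
        gap = 0
        if self.last is not None and counter > self.last + 1:
            gap = counter - self.last - 1
            self.missing += gap
        self.last = counter
        return gap


# Example: counters as seen by the consumer, with two gaps.
detector = GapDetector()
for counter in [0, 1, 2, 5, 6, 9]:
    gap = detector.update(counter)
    if gap:
        print(f"WARNING: {gap} packet(s) missing before counter {counter}")
print(f"total missing: {detector.missing}")
```

The check has to run inside the consumer's own callback; running it in a separate subscriber only shows what that subscriber received.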

Do I understand correctly that "hevc" would be unaccelerated, i.e. on-cpu? Surprising that this does not lead to high CPU load.

You could do some performance tests: decode with the CLI ffmpeg tool using hevc vs. hevc_cuvid to see what the speed difference is. Maybe your machine also supports hevc_v4l2m2m or hevc_qsv.
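One way to set up that comparison is sketched below. It only builds the ffmpeg command lines: -benchmark prints timing information, -c:v placed before -i selects the decoder, and -f null - discards the decoded output. The input file name recording.mkv is a placeholder, and whether hevc_cuvid is available depends on how ffmpeg was built.

```python
import shlex


def decode_benchmark_cmd(decoder: str, input_path: str) -> list:
    """Build an ffmpeg command that decodes input_path with the given decoder
    and throws the frames away, so only decode speed is measured."""
    return [
        "ffmpeg", "-hide_banner", "-benchmark",
        "-c:v", decoder,      # decoder choice must come before -i
        "-i", input_path,
        "-f", "null", "-",    # null muxer: decode but write nothing
    ]


# recording.mkv is a hypothetical HEVC recording; substitute a real file.
for decoder in ("hevc", "hevc_cuvid"):
    print(shlex.join(decode_benchmark_cmd(decoder, "recording.mkv")))
```

Running each printed command and comparing the reported times gives a rough idea of how the software and hardware decoders compare on a given machine.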

buckleytoby commented 6 months ago

Turns out it was dropped packets. My decoder PC was using Unity to get the packets through a ROS2 multi-threaded subscriber and save them for the main thread to process. I switched from a trigger system to a queue and I no longer dropped packets in the main thread.
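The queue-based handoff can be illustrated with a small sketch. This is not the Unity code from the setup above, just a Python illustration under the assumption that one thread receives packets and the main thread consumes them; all names are hypothetical.

```python
import queue
import threading


def run_handoff(num_packets: int) -> list:
    """Pass packets from a receiver thread to the main thread through a FIFO
    queue, so a slow consumer delays packets instead of losing them."""
    packets = queue.Queue()  # unbounded and thread-safe

    def receiver() -> None:
        # Stands in for the subscriber callback thread.
        for seq in range(num_packets):
            packets.put(seq)
        packets.put(None)  # sentinel: no more packets

    worker = threading.Thread(target=receiver)
    worker.start()

    received = []
    while True:
        item = packets.get()  # blocks until a packet is available
        if item is None:
            break
        received.append(item)
    worker.join()
    return received


print(len(run_handoff(1000)))
```

With a single-slot trigger, a second packet arriving before the main thread wakes up overwrites the first. A queue keeps every packet in arrival order, at the cost of added latency and memory if the consumer falls behind for long.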

Still not sure why rqt_image_plot on Linux using hevc was getting the Could not find ref with POC error, but it's not an issue for me anymore. Thanks!