Problems with hardware transcoding for Raspberry Pi 4

ignaciogarcia commented 1 year ago

I've got go2rtc running on a Raspberry Pi 4, and everything seems to work well except that I'm unable to play content from more than one client at the same time. Each of them plays perfectly if used alone, but once one of them starts playing, none of the others can start receiving video.

I've configured go2rtc to give me debug info. Every time I try to play on a second client, it tries to start ("DBG [rtsp] new consumer stream=test"), and when I press stop, a disconnect message is printed. But that's it. The clients are varied: VLC on Mac and Windows, and ffplay on Linux. ffplay gives "decode_slice_header error / no frame!" errors, preceded by "non-existing PPS 0 referenced" messages.

I've tried serving a local .mp4 file, a /dev/video0 device, an ffmpeg exec stream, etc., and all of them present the same problem. I've also noticed that if I try to override -rtsp_transport tcp by using an output: entry, or try to force UDP, it says that the protocol is not supported. Maybe it's related to the problem?

Finally, I've tried mediamtx, and the problem does not arise, but I'd rather use go2rtc for its flexibility.

My config is as basic as it gets:

streams:
  myexec:
    exec:ffmpeg -re -fflags nobuffer -flags low_delay -i /dev/video0 -vcodec h264_v4l2m2m -b:v 4M -an -f rtsp {output}
  mydevice:
    ffmpeg:device?video=/dev/video0&input_format=bgr24&video_size=1024x1024#video=h264#hardware
  mytest:
    ffmpeg:/home/pi/test.mp4#video=h264#hardware

ffmpeg:
  h264:
    "-vcodec h264_v4l2m2m -b:v 4M -an"

log:
  level: debug  # default level
  exec: debug
  rtsp: debug
  streams: debug

Any ideas?

AlexxIT commented 11 months ago

This is the first time I've heard of such a problem. No one else has had a problem with multiple viewers on a single stream. Maybe you can explain in more detail how you reproduce it?

ignaciogarcia commented 11 months ago

Hi AlexxIT. Thanks for responding.

I couldn't find a solution for the problem, so I finally opted for mediamtx and moved on.

Also, you're right; I couldn't see anybody else having my problem. However, I don't know how much more detail I can provide beyond what I've already given above:

go2rtc was the latest version available, downloaded from here. Launched from the console manually.
The config file is provided above.
The platform was a RPi4 8GB with 64bit Raspbian OS.
All three streams in the config file above (myexec, mydevice, mytest) were working without issues on every single client I used (VLC on MacOS and Windows, ffplay on Linux), running on different machines connected to the same LAN/WLAN as the Raspberry, AS LONG as none of the others was already streaming. Only the one that started first got the video stream, the others timed out.
Finally, as I mentioned, in exactly the same scenario I have no issue with mediamtx (same RTSP ports and all). All machines can stream simultaneously.

If nobody but me is having this problem, I'll have to look further in my system and see what's going wrong.

AlexxIT commented 11 months ago

You can't start myexec and mydevice simultaneously. They use the same device. Also, you have strange ffmpeg settings; you shouldn't combine them with the hardware flag. It's better to clean up your config: remove everything from it and leave only one stream, any one.

ignaciogarcia commented 11 months ago

Sorry, maybe I was ambiguous. I'm not trying to use all three configured streams myexec, mydevice and mytest at the same time on different clients (which would fail, as you rightly say, since the first two use the same underlying /dev/video0 device).

What I meant is that any of the three streams works correctly as configured, and I can use any single client on any of them with no problem. But if I try to use more than one client on the same stream (e.g. mydevice), only the first one works and the others time out.
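For anyone trying to reproduce the scenario described here, the following is a minimal Python sketch that starts two ffplay clients on the same stream a few seconds apart. It assumes go2rtc is listening on its default RTSP port (8554) and that ffplay is on the PATH; the hostname in the comment is hypothetical, and the helper names are made up for illustration.

```python
import subprocess
import time
from typing import List

RTSP_PORT = 8554  # assumed: go2rtc's default RTSP port; change if your config overrides it


def client_command(host: str, stream: str, port: int = RTSP_PORT) -> List[str]:
    """Build an ffplay command that pulls one go2rtc stream over RTSP/TCP."""
    url = f"rtsp://{host}:{port}/{stream}"
    return ["ffplay", "-rtsp_transport", "tcp", url]


def launch_clients(host: str, stream: str, count: int = 2,
                   delay_s: float = 5.0) -> List[subprocess.Popen]:
    """Start `count` clients on the same stream, `delay_s` seconds apart."""
    procs = []
    for i in range(count):
        if i:
            # Wait so that later clients join after the stream has already started.
            time.sleep(delay_s)
        procs.append(subprocess.Popen(client_command(host, stream)))
    return procs


# Example (hypothetical hostname, stream name taken from the config above):
# launch_clients("raspberrypi.local", "mydevice")
```

With the problem described in this thread, the first ffplay window would show video and the second would time out or print decode errors.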

AlexxIT commented 11 months ago

For the first test, it's better not to change the ffmpeg params and not to use hardware.

"non-existing PPS 0 referenced" is a known error and shouldn't be a problem for ffplay.

ignaciogarcia commented 11 months ago

I've started from scratch in a new system to do some more testing, and I'm posting the results just in case they may be of help to someone.

I've downloaded the latest go2rtc binary and I've stripped down the config.yaml file to a bare minimum:

streams:
  mytest:
    ffmpeg:/home/pi/test.mp4#video=h264

log:
  level: debug  # default level
  exec: debug
  rtsp: debug
  streams: debug

With this configuration, it seems to work (albeit with very high CPU usage on the RPi4) and the second stream starts normally. It does, however, sometimes output a warning (WRN) on the console when the second stream is started, but it seems to retry and move on (notice the delay between the connection and the WRN):

17:43:17.381 DBG [rtsp] new consumer stream=mytest
17:43:34.565 WRN github.com/AlexxIT/go2rtc/internal/streams/producer.go:171 > error=EOF url=ffmpeg:/home/pi/test.mp4#video=h264
17:43:34.565 DBG [streams] retry=0 to url=ffmpeg:/home/pi/test.mp4#video=h264
17:43:34.565 DBG [exec] run url="exec:ffmpeg -hide_banner -v error -re -i /home/pi/test.mp4 -c:v libx264 -g 50 -profile:v high -level:v 4.1 -preset:v superfast -tune:v zerolatency -pix_fmt:v yuvj420p -an -user_agent ffmpeg/go2rtc -rtsp_transport tcp -f rtsp {output}"

What I've also found is that if I add the #hardware tag to the ffmpeg: line, the second stream almost never starts on the first try; a second try takes some time, then the WRN line above appears and the stream starts playing. Example (again, notice the timestamps):

17:49:47.190 DBG [rtsp] new consumer stream=mytest
17:50:01.733 DBG [rtsp] handle=EOF
17:50:01.733 DBG [rtsp] disconnect stream=mytest
17:50:01.740 DBG [rtsp] new consumer stream=mytest
17:50:08.483 WRN github.com/AlexxIT/go2rtc/internal/streams/producer.go:171 > error=EOF url=ffmpeg:/home/pi/test.mp4#video=h264#hardware
17:50:08.483 DBG [streams] retry=0 to url=ffmpeg:/home/pi/test.mp4#video=h264#hardware
17:50:08.483 DBG [exec] run url="exec:ffmpeg -hide_banner -v error -re -i /home/pi/test.mp4 -c:v h264_v4l2m2m -g 50 -bf 0 -an -user_agent ffmpeg/go2rtc -rtsp_transport tcp -f rtsp {output}"
[...]

Video quality with this configuration is very bad (because of the bitrate). So, finally, if I add the ffmpeg configuration lines from my original file and remove the #hardware flag:

ffmpeg:
  h264:
    "-vcodec h264_v4l2m2m -b:v 4M -an"

the first stream I play doesn't always work on the first try. And once it's working, the second stream never works: it times out, and if I try playing it again, the server outputs the WRN above and retries, but it still doesn't play. Moreover, sometimes when the server retries, the first stream stops working.

In case it's relevant, if the client for the second stream is ffplay, the error messages on the console after connection are as follows (repeated indefinitely):

[h264 @ 0x7fc94c006100] non-existing PPS 0 referenced    0B f=0/0   
    Last message repeated 1 times
[h264 @ 0x7fc94c006100] decode_slice_header error
[h264 @ 0x7fc94c006100] no frame!

BTW, the h264: line above is the one I'm using on mediamtx and it works without any issues. As a matter of fact, compared to the #hardware label it only sets a higher bitrate and doesn't force the GOP size (-bf is already 0 by default).

Hope it helps!

AlexxIT commented 11 months ago

I don't have a Pi4, so I can't test hardware decoding with it. Also, you can stream a simple file without any transcoding: mytest: ffmpeg:/home/pi/test.mp4

ignaciogarcia commented 11 months ago

Hello AlexxIT. If you want me to perform some tests on the RPi4, just ask and I'll try to find time to do it ;)

Regarding the simple file streaming without transcoding: that was only a test to rule out that I was doing something else wrong. What I'm actually doing is transcoding a raw video stream coming from an industrial GigE camera :-) so transcoding is a must.

Thank you and regards.

AlexxIT commented 11 months ago

I don't believe in transcoding with the Raspberry. I think it's just for decoration. I have a Pi3 and it does a terrible job with transcoding. It doesn't make any sense at all. But it can easily handle more than 10 streams without transcoding. In all three of your tests, you use transcoding. You have no examples without transcoding.

ignaciogarcia commented 11 months ago

I can tell you hardware transcoding on the Pi4 does work fine using the h264_v4l2m2m codec. We have already finished the application, and ffmpeg is able to transcode our 400Mbps (or higher) raw video stream to H.264 with very low CPU usage (around 7% of a single core), compared to struggling with it using the default libx264 software codec. It does run on the RPi3 too, btw (albeit with slightly higher CPU usage). You are, however, limited to H.264 and 8192 macroblocks (FullHD resolution).
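As a quick sanity check of the macroblock figure quoted here, the sketch below counts 16x16 macroblocks for a given frame size. The 8192 limit is taken from the comment above, not from any datasheet, and the function names are made up for illustration.

```python
import math

MB_SIZE = 16          # an H.264 macroblock covers 16x16 luma samples
MB_LIMIT = 8192       # limit quoted in the comment above (assumption, not verified)


def macroblocks(width: int, height: int) -> int:
    """Number of macroblocks needed to cover a frame (dimensions rounded up)."""
    return math.ceil(width / MB_SIZE) * math.ceil(height / MB_SIZE)


def fits_encoder(width: int, height: int, limit: int = MB_LIMIT) -> bool:
    """True if the frame fits within the quoted macroblock limit."""
    return macroblocks(width, height) <= limit


print(macroblocks(1920, 1080))   # 120 * 68 = 8160, just under 8192
print(macroblocks(1024, 1024))   # 64 * 64 = 4096
```

Under that limit, Full HD fits with a small margin, and the 1024x1024 size from the mydevice stream uses about half of it.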

At the time of my query, it was the RTSP server part that we were having problems with. VLC (cvlc, the no-GUI version) was already working but performance was not good, so we were going with go2rtc until we started having the described problems.

In all my tests I use transcoding because that's our target application, and it was easier for me to find a reasonable .mp4 test file than a raw video file (even though, for that particular test, I know transcoding wouldn't be necessary). The no-transcoding tests I've run (i.e. myfile: ffmpeg:/home/pi/test.mp4) work just fine. I didn't test that with several streams, though, since I moved on to test with a transcoded stream.

AlexxIT commented 11 months ago

I don't think I can fix it remotely. I would need a Pi4 to test locally.

janneman001 commented 7 months ago

I think this is related: https://github.com/jc-kynesim/rpi-ffmpeg/issues/49 I hit this when hardware-encoding an H.264 stream on an RPi4. I hope it helps. When encoding with libx264 there's no problem, but the RPi4 CPU has to work twice as hard.

marcusb commented 1 month ago

I think the reason is related to this: https://github.com/raspberrypi/firmware/issues/242

The way ffmpeg sets up the v4l2m2m encoder, it seems to emit the SPS/PPS parameter sets only at the beginning of the stream rather than repeating them inline. So any client that connects after the stream has started will not be able to decode it.
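One way to check this explanation against a real capture is to scan a raw Annex B H.264 byte stream and see whether SPS and PPS NAL units show up again before later IDR slices. The following is a minimal Python sketch, not part of go2rtc or ffmpeg; the function names are made up for illustration. It assumes the stream uses Annex B start codes, and it treats headers as "consumed" once a non-IDR slice is seen.

```python
from typing import Iterator, List, Tuple

NAL_NON_IDR, NAL_IDR, NAL_SPS, NAL_PPS = 1, 5, 7, 8  # H.264 nal_unit_type values


def iter_nal_types(data: bytes) -> Iterator[int]:
    """Yield nal_unit_type for each NAL unit in an Annex B byte stream."""
    i = 0
    while True:
        # A 4-byte start code (00 00 00 01) also ends with 00 00 01.
        i = data.find(b"\x00\x00\x01", i)
        if i < 0 or i + 3 >= len(data):
            return
        yield data[i + 3] & 0x1F  # low 5 bits of the NAL header byte
        i += 3


def idr_header_report(data: bytes) -> List[Tuple[int, bool]]:
    """For each IDR NAL, report (NAL index, SPS and PPS seen since last non-IDR slice)."""
    report = []
    seen_sps = seen_pps = False
    for idx, nal_type in enumerate(iter_nal_types(data)):
        if nal_type == NAL_SPS:
            seen_sps = True
        elif nal_type == NAL_PPS:
            seen_pps = True
        elif nal_type == NAL_IDR:
            report.append((idx, seen_sps and seen_pps))
        elif nal_type == NAL_NON_IDR:
            # Headers seen earlier no longer count for the next keyframe.
            seen_sps = seen_pps = False
    return report


def _nal(nal_type: int) -> bytes:
    """Build a tiny synthetic NAL unit for demonstration purposes."""
    return b"\x00\x00\x00\x01" + bytes([0x60 | nal_type]) + b"\x80"


# Headers only at the start: the second IDR has no SPS/PPS in front of it.
headers_once = b"".join(_nal(t) for t in (7, 8, 5, 1, 5, 1))
# Headers repeated inline: every IDR is preceded by SPS/PPS.
headers_inline = b"".join(_nal(t) for t in (7, 8, 5, 1, 7, 8, 5))

print(idr_header_report(headers_once))    # [(2, True), (4, False)]
print(idr_header_report(headers_inline))  # [(2, True), (6, True)]
```

To use it on real data, read a raw .h264 dump of the encoder output into bytes and pass it to idr_header_report; any IDR entry reported as False is a keyframe that a late-joining client could not decode on its own.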

marcusb commented 1 month ago

Hi, I have managed to solve this with a patched ffmpeg: https://github.com/RPi-Distro/ffmpeg/pull/10

With this patch, SPS/PPS frames are inserted inline, allowing the client to pick up the decoding within a few seconds (when it receives the first I-frame with the sequence headers).

AlexxIT / go2rtc

Problems with hardware transcoding for Raspberry Pi 4 #641