Closed jonadis closed 2 months ago
I don't know much about frigate. Do you have a direct example of h.264 failing with ffmpeg?
I can include the entire docker startup logs but that's about all I know. 4a4313e55569c4003afeb8ec8996d60b3a5135b990a26e47aae36f360f1ba064-json.log Maybe someone smarter will run into this issue and be of more help. This is the recommended flavor of Ubuntu to run by the Frigate devs so I assume someone else will run into this sooner or later.
It may be smart to open an issue with Frigate just in case. I recently updated the kernel to the latest SDK from Rockchip, linux-6.1-stan-rkr3 (source), and will start to look through logs shortly.
Also tagging @nyanmisaka and @hbiyik; see any problems with ffmpeg and the latest kernel SDK? From my initial evaluation I did not encounter any notable issues, so this is interesting.
From the video samples I have, no regression has been found in rkr3. I have been on 6.1.57 and just upgraded to 6.1.76 yesterday. There are many changes in the MPP kernel driver between rkr1~rkr3, and a video sample is needed, otherwise it is difficult to bisect.
ubuntu@ubuntu:~$ uname -a
Linux ubuntu 6.1.0-1019-rockchip #19-Ubuntu SMP Mon Jul 1 12:27:26 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
ubuntu@ubuntu:~$ ./ffmpeg -hwaccel rkmpp -hwaccel_output_format drm_prime -afbc rga -i ~/jellyfish-120-mbps-4k-uhd-h264.mkv -an -sn -f null -
ffmpeg version 342fe8368c-20240628 Copyright (c) 2000-2024 the FFmpeg developers
built with gcc 14.1.0 (crosstool-NG 1.26.0.93_a87bf7f)
configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=aarch64-ffbuild-linux-gnu- --arch=aarch64 --target-os=linux --enable-gpl --enable-version3 --disable-debug --enable-iconv --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-libxml2 --enable-openssl --enable-fontconfig --enable-libharfbuzz --enable-libvorbis --enable-opencl --enable-libpulse --enable-libvmaf --enable-libxcb --enable-xlib --enable-amf --enable-libaom --enable-libaribb24 --enable-avisynth --enable-chromaprint --enable-libdav1d --disable-libdavs2 --enable-libdvdread --enable-libdvdnav --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libkvazaar --enable-libaribcaption --enable-libass --enable-libbluray --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librist --enable-libssh --enable-libtheora --disable-libvpx --enable-libwebp --enable-lv2 --disable-libvpl --enable-openal --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-rkmpp --enable-rkrga --enable-librubberband --disable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --enable-libdrm --disable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264 --enable-libx265 --disable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-libs='-ldl -lstdc++ -lstdc++ -lgomp' --extra-ldflags=-pthread --extra-ldexeflags=-pie --cc=aarch64-ffbuild-linux-gnu-gcc --cxx=aarch64-ffbuild-linux-gnu-g++ --ar=aarch64-ffbuild-linux-gnu-gcc-ar --ranlib=aarch64-ffbuild-linux-gnu-gcc-ranlib --nm=aarch64-ffbuild-linux-gnu-gcc-nm --extra-version=20240628
libavutil 59. 8.100 / 59. 8.100
libavcodec 61. 3.100 / 61. 3.100
libavformat 61. 1.100 / 61. 1.100
libavdevice 61. 1.100 / 61. 1.100
libavfilter 10. 1.100 / 10. 1.100
libswscale 8. 1.100 / 8. 1.100
libswresample 5. 1.100 / 5. 1.100
libpostproc 58. 1.100 / 58. 1.100
Input #0, matroska,webm, from '/home/ubuntu/jellyfish-120-mbps-4k-uhd-h264.mkv':
Metadata:
encoder : libebml v1.2.0 + libmatroska v1.1.0
creation_time : 2016-02-06T04:01:06.000000Z
Duration: 00:00:30.03, start: 0.000000, bitrate: 120490 kb/s
Stream #0:0(eng): Video: h264 (High), yuv420p(tv, bt709, progressive), 3840x2160 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 1k tbn (default)
rga_api version 1.10.0_[8]
Stream mapping:
Stream #0:0 -> #0:0 (h264 (h264_rkmpp) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
Metadata:
encoder : Lavf61.1.100
Stream #0:0(eng): Video: wrapped_avframe, drm_prime(tv, bt709, progressive), 3840x2160 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 29.97 fps, 29.97 tbn (default)
Metadata:
encoder : Lavc61.3.100 wrapped_avframe
[out#0/null @ 0xaaab15f6ef90] video:387KiB audio:0KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: unknown
frame= 900 fps=248 q=-0.0 Lsize=N/A time=00:00:30.02 bitrate=N/A speed=8.26x
{"log":"2024-07-01 13:15:50.384543912 [ERROR:0@48.137] global cap_ffmpeg_impl.hpp:1309 open Could not open codec h264, error: -11\n","stream":"stdout","time":"2024-07-01T17:15:50.384769741Z"}
https://github.com/blakeblackshear/frigate/discussions/12228 From the log, OpenCV is trying to open the h264 software decoder. IIRC Frigate doesn't use ffmpeg hardware acceleration this way, so I suspect it is not related to rkmpp. Please test rkmpp with ffmpeg and the command in this Wiki.
The ffmpeg tests all seem to work on both kernel versions. In fact I have an ffmpeg task that runs daily that assembles a folder full of JPEGs into a daily timelapse and it works fine on both kernel versions. I have no idea if this has anything to do with this or not but one thing I found that's different is...
localadmin@orangepi5b:~$ uname -r
6.1.0-1016-rockchip
localadmin@orangepi5b:~$ sudo cat /sys/kernel/debug/rknpu/version
RKNPU driver: v0.9.6
localadmin@orangepi5b:~$ uname -r
6.1.0-1019-rockchip
localadmin@orangepi5b:~$ sudo cat /sys/kernel/debug/rknpu/version
RKNPU driver: v0.9.7
According to the frigate docs all that's needed is 0.9.2 or later but perhaps there was some breaking change between 0.9.6 and 0.9.7 that is incompatible with frigate.
At least the ffmpeg test passed to prove that the issue is not related to video hardware acceleration.
@MarcA711 do you know more about the error in frigate?
I just upgraded my system to 6.1.0-1019-rockchip and I can't reproduce this issue.
@jonadis I see you have lots of cameras and use go2rtc. I think you have more than 30 ffmpeg processes, and maybe this is causing issues. Could you back up your config.yml and start with a fresh one? Just use one reliable camera and try to get detection and recording working. No go2rtc for now. If this works, you can enable go2rtc and add more cams (one by one).
No go2rtc and a single camera does indeed seem to work on the latest kernel.
cameras:
  front_driveway:
    enabled: True
    ffmpeg:
      inputs:
        - path: rtsp://admin:redacted@10.0.71.155:554/Streaming/Channels/102
          roles:
            - detect
        - path: rtsp://admin:redacted@10.0.71.155:554/Streaming/Channels/101
          roles:
            - record
I'll work on adding back go2rtc and adding cameras back in one at a time. Can you help me understand what this means? I do indeed have a fair number of ffmpeg processes, but that doesn't pose an issue with the older kernel, only with the latest few builds. What does that indicate?
localadmin@orangepi5b:~$ ps -ef | grep -i ffmpeg | wc -l
39
localadmin@orangepi5b:~$ uname -r
6.1.0-1016-rockchip
localadmin@orangepi5b:~$
If a single camera works, can you try adding cameras until you encounter the error? Maybe there is a memory or buffer limit / bug in the new kernel.
no problem here, running 3 cameras and go2RTC configured
When using go2rtc, 4 dual-stream (main+substream) cameras plus 1 single-stream camera (5 total cameras, 9 total streams, 19 total ffmpeg processes) seems to be the limit on 1019. When I add one more camera I start getting these errors:
2024-07-04 15:41:39.198315157 [ERROR:0@19.741] global cap_ffmpeg_impl.hpp:1309 open Could not open codec h264, error: -11
2024-07-04 15:41:39.198328574 [ERROR:0@19.741] global cap_ffmpeg_impl.hpp:1317 open VIDEOIO/FFMPEG: Failed to initialize VideoCapture
2024-07-04 15:41:39.199331037 [ERROR:0@19.742] global cap.cpp:164 open VIDEOIO(CV_IMAGES): raised OpenCV exception:
2024-07-04 15:41:39.199364287
2024-07-04 15:41:39.199369537 OpenCV(4.9.0) /io/opencv/modules/videoio/src/cap_images.cpp:274: error: (-215:Assertion failed) number < max_number in function 'icvExtractPattern'
2024-07-04 15:41:39.199371287
2024-07-04 15:41:39.199373329
2024-07-04 15:41:39.284213708 [ERROR:0@19.827] global cap_ffmpeg_impl.hpp:1309 open Could not open codec h264, error: -11
2024-07-04 15:41:39.284392500 [ERROR:0@19.828] global cap_ffmpeg_impl.hpp:1317 open VIDEOIO/FFMPEG: Failed to initialize VideoCapture
2024-07-04 15:41:39.284950461 [ERROR:0@19.828] global cap.cpp:164 open VIDEOIO(CV_IMAGES): raised OpenCV exception:
2024-07-04 15:41:39.284958628
2024-07-04 15:41:39.284963586 OpenCV(4.9.0) /io/opencv/modules/videoio/src/cap_images.cpp:274: error: (-215:Assertion failed) number < max_number in function 'icvExtractPattern'
2024-07-04 15:41:39.284965336
11 total cameras (20 total streams, 39 total ffmpeg processes) runs fine on the older kernel
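A side note on the counts above: `ps -ef | grep -i ffmpeg | wc -l` counts processes (and usually the grep itself), while each ffmpeg process normally runs several threads. If the limit being hit is a task limit rather than a process limit, the thread total is the more relevant number. Below is a minimal sketch for counting threads through /proc; `count_tasks` and its optional second argument are hypothetical names used only for illustration.

```shell
#!/bin/bash
# Sketch: count the threads owned by processes whose command name
# contains PATTERN. count_tasks is a hypothetical helper name.
# Usage: count_tasks PATTERN [PROC_ROOT]
count_tasks() {
    local pattern=$1 proc_root=${2:-/proc}
    local total=0 dir name n
    for dir in "$proc_root"/[0-9]*; do
        # Skip entries that vanished or that are not process directories.
        [ -r "$dir/comm" ] || continue
        name=$(cat "$dir/comm" 2>/dev/null) || continue
        case $name in
            *"$pattern"*)
                # Each entry under task/ is one thread of this process.
                n=$(ls "$dir/task" 2>/dev/null | wc -l)
                total=$((total + n))
                ;;
        esac
    done
    echo "$total"
}

# Thread total for everything named like ffmpeg on this host.
count_tasks ffmpeg
```

Running this on both kernels with the same camera set would show whether the thread total is comparable and only the limit has changed.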
Could you post the output of sudo dmesg after the error occurred?
Here is all the dmesg output after starting frigate ... the error usually occurs within about 15 seconds of starting the container...
[ 159.101269] veth3a5f78c: renamed from eth0
[ 159.114846] br-957798a14698: port 5(veth33596eb) entered disabled state
[ 159.144850] br-957798a14698: port 5(veth33596eb) entered disabled state
[ 159.146684] device veth33596eb left promiscuous mode
[ 159.146699] br-957798a14698: port 5(veth33596eb) entered disabled state
[ 163.210309] br-957798a14698: port 5(vethb0d1d95) entered blocking state
[ 163.210323] br-957798a14698: port 5(vethb0d1d95) entered disabled state
[ 163.210551] device vethb0d1d95 entered promiscuous mode
[ 163.796004] eth0: renamed from veth7d2093f
[ 163.823444] IPv6: ADDRCONF(NETDEV_CHANGE): vethb0d1d95: link becomes ready
[ 163.823603] br-957798a14698: port 5(vethb0d1d95) entered blocking state
[ 163.823618] br-957798a14698: port 5(vethb0d1d95) entered forwarding state
[ 173.929214] cgroup: fork rejected by pids controller in /system.slice/docker-6664804fe39845e2d871c21d0abb967b0716eeeb2365151fa6c1360637384b41.scope
For what it's worth, the last line above (the 'cgroup' one) is not present when running the older kernel, where Frigate runs happily.
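The "fork rejected by pids controller" line means the container's cgroup reached its task limit. That would also fit the OpenCV message: on Linux, error -11 corresponds to EAGAIN, which is what thread or process creation returns when such a limit is hit. A minimal sketch for comparing current usage against the limit follows. It assumes cgroup v2 mounted at /sys/fs/cgroup and Docker using the systemd cgroup driver (which matches the path in the dmesg line); `pids_usage` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: print "current/max" task counts for one cgroup directory.
# pids_usage is a hypothetical helper; paths assume cgroup v2.
pids_usage() {
    local dir=$1
    if [ -r "$dir/pids.current" ] && [ -r "$dir/pids.max" ]; then
        echo "$(cat "$dir/pids.current")/$(cat "$dir/pids.max")"
    else
        echo "unavailable"
    fi
}

# Report every Docker container scope found under system.slice.
for scope in /sys/fs/cgroup/system.slice/docker-*.scope; do
    [ -d "$scope" ] || continue
    echo "$(basename "$scope"): $(pids_usage "$scope")"
done
```

If the first number sits at or near the second while Frigate is starting, the limit rather than the video stack is the likely culprit.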
That cgroup message looks to be the problem. Can you try https://serverfault.com/questions/1032747/cgroup-fork-rejected-by-pids-controller
I edited /usr/lib/systemd/system/user-.slice.d/10-defaults.conf from:
TasksMax=33%
to
TasksMax=infinity
It did not make any difference. I still get the same errors in Frigate, and the same cgroup error in dmesg.
Try setting the DefaultTasksMax= directive in /etc/systemd/system.conf to something like 38035, then reboot. Sounds like we are on the right track though.
I left TasksMax set to infinity in /usr/lib/systemd/system/user-.slice.d/10-defaults.conf and also set DefaultTasksMax in /etc/systemd/system.conf to 38035, and that seems to have cured it. Can you help an ignorant person understand how this is tied to the kernel upgrade?
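For anyone repeating this change, it may be worth confirming that the value in the file and the value the running systemd manager reports actually agree after the reboot. A minimal sketch, where `conf_value` is a hypothetical helper and the file path is the one edited above:

```shell
#!/bin/bash
# Sketch: print the value of the last uncommented KEY= line in FILE.
# conf_value is a hypothetical helper name.
conf_value() {
    local key=$1 file=$2
    sed -n "s/^[[:space:]]*$key=[[:space:]]*//p" "$file" 2>/dev/null | tail -n 1
}

# Value written in the config file (empty if unset or commented out).
conf_value DefaultTasksMax /etc/systemd/system.conf

# Value the running manager reports, if systemd is available.
if command -v systemctl >/dev/null 2>&1; then
    systemctl show --property DefaultTasksMax || true
fi
```

If the two differ, the edit has probably not been picked up yet or is being overridden somewhere else.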
I suspect some resource shenanigans going on in the new Rockchip kernel. Needs to be researched further.
If a reason can not be found I may include those tweaks you mentioned in the ubuntu-rockchip-settings package, however this would be a hack and I would like to avoid implementing system wide hacks when possible.
@jonadis can you please provide exact steps so I can reproduce your problem? I think there is a larger issue going on and I need to do some testing.
I experienced this issue as well. I will try to provide more info later.
I'm leaning to implement a user-space hot fix with https://github.com/Joshua-Riek/ubuntu-rockchip/issues/906#issuecomment-2211842830 as a temporary workaround. But I do not want to make it permanent.
@Joshua-Riek not sure if it helps for debugging purposes, but you can start a Docker container using docker run --rm -it debian:12 bash
and then run this script from a bash file to create lots of forks and trigger the error:
#!/bin/bash
# Number of forks
n=1000

fork_process() {
    sleep infinity &
}

for ((i=0; i<n; i++)); do
    fork_process
done

wait
I found the problem here: https://github.com/Joshua-Riek/ubuntu-rockchip/issues/919#issuecomment-2225695982. The max user processes limit is very low. I still need to figure out which kernel commit caused this change, but nonetheless it's progress.
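To compare the two kernels directly, the relevant limits can be printed on each and diffed. A minimal sketch is below. `read_limit` is a hypothetical helper, and the choice of which limits to print is my assumption about what is worth comparing, not something established in this thread.

```shell
#!/bin/bash
# Sketch: print limits that cap how many processes and threads can exist.
# read_limit is a hypothetical helper name.
read_limit() {
    local file=$1
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "n/a"
    fi
}

echo "kernel: $(uname -r)"
echo "max user processes (ulimit -u): $(ulimit -u)"
echo "kernel.threads-max: $(read_limit /proc/sys/kernel/threads-max)"
echo "kernel.pid_max: $(read_limit /proc/sys/kernel/pid_max)"
```

Saving the output on 1016 and on 1019 and diffing the two files should show quickly whether any of these values changed with the kernel.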
I'm using an orangepi5b with 16GB of RAM to run Frigate 0.14b3. Everything was working fine on 6.1.0-1016 but after upgrading to 6.1.0-1018 frigate would no longer start up correctly saying
in the logs. I was able to boot the older 1016 to confirm that it worked correctly again, so something must have been broken in either 1017 or 1018. I don't know much about kernel development but I do still have both versions 1016 and 1018 installed if there's anything I can gather to help in troubleshooting.