Open MarcoRavich opened 1 year ago
Hi!
No extreme latency optimization tests have been done at this time. CCTV Viewer is specially designed in such a way that there is no output buffering.
If your device has enough resources, the current frame after demultiplexing and decoding is immediately rendered. To some extent, you can influence the demuxing process with AVFormat options: https://ffmpeg.org/ffmpeg-formats.html#Format-Options
For example, the -analyzeduration and -probesize options are currently used to reduce start delays in stream playback and may not be optimal for your case.
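As a rough illustration of the two options mentioned above, here is a minimal Python sketch that only assembles an ffplay command line with reduced probing. ffplay is used as a stand-in, since how CCTV Viewer itself accepts AVFormat options is not shown in this thread. The helper name, the example URL and the numeric values are assumptions for illustration, not recommended settings. In FFmpeg, -analyzeduration is given in microseconds and -probesize in bytes, and values that are too small can make stream detection unreliable.

```python
import shlex


def low_latency_ffplay_cmd(url, analyzeduration_us=500_000,
                           probesize_bytes=32_768, use_tcp=True):
    """Build (but do not run) an ffplay command with reduced probing.

    The default values are illustrative assumptions, not tuned settings.
    """
    cmd = [
        "ffplay",
        "-fflags", "nobuffer",    # ask the demuxer not to buffer input
        "-flags", "low_delay",    # ask the decoder to favour low delay
        "-analyzeduration", str(analyzeduration_us),  # microseconds
        "-probesize", str(probesize_bytes),           # bytes
    ]
    if use_tcp:
        # RTSP over TCP avoids UDP packet loss at the cost of some delay.
        cmd += ["-rtsp_transport", "tcp"]
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # Hypothetical camera URL, for illustration only.
    print(shlex.join(low_latency_ffplay_cmd("rtsp://camera.example/stream1")))
```

Printing the command instead of running it makes it easy to try different values by hand and compare the start-up delay and stability on a given camera.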
Also, at the moment, the work on implementing hardware decoding acceleration is 60-70% done, but due to the specifics of the application it does not always give a tangible reduction in CPU load (however, in some scenarios the result is impressive).
Hi! No extreme latency optimization tests have been done at this time. CCTV Viewer is specially designed in such a way that there is no output buffering. If your device has enough resources, the current frame after demultiplexing and decoding is immediately rendered.
Hi there, thanks for your reply.
I'm not a dev but, if I've understood correctly, this approach seems similar to @free5ty1e's new picamframegrid method (called "DIRECT TO FRAMEBUFFER METHOD", which displays directly without an intermediate file or a need to compile multiple framegrabs into a single image), even though it unfortunately doesn't give acceptable latencies on the Raspberry Pi platform.
That's why we chose an Intel NUC i5 (5th gen) based PC for low-latency monitoring.
To some extent, you can influence the demuxing process with AVFormat options: https://ffmpeg.org/ffmpeg-formats.html#Format-Options For example, the -analyzeduration and -probesize options are currently used to reduce start delays in stream playback and may not be optimal for your case.
We'll certainly experiment with both these FFmpeg parameters and the cameras' hardware-encoder settings to optimize the decoding flow and obtain the lowest possible latencies.
Also at the moment the work on the implementation of hardware decoding acceleration is 60-70% done, but due to the specifics of the application it does not always give a tangible gain in reducing the load on the CPU (However, in some scenarios the result is impressive).
Like many others, we're waiting for the implementation of GPU video decoding, which will certainly help to keep the hardware's working temperatures low.
Thanks in advance for what you're doing.
@free5ty1e's picamframegrid
This is a very, very strange but amazing approach to solving this problem. It looks like a classic attempt to implement a web server in Bash as an academic exercise, which, however, has nothing to do with efficiency.
I meant the absence of video buffering, which every player has and which introduces a significant delay of up to several tens of seconds. In addition, the new implementation uses zero-copy rendering whenever possible.
Hi again. During our search for low-latency RTSP "viewers" on GitHub, we found the EasyPlayer repositories, by the Chinese @tsingsee, which claim to perform hardware decoding on supported platforms (Windows/Android/iOS) with very low delay:
https://github-com.translate.goog/tsingsee/EasyPlayer-RTSP?_x_tr_sl=auto&_x_tr_tl=en
Dunno if this can help or inspire you in any way, but hope so.
Note: hoping to do something useful for open software developers, we are collecting/documenting some (re)sources that we'll share under HyMPS \ VIDEO.
Hi there, first of all, thanks for your cool (voluntary) work!
We finally chose this software as the only one (among the many others tested) able to provide acceptable latencies for live video monitoring from the multiple PTZ cameras, connected over a Gbps LAN, that we use for streaming live performances.
The Intel NUC (i5-5300U-based) we use is able, on a clean install of Mint 21.1/Xfce, to display 4 x 1080p/25 fps H.264 streams encoded at 2 Mbps VBR with a latency of less than 0.3 seconds (consuming ~40% of CPU, depending on the motion in the scenes captured by the cameras). Since we have to display up to 8 streams (in 2 separate cctv-viewer instances, in order to drive separate outputs/monitors), we would like to know if there are any recommended settings, customizations (e.g. custom kernels) or distributions that we can use to keep latencies (and, of course, CPU usage and temperatures) as low as possible, because operators require "near-zero latency" to drive PTZ cams correctly.
Thanks in advance!
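
To put the figures from this post in perspective, the short Python sketch below converts the reported latency into frame intervals and estimates the load for 8 streams. The function names are made up for illustration, and the CPU figure assumes load scales linearly with the number of streams, which is only a rough guess (two separate instances and different scene motion may behave differently).

```python
def frame_interval_ms(fps):
    """Time between two consecutive frames, in milliseconds."""
    return 1000.0 / fps


def latency_in_frames(latency_s, fps):
    """How many frame intervals a given latency corresponds to."""
    return latency_s * fps


def aggregate_bitrate_mbps(streams, per_stream_mbps):
    """Total nominal network bitrate for all streams."""
    return streams * per_stream_mbps


def extrapolated_cpu_percent(measured_percent, measured_streams, target_streams):
    """Linear extrapolation of CPU load (an assumption, not a measurement)."""
    return measured_percent / measured_streams * target_streams


if __name__ == "__main__":
    fps = 25
    print("frame interval (ms):", frame_interval_ms(fps))
    print("0.3 s latency in frames:", latency_in_frames(0.3, fps))
    print("8 streams, nominal Mbps:", aggregate_bitrate_mbps(8, 2))
    print("8 streams, rough CPU %:", extrapolated_cpu_percent(40, 4, 8))
```

At 25 fps a frame lasts 40 ms, so a latency just under 0.3 seconds is already within about seven to eight frame intervals, and the nominal 16 Mbps for 8 streams is far below what a gigabit LAN can carry. Under the linear assumption, CPU load rather than network bandwidth is the more likely limit when going from 4 to 8 streams.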