rockchip-linux / mpp

Media Process Platform (MPP) module

Orange Pi 5 Plus (real-time HDMI input, bad latency) #587

Open N0tiK44 opened 2 months ago

N0tiK44 commented 2 months ago

Hi all

I've been working on a computer vision project on my Orange Pi 5 Plus and wanted to use the HDMI input for real-time video capture without a network pipeline. In all my testing I've found no solution that achieves real-time, sub-frame latency; the best I could manage was 2 seconds of latency at 30 fps.

I've used GStreamer, the v4l2 test utility, FFmpeg, mpv, ffplay, and even Orange Pi's test_hdmiin.sh. The closest I got to real time was with nyanmisaka's fork, https://github.com/nyanmisaka/ffmpeg-rockchip, which uses MPP and RGA. My FFmpeg command line was: ffmpeg -re -f v4l2 -i /dev/video20 -c:v hevc_rkmpp -qp_init 10 /root/Desktop/test.mp4

I've used Parsec, Moonlight & Sunshine for low-latency gaming, which pipeline video over my home LAN, software to software. I'd like to be able to plug my HDMI cable into the input and process it in whatever way achieves the same result, hardware to hardware. I just don't understand how a software-to-software pipeline can be lower latency than a hardware-to-hardware one..?

I'm currently using Orangepi5plus_1.0.8_ubuntu_jammy_desktop_xfce_linux5.10.160 (kernel 5.10.160-rockchip-rk3588).

Thanks guys, get back to me soon.

Below is an example video of what I'd like to accomplish; this is a latency test using Parsec.

https://github.com/rockchip-linux/mpp/assets/141003308/8028764a-4d61-4de0-8cb9-7bc950ba7c27

HermanChen commented 1 month ago

A software codec runs purely on the CPU. A hardware codec needs the CPU to prepare data for each hardware task and to handle the IRQ from the hardware, so the hardware driver requires a certain workflow. Back to the latency issue: a software decoder can start decoding a stream as it arrives and finish decoding when receiving is done. But a hardware decoder needs to collect the whole frame's worth of stream first, and frame completion is usually only detected when the next frame arrives; only then does decoding start. That is where a whole-frame delay comes from. There is also a difference between serial hardware decoding and multithreaded parallel software decoding. And when decoding small images, the hardware driver overhead can exceed the software decoding time itself.

N0tiK44 commented 1 month ago

> A software codec runs purely on the CPU. A hardware codec needs the CPU to prepare data for each hardware task and to handle the IRQ from the hardware, so the hardware driver requires a certain workflow. Back to the latency issue: a software decoder can start decoding a stream as it arrives and finish decoding when receiving is done. But a hardware decoder needs to collect the whole frame's worth of stream first, and frame completion is usually only detected when the next frame arrives; only then does decoding start. That is where a whole-frame delay comes from. There is also a difference between serial hardware decoding and multithreaded parallel software decoding. And when decoding small images, the hardware driver overhead can exceed the software decoding time itself.

Ah right, interesting. Could you steer me in the right direction for how I would accomplish this, or is it not achievable? My ideal video is roughly 2560x1440 at 144 Hz for the Orange Pi, or maybe 1920x1080 at 240 Hz.
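For context, the whole-frame delay described above translates into a fixed time cost that shrinks as the refresh rate goes up. A quick calculation of what one frame of buffering costs at these target rates:

```python
def frame_period_ms(fps: float) -> float:
    """Time per frame in milliseconds at a given frame/refresh rate."""
    return 1000.0 / fps

# One extra frame of buffering costs one frame period:
for fps in (30, 60, 144, 240):
    print(f"{fps:3d} Hz -> {frame_period_ms(fps):6.2f} ms per frame of delay")
```

So a single-frame delay at 144 Hz is under 7 ms, while the 2 seconds reported at 30 fps corresponds to roughly 60 frames of buffering somewhere in the pipeline.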

HermanChen commented 1 month ago
  1. Use a small buffer to receive data from the network and notify the demuxer to parse data more frequently.
  2. Once the demuxer and parser have a whole frame, send it to the hardware decoder in frame mode so the decoder starts working as soon as possible.
  3. Set the decoder's fast_out flag so it outputs in decode order, meaning it decodes one frame and outputs one frame. Then get the output frame at the output port.
  4. Send the frame to the display and return it to the decoder once display has finished.
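The steps above could be sketched roughly against the MPP decoder API. This is untested pseudocode under assumptions: the helper functions `read_one_whole_frame()` and `send_to_display()` are hypothetical placeholders, and the control name `MPP_DEC_SET_IMMEDIATE_OUT` for the fast-out behavior should be verified against rk_mpi_cmd.h in this repo:

```c
/* Pseudocode sketch of the low-latency decode loop described above. */
MppCtx  ctx;
MppApi *mpi;
mpp_create(&ctx, &mpi);
mpp_init(ctx, MPP_CTX_DEC, MPP_VIDEO_CodingHEVC);

RK_U32 immediate = 1;                               /* step 3: output in decode order */
mpi->control(ctx, MPP_DEC_SET_IMMEDIATE_OUT, &immediate);

for (;;) {
    MppPacket pkt = read_one_whole_frame();         /* steps 1-2: small buffers, parse
                                                       per frame (hypothetical helper) */
    mpi->decode_put_packet(ctx, pkt);               /* start decoding ASAP */

    MppFrame frm = NULL;
    mpi->decode_get_frame(ctx, &frm);               /* one frame in, one frame out */
    if (frm) {
        send_to_display(frm);                       /* step 4 (hypothetical helper) */
        mpp_frame_deinit(&frm);                     /* return buffer to the decoder */
    }
}
```

The key point is that nothing in the loop waits for more than one frame of input before decoding or displaying.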
N0tiK44 commented 1 month ago
> 1. Use a small buffer to receive data from the network and notify the demuxer to parse data more frequently.
> 2. Once the demuxer and parser have a whole frame, send it to the hardware decoder in frame mode so the decoder starts working as soon as possible.
> 3. Set the decoder's fast_out flag so it outputs in decode order, meaning it decodes one frame and outputs one frame. Then get the output frame at the output port.
> 4. Send the frame to the display and return it to the decoder once display has finished.

I'm not sending data through a network; nothing here requires the internet at all. I don't think using FFmpeg is feasible, it just doesn't want to cooperate.

Honestly, GStreamer is the lowest-latency method I've found; everything else I've tried is worse. Here is the Orange Pi 5 GStreamer method.

Here's the test_hdmiin.sh file itself:

root@orangepi5plus:~# cat /usr/bin/test_hdmiin.sh

```bash
#!/bin/bash
trap 'onCtrlC' INT
function onCtrlC () {
    echo 'Ctrl+C is captured'
    killall gst-launch-1.0
    exit 0
}

device_id=$(v4l2-ctl --list-devices | grep -A1 hdmirx | grep -v hdmirx | awk -F ' ' '{print $NF}')
v4l2-ctl -d $device_id --set-dv-bt-timings query 2>&1 > /dev/null
width=$(v4l2-ctl -d $device_id --get-dv-timings | grep "Active width" | awk -F ' ' '{print $NF}')
heigh=$(v4l2-ctl -d $device_id --get-dv-timings | grep "Active heigh" | awk -F ' ' '{print $NF}')

es8388_card=$(aplay -l | grep "es8388" | cut -d ':' -f 1 | cut -d ' ' -f 2)
hdmi0_card=$(aplay -l | grep "hdmi0" | cut -d ':' -f 1 | cut -d ' ' -f 2)
hdmi1_card=$(aplay -l | grep "hdmi1" | cut -d ':' -f 1 | cut -d ' ' -f 2)
hdmiin_card=$(arecord -l | grep "hdmiin" | cut -d ":" -f 1 | cut -d ' ' -f 2)

if [[ $XDG_SESSION_TYPE == wayland ]]; then
    DISPLAY=:0.0 gst-launch-1.0 v4l2src device=${device_id} ! videoconvert \
        ! videoscale ! video/x-raw,width=1280,height=720 \
        ! waylandsink sync=false 2>&1 > /dev/null &
else
    DISPLAY=:0.0 gst-launch-1.0 v4l2src device=${device_id} io-mode=4 ! videoconvert \
        ! video/x-raw,format=NV12,width=${width},height=${heigh} \
        ! videoscale ! video/x-raw,width=1280,height=720 \
        ! autovideosink sync=false 2>&1 > /dev/null &
fi

gst-launch-1.0 alsasrc device=hw:${hdmiin_card},0 ! audioconvert ! audioresample ! queue \
    ! tee name=t ! queue ! alsasink device="hw:${hdmi0_card},0" \
    t. ! queue ! alsasink device="hw:${hdmi1_card},0" \
    t. ! queue ! alsasink device="hw:${es8388_card},0" &

while true
do
    sleep 10
done
```
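For what it's worth, the videoconvert and videoscale elements in that script run on the CPU and can add significant latency at high resolutions. A shorter pipeline that keeps the capture buffers in NV12 and hands them straight to a KMS sink might look like the following. This is an untested sketch: whether the sink accepts the format directly depends on the installed GStreamer plugins and display stack, and /dev/video20 is the capture device from the earlier FFmpeg command:

```
gst-launch-1.0 v4l2src device=/dev/video20 io-mode=4 \
    ! video/x-raw,format=NV12 \
    ! kmssink sync=false
```

kmssink bypasses the desktop compositor entirely, which removes another source of buffering, but it only works outside a running X/Wayland session.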

N0tiK44 commented 1 month ago

bumped

N0tiK44 commented 1 week ago

Is there literally no support?