RPi-Distro / vlc

Are there plans for hardware acceleration on 64-bit kernels? #47

Open Malvineous opened 3 years ago

Malvineous commented 3 years ago

I'm trying to get hardware accelerated video playback working on a Pi4 running a 64-bit kernel, but it doesn't look like there are any options yet.

I have tried VLC as recommended in the omxplayer README, but it uses 75% CPU playing an H264 file, which is the same as ffmpeg/ffplay in software mode. The VLC logs don't mention mmal, omx, vc4, v4l2m2m or anything else that suggests it is doing hardware decoding, other than that it uses OpenGL output.
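The log check described here can be repeated on a saved verbose log with a few lines of Python. This is only a sketch: the keyword list comes from the comment itself, while the function name and the sample log lines are made up for illustration and are not real VLC output.

```python
# Keywords taken from the comment; a hit on any of them hints at a hardware path.
HW_KEYWORDS = ("mmal", "omx", "vc4", "v4l2m2m")


def find_hw_hints(log_text):
    """Return {keyword: [matching lines]} for each keyword found in the log."""
    hits = {}
    for line in log_text.splitlines():
        lowered = line.lower()  # match case-insensitively
        for keyword in HW_KEYWORDS:
            if keyword in lowered:
                hits.setdefault(keyword, []).append(line.strip())
    return hits


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = "decoder: using software decoder\nvout: using OpenGL output\n"
    print(find_hw_hints(sample) or "no hardware decoding keywords found")
```

An empty result matches what is reported here: nothing in the log points at a hardware decoder.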

ffmpeg and ffplay support hardware decoding via h264_v4l2m2m, but this only drops CPU use to 72%, so I was hoping VLC might let me get close to 0% like omxplayer apparently could.
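Whether a given ffmpeg build includes the V4L2 mem2mem decoders at all can be checked by listing its decoders. The sketch below assumes `ffmpeg` is on the PATH and that decoder rows in `ffmpeg -decoders` output have the form "flags, name, description"; the function names are hypothetical.

```python
import subprocess


def v4l2m2m_decoders(decoders_text):
    """Pick decoder names ending in _v4l2m2m out of `ffmpeg -decoders` output."""
    names = []
    for line in decoders_text.splitlines():
        fields = line.split()
        # Assumed row layout: "<flags> <name> <description...>"
        if len(fields) >= 2 and fields[1].endswith("_v4l2m2m"):
            names.append(fields[1])
    return names


def list_v4l2m2m_decoders(ffmpeg="ffmpeg"):
    """Run ffmpeg and return the V4L2 mem2mem decoders it was built with."""
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-decoders"],
        capture_output=True, text=True, check=True,
    )
    return v4l2m2m_decoders(result.stdout)
```

Calling `list_v4l2m2m_decoders()` on the Pi should include `h264_v4l2m2m` if the decoder mentioned here is available. It says nothing about how the decoded frames reach the display.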

Are there any plans to make this work on a 64-bit kernel, or am I doing something wrong?

jc-kynesim commented 3 years ago

There are plans, and you are not doing anything wrong. The issue is much less decode and a lot more display. 64-bit is KMS only, which means that the mmal-based display path isn't available anymore, and you need that if you have 4k material. Getting the required DRM leases so I can have video+overlays if you have X is proving to be a challenge. If you happen to be an expert on DRM leases I'd love to hear from you :-)

Malvineous commented 3 years ago

Very interesting, thanks for the quick reply! I have no experience with DRM but would be happy to take a look and see how hard it is to learn. Are there any docs on how to find and (cross) compile the relevant code? Is it VLC-specific or would it be usable by other media players too? I seem to have the most success with ffplay so it'd be great if that were usable too.

This is probably a naive question, but is it impractical to implement an X-Video/xv style overlay and just program the VC4 to draw directly to the display where the overlay is? Would that remove the need for DRM, or do you still need it to get the pixel data passed to the xv surface? Sorry if that's a terribly silly question :)

jc-kynesim commented 3 years ago

Heh! If DRM and its interaction with X was well documented (in any place I can find) or easy to understand from the code then we wouldn't be having this conversation.

The primary issue is that getting hold of the relevant handles to let the HVS composite for you is hard, and things like X have a hard grip on them. I have managed to persuade X to let go of the primary plane (with xcb_randr_create_lease) but I can't seem to find out how to get at the overlay planes from there (the h/w doesn't work like that, but that seems to be the abstraction I am stuck with). In theory Wayland should also dig me out of this problem, but at the moment (in the upcoming bullseye) that only seems to expose RGB planes and does s/w compositing anyway (or so I am told) :-(

jc-kynesim commented 3 years ago

I admit to not having looked at xv. I always assumed that it just did s/w compositing & so was useless to me. If you think it actually does h/w compositing I'll have another look.

Malvineous commented 3 years ago

Oh sorry I meant the code you are working on so I can experiment with it, compile it and test it! But it sounds like there's a bit of a steep learning curve so you may well have it solved by the time I work out how the code works.

I don't know a lot about Xv but it looks like if you're not using a compositing window manager then it uses chroma keying to render the video. So that suggests no software compositing. However the info I found about it says that it is an older method and newer methods of rendering video such as OpenGL are preferable. It depends I guess on how efficient OpenGL would be on the Pi hardware vs chroma-key style rendering.

I saw on the Pi forums that apparently the "OpenGL block" can render YUV data produced by the h264_v4l2m2m decoder, so I'm guessing the DRM problem you're talking about involves getting the data from the M2M decoder and into an OpenGL texture? Or does the DRM stuff sit at a different point in the pipeline?

Malvineous commented 3 years ago

Also I presume you've already asked your questions on the DRM mailing list but didn't get much of a response? https://lists.freedesktop.org/mailman/listinfo/dri-devel

jc-kynesim commented 3 years ago

The GL route sort-of works and is used for display to X when I can. Its limitations are: (1) the h/w can't actually render a destination image much bigger than HD @ 60fps, so great if you have an HD monitor, not so great for 4k; (2) it doesn't currently support SAND format, which is what the H265 decode produces, though there is a patch that gets 8-bit SAND working, and 10-bit was being worked on, though progress reports have gone quiet so I'm not holding out hope. The only route that can display 4k in real time on a Pi4, preferably with an overlay for controls, is direct output to the HVS.
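To put the first limitation in numbers, here is a rough pixel-rate comparison. It assumes "HD" means 1920x1080 and "4k" means 3840x2160, both at 60fps. It only shows the scale of the gap and is not a statement of the actual hardware limit.

```python
def pixel_rate(width, height, fps):
    """Pixels that must be written to the destination per second."""
    return width * height * fps


hd = pixel_rate(1920, 1080, 60)   # 124,416,000 pixels/s
uhd = pixel_rate(3840, 2160, 60)  # 497,664,000 pixels/s

print(hd, uhd, uhd / hd)  # 124416000 497664000 4.0
```

Under those assumptions a 4k destination needs four times the fill rate of an HD one.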

nagualcode commented 2 years ago

If VLC does not support hardware acceleration, maybe this should be mentioned when suggesting that users migrate from omxplayer...

jc-kynesim commented 2 years ago

The next version of VLC (real soon now) should get you h/w accel where it is supported in most circumstances on 64 and 32 bit. The pathways are always going to be less efficient than those set up by OMXplayer and there are limitations imposed by Linux that mean that some functionality is, at best, hard to achieve so it isn't going to be able to do exactly what OMXplayer did.

parheliamm commented 2 years ago

I am looking forward to the new version.

seb3s commented 1 month ago

Hello, does VLC support HW accel on rpi3 hardware? And is it possible to display a video directly on the HDMI port without any GUI (CLI only)? Buildroot recently dropped support for omxplayer because of an ffmpeg upgrade... (tears dropping on the floor) So I need to find a viable solution soon :-) Cheers

Malvineous commented 1 month ago

My viable solution in the end was to purchase an RPi5 and not bother with hardware acceleration. A 1080p video takes about 22% CPU to decode in software. I run an empty X11 server so there is somewhere to draw the video, and I get a full-screen 1080p video on the HDMI output. I ended up using ffplay, but VLC would probably work fine too. I didn't need its advanced features as this was for a non-interactive security camera live display.

I also can stream two 1080p streams drawn side by side on the same 1080p display, which uses about 45% CPU, all software decoding. Haven't tried 4K yet.

ffplay on an RPi4 also uses about 30% CPU with h264_v4l2m2m HW decoding, but IIRC I could only get one stream decoding; I don't think I could get two playing at the same time (don't quote me on that, I could be misremembering). I think this also works on a Pi3. I use ffplay -codec:v h264_v4l2m2m. ffplay apparently supports output to an fbdev device but it may be slow, so you might have to use X11 to get the OpenGL hardware-accelerated output.

The Pi5 is substantially more powerful than the Pi4, so I would recommend the 5 over the 4 for anything video related if possible. The hardware acceleration was always problematic, so not having to deal with it on a Pi5 also makes things significantly easier.

The full command I use (in case you want to test on the Pi3 or benchmark against VLC) is:

ffplay -codec:v h264_v4l2m2m -fflags +nobuffer -i udp://1.2.3.4:5004 -aspect 1920:1200 -vf crop=1583:990:0:90 -f xv -window_x 0 -window_y 0 -window_size 1920x1200

Obviously the crop and scaling sizes would be different in your case, or may be omitted entirely. The window X, Y and sizes are specified because I'm not running a window manager to keep things lean, so I have to manually indicate that the window should take up the whole screen.
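For anyone adapting that command to a different screen or crop, here is a small Python sketch that rebuilds the same argument list from parameters. It uses only the options that appear in the command above and does not vouch for them beyond that. The function name and its parameters are hypothetical.

```python
import shlex


def ffplay_fullscreen_args(url, screen_w, screen_h, crop=None,
                           decoder="h264_v4l2m2m"):
    """Build the ffplay argument list from the command quoted above.

    crop is an optional (width, height, x, y) tuple; omit it to skip cropping.
    """
    args = ["ffplay", "-codec:v", decoder, "-fflags", "+nobuffer",
            "-i", url, "-aspect", f"{screen_w}:{screen_h}"]
    if crop is not None:
        w, h, x, y = crop
        args += ["-vf", f"crop={w}:{h}:{x}:{y}"]
    # No window manager is running, so the window is placed and sized by hand.
    args += ["-f", "xv", "-window_x", "0", "-window_y", "0",
             "-window_size", f"{screen_w}x{screen_h}"]
    return args


if __name__ == "__main__":
    cmd = ffplay_fullscreen_args("udp://1.2.3.4:5004", 1920, 1200,
                                 crop=(1583, 990, 0, 90))
    print(shlex.join(cmd))
```

The returned list can be passed straight to `subprocess.run`, which avoids shell quoting problems if the stream URL contains unusual characters.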

popcornmix commented 1 month ago

Hello, does VLC support HW accel on rpi3 hardware ? And is it possible to display a video directly on the hdmi port without any GUI (CLI only) ?

Yes.

seb3s commented 1 month ago

Hi @popcornmix and all, another question: we use the "layer" capability of omxplayer. Is there something similar in VLC? We need to display an overlay (we use the Elixir Scenic project for that) on top of a video running in the background.

jc-kynesim commented 1 month ago

No, I'm afraid you are out of luck. The MMAL/OMX interfaces that used to allow direct access to the HVS h/w no longer exist, and window/layer generation has to go through "official" interfaces. Here your choices are DRM or Wayland (or X). DRM is good and fast, and if it could be shared between multiple masters it would do what you want, but it can't - it is single master only, so you can have VLC or your other app but not both. Wayland should have the capacity to do what you want, but as it stands the passthroughs to create a video layer that is directly rendered by the HVS don't exist (except in the special case of a single full-screen opaque plane - which would obscure your other app). Wayland might plausibly get there in the future, but right now it hasn't.
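The single-master restriction can be observed from user space by asking the kernel for DRM master on a card node. The sketch below makes several assumptions: the ioctl numbers are taken from the kernel's `drm.h` as I understand it (`DRM_IO(0x1e)` and `DRM_IO(0x1f)`) and use the ARM/x86 `_IO` encoding, the node is `/dev/dri/card0`, and the function name is made up. Check all of these against your own kernel headers before relying on it.

```python
import fcntl
import os

# Assumed values: DRM ioctls use base 'd', and _IO(type, nr) encodes as
# (type << 8) | nr on ARM and x86. Verify against drm.h on your system.
DRM_IOCTL_SET_MASTER = (ord("d") << 8) | 0x1E
DRM_IOCTL_DROP_MASTER = (ord("d") << 8) | 0x1F


def probe_drm_master(card="/dev/dri/card0"):
    """Return 'unavailable', 'master', or 'refused' for the given DRM node."""
    try:
        fd = os.open(card, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return "unavailable"  # node missing or not accessible
    try:
        fcntl.ioctl(fd, DRM_IOCTL_SET_MASTER)
        fcntl.ioctl(fd, DRM_IOCTL_DROP_MASTER)  # give it back straight away
        return "master"
    except OSError:
        # Typically another process (X, a compositor, a player) already
        # holds master, or this process lacks permission.
        return "refused"
    finally:
        os.close(fd)
```

If the assumptions hold, running `probe_drm_master()` while another application owns the display should return `"refused"`. That is the situation described here: only one process at a time can drive the planes through DRM.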