sakaki- / gentoo-on-rpi-64bit

Bootable 64-bit Gentoo image for the Raspberry Pi4B, 3B & 3B+, with Linux 5.4, OpenRC, Xfce4, VC4/V3D, camera and h/w codec support, weekly-autobuild binhost
GNU General Public License v3.0

low video performance #63

Closed zd59 closed 5 years ago

zd59 commented 5 years ago

Hi, I would like to use an RPi 3B+ with your Gentoo arm64 image as a video player. I tested performance with VLC installed: it is very poor, as there is no hardware acceleration. I used OpenGL or automatic under Video --> Output. VLC plays HD x264 video with a lot of frame drops and all 4 cores at > 95% load. So I checked glxgears, about which you noted:

The kernel and userland are both 64-bit (arm64/aarch64), and support for the Pi's VC4 GPU has been included (using vc4-fkms-v3d / Mesa), so rendering performance is reasonable (e.g., glxgears between 400 and 1200fps, depending on load; real-time video playback)

glxgears with absolutely no load shows 300 frames in 5 seconds, i.e. 60 FPS - that is up to twenty times less than you stated. Why?
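As a quick sanity check on the figures above, the arithmetic can be sketched as follows. The variable names are illustrative only; the 400 to 1200 fps range is the one quoted from the project description.

```python
# Sanity check of the frame-rate figures reported above.
frames = 300          # frames reported by glxgears
seconds = 5.0         # reporting interval
measured_fps = frames / seconds

quoted_low_fps = 400    # lower bound quoted in the project description
quoted_high_fps = 1200  # upper bound quoted in the project description

# How far the measured rate falls short of each end of the quoted range.
ratio_low = quoted_low_fps / measured_fps
ratio_high = quoted_high_fps / measured_fps

print(f"measured: {measured_fps:.0f} fps")
print(f"shortfall: {ratio_low:.1f}x to {ratio_high:.1f}x")
```

The measured rate is therefore roughly 7 to 20 times below the quoted range, depending on which end of the range it is compared with.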

Note: all settings are at their defaults, including /boot/config.txt, and yes, I turned off the compositor. The video and arm64 options at the end of /boot/config.txt are:

dtoverlay=vc4-fkms-v3d,cma-256
# per https://github.com/anholt/mesa/issues/56#issuecomment-263283300
# gpu_mem is for closed-source driver only; since we are only using the
# open-source driver here, set low
gpu_mem=16

# force 64-bit mode, per https://wiki.gentoo.org/wiki/Raspberry_Pi
arm_control=0x200
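To double-check which of these options are actually in effect, the fragment can be read programmatically. The following is only a minimal sketch: the helper name parse_config is hypothetical, the sample text is copied from the lines above (with comments shortened), and real config.txt files may also contain section filters and repeated dtoverlay lines, which this sketch does not handle.

```python
# Minimal sketch: read key=value settings from a config.txt fragment.
# SAMPLE mirrors the options listed above; parse_config is a hypothetical helper.
SAMPLE = """\
dtoverlay=vc4-fkms-v3d,cma-256
# gpu_mem is for closed-source driver only
gpu_mem=16

# force 64-bit mode
arm_control=0x200
"""

def parse_config(text):
    """Return a dict of settings, skipping blank lines and '#' comments.

    Assumption: each key appears once; a repeated key keeps only the last value.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings

settings = parse_config(SAMPLE)

# Split the overlay name from its comma-separated parameters.
overlay, *params = settings["dtoverlay"].split(",")

print("overlay:", overlay, "params:", params)
print("gpu_mem:", int(settings["gpu_mem"]))
print("arm_control:", hex(int(settings["arm_control"], 16)))
```

On a real system the text would be read from /boot/config.txt rather than from an embedded string.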

I have no explanation for such low video performance. Do you have any advice?

sakaki- commented 5 years ago

The quoted glxgears performance was for the default window size. If you make the window e.g. fullscreen your performance will be lower, and will depend upon the resolution of your display.
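To give a rough feel for how much more work a larger window involves, the pixel counts can be compared. This is only an illustrative sketch: the 300 x 300 default glxgears window size and the 1600 x 1080 example resolution are assumptions, and frame rate does not necessarily scale linearly with pixel count.

```python
# Illustrative comparison of pixels drawn per frame.
# Assumptions: glxgears opens a 300 x 300 window by default, and the
# display is 1600 x 1080; adjust both for the actual setup.
default_w, default_h = 300, 300
screen_w, screen_h = 1600, 1080

default_pixels = default_w * default_h
screen_pixels = screen_w * screen_h

pixel_ratio = screen_pixels / default_pixels

print(f"default window: {default_pixels} pixels")
print(f"fullscreen:     {screen_pixels} pixels")
print(f"ratio:          {pixel_ratio:.1f}x")
```

Under these assumptions a fullscreen window covers about 19 times as many pixels as the default one, which is why the two cases are not directly comparable.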

Also, try playing the file with SMPlayer instead - does that show any better performance?

zd59 commented 5 years ago

The full-screen resolution is 1600 x 1080, and glxgears shows only a small window with gears. The info below looks OK:

zd@pi64 ~ $ glxinfo -B
name of display: :0.0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Broadcom (0x14e4)
    Device: VC4 V3D 2.1 (0xffffffff)
    Version: 18.2.4
    Accelerated: yes
    Video memory: 968MB
    Unified memory: yes
    Preferred profile: compat (0x2)
    Max core profile version: 0.0
    Max compat profile version: 2.1
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 2.0
OpenGL vendor string: Broadcom
OpenGL renderer string: VC4 V3D 2.1
OpenGL version string: 2.1 Mesa 18.2.4
OpenGL shading language version string: 1.20

OpenGL ES profile version string: OpenGL ES 2.0 Mesa 18.2.4
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16
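The relevant fields of this output can also be checked with a short script. The sketch below is an assumption-laden illustration: parse_glxinfo and gl_is_accelerated are hypothetical helper names, and the sample is an excerpt of the listing above. Note that it only reports whether GL rendering is accelerated; it says nothing about hardware video decoding.

```python
# Minimal sketch: pull "key: value" fields out of `glxinfo -B` output.
# SAMPLE is an excerpt of the listing above; helper names are hypothetical.
SAMPLE = """\
name of display: :0.0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Broadcom (0x14e4)
    Device: VC4 V3D 2.1 (0xffffffff)
    Version: 18.2.4
    Accelerated: yes
OpenGL renderer string: VC4 V3D 2.1
"""

def parse_glxinfo(text):
    """Map each 'key: value' line to a dict entry; skip lines with no value."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values like ':0.0' stay intact.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields

def gl_is_accelerated(fields):
    """True if both direct rendering and the 'Accelerated' flag say yes."""
    return (fields.get("direct rendering", "").lower() == "yes"
            and fields.get("Accelerated", "").lower() == "yes")

fields = parse_glxinfo(SAMPLE)
print(fields["Device"], gl_is_accelerated(fields))
```

In practice the text would come from running glxinfo -B and capturing its standard output, rather than from an embedded string.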

SMPlayer is even worse: the same video plays in slow motion, with several distorted frames in a row.

sakaki- commented 5 years ago

Is there an example video file (of the appropriate format) you can point me to online somewhere, so I can download to try to reproduce the playback issues?

zd59 commented 5 years ago

I tested some HD x264 online videos. VLC: bad, CPU load almost 100%, dropping frames. SMPlayer: much better, CPU load less than 40%.

I want to watch movies (HD, x264), but neither player can play them. I'm surprised, as plain Raspbian compiled for ARMv6 plays those files excellently with VLC 2.2.8 compiled/optimized for the GPU of the Raspberry Pi 3B+. The manual is here: http://www.x90x90x90.com/en/raspberry-pi-3-howto-compile-vlc-with-hardware-acceleration/ There, VLC uses OpenMAX IL as the video output, not OpenGL.

I think that if that works on Raspbian, it should fly like a rocket on your Gentoo for ARMv8.

sakaki- commented 5 years ago

Ah, no, there is an issue with all 64-bit distros at the moment (not just Gentoo), in that MMAL / OpenMAX IL is not supported. If you want hardware decoding and direct overlay rendering for optimal video playback performance, you'd need to use a 32-bit system, as things currently stand.

What you do get with the VC4 support on the gentoo-on-rpi3-64bit image (via Eric Anholt's driver) is a fast (GL) pipeline down which to send (rendered) frames for display (plus 3d acceleration of course); but the initial decoding of those frames currently has to happen in software. See e.g. https://github.com/raspberrypi/firmware/issues/550#issuecomment-190803961 ff, this post etc.

Perhaps a 32-bit system would better suit your use case atm?

zd59 commented 5 years ago

Thank you Sakaki!

So this is the maximum possible on that system at the moment. Case closed.