RPi-Distro / repo

Issue tracking for the archive.raspberrypi.org repo
37 stars 1 forks source link

VNC connection and vblank wait bug (extreme high application lags) #332

Closed qrp73 closed 7 months ago

qrp73 commented 11 months ago

The bug description

I'm using Raspberry Pi 4B 4G revision 1.5. I don't have hdmi display, so I'm using VNC connection.

I installed the latest 2023-05-03-raspios-bullseye-arm64.img and I catch the issue with many programs like chromium and other open gl based. The problem here is that these programs have extreme lags when you start it from VNC connection.

I tested also 2023-05-03-raspios-bullseye-armhf.img it has the same issue, or even worse.

I investigated the problem and found that if you connect through VNC and there is no physical display connected to rpi4, the video driver uses vblank sync at frame rate 1 Hz for some unknown reason, so all programs which behavior is based on display frame rate have extreme lags.

Current workaround is to run these programs from console with vblank_mode=0, for example:

vblank_mode=0 chromium-browser

it solves the issue and program works without lags.

But starting program from console with adding vblank_mode=0 is a very inconvenient and crooked way.

Could you please fix that bug by adding user setting for default display frame rate for VNC connection? (when physical display is not connected to hdmi) Or at least please replace hardcoded 1 Hz with hardcoded no wait for vblank or at least use hardcoded adequate frame rate, for example 60 Hz. I think the best way is to just skip vblank sync wait when app running through VNC connection and there is no physical display to determine display frame rate.

Steps to reproduce the behaviour

1) Install 2023-05-03-raspios-bullseye-arm64.img or 2023-05-03-raspios-bullseye-armhf.img on sdcard 2) Run RPI4 in headless mode (do not attach display to hdmi) and enable VNC through SSH 3) Connect with VNC client (I'm using tigervnc) 4) Start chromium-browser 5) Notice that it has exterme high lags (freezes for 1-2 seconds every 1-2 sec and on every user action) with almost zero cpu load, and it doesn't allow to see video on yotube.

Actual result: application working with waiting for vblank with refresh rate 1 Hz, as result it leads to extreme high lags in different graphics applications such as chromium-browser.

Expected result: application working with no waiting for vblank at all, or at least using waiting with adequate frame rate, for example 60 fps.

Device (s)

Raspberry Pi 4 Mod. B

System

$ cat /etc/rpi-issue Raspberry Pi reference 2023-05-03 Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 7c750947a959fb626a70c09fd17c65815df192ac, stage4 $ vcgencmd version Mar 17 2023 10:50:39 Copyright (c) 2012 Broadcom version 82f3750a65fadae9a38077e3c2e217ad158c8d54 (clean) (release) (start) $ uname -a Linux raspi 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr 3 17:24:16 BST 2023 aarch64 GNU/Linux

Logs

$ chromium-browser [144592:144592:0729/070218.521420:ERROR:chrome_browser_cloud_management_controller.cc(162)] Cloud management controller initialization aborted as CBCM is not enabled. MESA-LOADER: failed to retrieve device information MESA-LOADER: failed to retrieve device information MESA-LOADER: failed to retrieve device information [144645:144645:0729/070222.712710:ERROR:sync_control_vsync_provider.cc(139)] Calculated bogus refresh interval=0.987903 s, lasttimebase=66549017266 bogo-microseconds, timebase=66550005169 bogo-microseconds, last_media_streamcounter=66549, media_stream_counter=66550 [144645:144645:0729/070223.731201:ERROR:sync_control_vsync_provider.cc(139)] Calculated bogus refresh interval=1.01908 s, lasttimebase=66550005169 bogo-microseconds, timebase=66551024252 bogo-microseconds, last_media_streamcounter=66550, media_stream_counter=66551 [144645:144645:0729/070224.711158:ERROR:sync_control_vsync_provider.cc(139)] Calculated bogus refresh interval=0.979255 s, lasttimebase=66551024252 bogo-microseconds, timebase=66552003507 bogo-microseconds, last_media_streamcounter=66551, media_stream_counter=66552 [144645:144645:0729/070225.711020:ERROR:sync_control_vsync_provider.cc(139)] Calculated bogus refresh interval=1.00047 s, lasttimebase=66552003507 bogo-microseconds, timebase=66553003974 bogo-microseconds, last_media_streamcounter=66552, media_stream_counter=66553 [144645:144645:0729/070226.711805:ERROR:sync_control_vsync_provider.cc(139)] Calculated bogus refresh interval=0.978528 s, lasttimebase=66553025959 bogo-microseconds, timebase=66554004487 bogo-microseconds, last_media_streamcounter=66553, media_stream_counter=66554 [144645:144645:0729/070227.712375:ERROR:sync_control_vsync_provider.cc(139)] Calculated bogus refresh interval=1.00085 s, lasttimebase=66554004487 bogo-microseconds, timebase=66555005336 bogo-microseconds, last_media_streamcounter=66554, media_stream_counter=66555 [144645:144645:0729/070228.745964:ERROR:sync_control_vsync_provider.cc(139)] Calculated bogus refresh interval=1.03362 s, lasttimebase=66555005336 bogo-microseconds, timebase=66556038960 bogo-microseconds, last_media_streamcounter=66555, media_stream_counter=66556 [144645:144645:0729/070230.712380:ERROR:sync_control_vsync_provider.cc(139)] Calculated bogus refresh interval=0.965384 s, lasttimebase=66557040109 bogo-microseconds, timebase=66558005493 bogo-microseconds, last_media_streamcounter=66557, media_stream_counter=66558

Note

First I submitted this issue to https://github.com/raspberrypi/linux/issues/5556, but they said that this bug is related to RPi-Distro and suggested to recreate the issue here, where the right devs should see it.

Temporary workaround

As temporary workaround I added the following lines into ~/.profile:

# disable vblank sync
export vblank_mode=0

After reboot it allows to connect with VNC and lags disappears. But it requires to remove these lines when you want to use usual display and add it when you want to use VNC.

XECDesign commented 11 months ago

Thanks for the detailed report. I've forwarded it on.

qrp73 commented 11 months ago

just for clarification. vblank sync should be skipped only when app window is displayed on a virtual display in headless mode (when there is no connected display to hdmi port, in other words when there is no hardware source for vblank event and display refresh rate is unknown). When display is connected to hdmi, it already has proper vblank event source and should works as it currently works, in other words with connected display it should wait for hardware vblank event from display (if vblank_mode is not configured to skip it and app didn't requested to skip it through explicit call).

popcornmix commented 11 months ago

Does disabling glamor avoid the slowdown? See this post.

qrp73 commented 11 months ago

I don't know how to check if glamor is enabled or disabled. The option for glamor is missing from raspi-config. It seems that this option is not available on RPI 4.

Here is output from Xorg.log:

cat /var/log/Xorg.0.log | grep glamor
[    20.048] (II) Loading sub module "glamoregl"
[    20.048] (II) LoadModule: "glamoregl"
[    20.048] (II) Loading /usr/lib/xorg/modules/libglamoregl.so
[    20.303] (II) Module glamoregl: vendor="X.Org Foundation"
[    25.537] (II) modeset(0): glamor X acceleration enabled on V3D 4.2
[    25.537] (II) modeset(0): glamor initialized

But I tried workaround mentioned in that post: first I disabled vblank_mode=0 workaround and then replaced if condition with true in /usr/lib/systemd/scripts/glamor_test.sh as mentioned in that post, then reboot. It doesn't help. The lags are still here.

At a glance it looks that both issues may have the same root of cause. It looks that the root of cause is that opengl driver waits for vblank hardware event when hardware display is missing from the system (physically not connected to hdmi connector), so vblank hardware event never happens and that wait breaks after 1 second timeout. As result it leads to render picture at 1 fps frame rate. The proper way is to check if hardware display is connected to hdmi connector and if hardware display is missing then just skip the code for wait vblank event from display.

Where is vblank event wait code is implemented?

qrp73 commented 11 months ago

just found mention for this bug on forums.raspberrypi.com: https://forums.raspberrypi.com/viewtopic.php?t=337926

popcornmix commented 11 months ago

Where is vblank event wait code is implemented?

That is the question. My guess is mesa as I think the issue only occurs for 3d accelerated apps, but there are many layers of graphics and composition (and the kernel) that could have an effect.

We'll try to get a mesa contact to investigate (but they may well come back and say it's not here).

qrp73 commented 11 months ago

it seems that the problems happens here: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/egl/drivers/dri2/platform_x11.c#L1218

qrp73 commented 10 months ago

I think the good solution for this issue will be to use some virtual display if there is no display connected to rpi and VNC connection is active.

txenoo commented 10 months ago

I can reproduce this issue without using VNC. I think that the root cause for this is having the DPMS enabled on the Xserver. As it detects that there is no monitor it just suspends it, the applications detect that and as nothing is shown on display the vblank is only happening at 1 FPS.

You can check if you have DPMS enabled, here I have it without the HDMI plugged.

pi:~$ DISPLAY=:0.0 xset -q|grep DPMS -A1
DPMS (Energy Star):
  Standby: 600    Suspend: 600    Off: 600
  DPMS is Enabled
  Monitor is Off

If I launch glxgears I get this output.

pi$ DISPLAY=:0.0 glxgears
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
5 frames in 5.0 seconds =  1.000 FPS
5 frames in 5.0 seconds =  1.000 FPS
6 frames in 6.0 seconds =  1.000 FPS
5 frames in 5.0 seconds =  1.000 FPS

If you see something like this is that DMPS has set "your monitor off".

If you run:

DISPLAY=:0.0 xset dpms force on

You will see in the terminal you launched glxgears that FPS increase to ~60.

And if you run again:

DISPLAY=:0.0 xset dpms force off

You will see again the 1FPS glxgears.

One solution could be just disable DPMS using raspi-config -> (2 Display Options -> D4 Screen Blanking -> No

I've also needed in the past a HDMI dummy plug to avoid this issue that also happens on headless windows machines.

UPDATE: Note that this solution reduces the CPU/GPU usage, compared to "blank_mode=0". Changing the blank_mode to 0 will force the Mesa driver to render as much frames it can so you could getting to the limit your GPU/CPU for every process using the GPU. Disabling DPMS you are just forcing your CPU/GPU to render the 60FPS that you need.

qrp73 commented 7 months ago

not relevant anymore