TigerVNC / tigervnc

High performance, multi-platform VNC client and server
https://tigervnc.org
GNU General Public License v2.0

Display is not refreshed unless mouse movement over time #1838

Closed joeg1484 closed 1 month ago

joeg1484 commented 1 month ago

Describe the bug

The client connects with vncviewer and starts without issues. Over time, however, the connection stops updating the screen unless you move the mouse over the viewer window. The setup is kiosk mode with GNOME, view only.

We are using it on a local LAN, so just a couple of switches on a 40G fiber network. We see traffic passing between the systems using tcpdump, but when the viewer stops updating, all VNC traffic stops until you move the mouse over the viewer; then traffic resumes.

We suspected packet drops and TCP retransmits, but the network team has assured us that there are no errors on the 40G switches or on the ports the systems are connected to.
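For anyone chasing the same symptom, a stall like this can be confirmed from a capture by looking for the longest silence between packets. The snippet below is only a sketch: it assumes the capture was printed with `tcpdump -tt`, which puts epoch-second timestamps at the start of each line, and the helper names `parse_epoch_lines` and `longest_gap` are hypothetical, not part of any TigerVNC tooling.

```python
def parse_epoch_lines(lines):
    """Collect the leading epoch timestamp from each line (as in `tcpdump -tt` output).

    Lines that are empty or do not start with a number are skipped.
    """
    stamps = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        try:
            stamps.append(float(fields[0]))
        except ValueError:
            continue
    return stamps


def longest_gap(timestamps):
    """Return (gap_seconds, gap_start) for the largest silence between packets.

    Returns (0.0, None) when fewer than two timestamps are given.
    """
    ordered = sorted(timestamps)
    best_gap, best_start = 0.0, None
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur - prev
        if gap > best_gap:
            best_gap, best_start = gap, prev
    return best_gap, best_start


# Illustrative lines only; addresses and times are made up.
sample = [
    "1700000000.000000 IP 10.0.0.1.5900 > 10.0.0.2.40000: tcp 120",
    "1700000000.500000 IP 10.0.0.2.40000 > 10.0.0.1.5900: tcp 10",
    "1700000060.500000 IP 10.0.0.1.5900 > 10.0.0.2.40000: tcp 120",
]
gap, start = longest_gap(parse_epoch_lines(sample))
print(f"longest silence: {gap:.1f} s, starting at {start}")
```

A gap of minutes or hours that ends exactly when the mouse enters the viewer would point away from the network and toward the server not generating updates.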

To Reproduce

Kill the vncviewer connection and reconnect. Initially it works fine, but it degrades over time.

The time it takes ranges from 5 minutes to several hours.

Expected behavior

We expect the viewing session to keep updating the screen for the duration of the connection.

Screenshots

Unfortunately, this is a classified environment, so I cannot provide screenshots, but I can get you the information you need.

Client (please complete the following information):

Server (please complete the following information):

Additional context

Here is the xorg.conf we are using on the server side:

```
Section "DRI"
    Mode 0666
EndSection

Section "ServerLayout"
    Identifier "TwinLayout"
    Screen 0 "metaScreen" 0 0
    Option "AllowNVIDIAGPUScreens" "true"
EndSection

Section "Monitor"
    Identifier "Monitor0"
    Option "DPMS" "false"
    Option "CustomEDID" "DFP:/etc/X11/edids/dell_32edid.bin"
EndSection

Section "Monitor"
    Identifier "Monitor1"
    Option "DPMS" "false"
    Option "CustomEDID" "DFP:/etc/X11/edids/extron4k_edid.bin"
EndSection

Section "ServerFlags"
    Option "StandbyTime" "0"
    Option "SuspendTime" "0"
    Option "OffTime" "0"
    Option "BlankTime" "0"
EndSection

Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
    BoardName "Quadro RTX 8000"
    BusID "PCI:33:0:0"
    #Option "DamageEvents" "True"
    Option "UseEDID" "true"
    Option "ConnectedMonitor" "DFP-0, DFP-1"
    Option "metamodes" "DFP-0: 3840x1440 +0+0, DFP-1: 3840x1440 +2560+0"
EndSection

Section "Screen"
    Identifier "metaScreen"
    Device "Device0"
    Monitor "Monitor0"
    DefaultDepth 24
    SubSection "Display"
        Modes "3840x2160"
    EndSubSection
EndSection
```

This system is one of about 200 we have in this environment; each has 2 monitors, sometimes 4. It is the only system we connect to remotely with VNC as a monitoring system. We are unable to test on other systems due to the nature of our business.

Please let me know if you need more information and I will provide it.

Thanks!

CendioOssman commented 1 month ago

If it stops updating after a while, isn't it just screen blanking that kicks in?

Can someone have a look at what's happening locally at the machine when things stop? Doesn't it resume if the local mouse is moved?

joeg1484 commented 1 month ago

> If it stops updating after a while, isn't it just screen blanking that kicks in?
>
> Can someone have a look at what's happening locally at the machine when things stop? Doesn't it resume if the local mouse is moved?

Thanks for the reply!

So these systems are on 24/7. There is no DPMS or power management enabled on them, and screen blanking has been disabled in GNOME as well. The issue can start within 5 minutes of running the viewer or after 2 hours; it's really random. But the longer it sits, the greater the chance that the screen stops refreshing.

This particular machine is set up to monitor processes and has a kiosk desktop that is live all the time. The server system is viewed by a lot of people, so the screen is never off.

The remote system is a similar setup, where remote viewers watch the updates coming from the server.

I have looked into this: when the screen stops updating on the remote system, I go over to the room where the server is, and the screen there is still updating as it should.

The screen has about 4 GNOME terminals with text constantly streaming across them, as well as a digital clock at the top so people can see the time in different time zones. There are also a couple of graphical windows showing images, so there is a little OpenGL in there, but we are not concerned about performance, just that the text, clock, and graphics show up.

joeg1484 commented 1 month ago

Another point: we don't see any errors in any logs when the screen stops updating. Nothing is shown in the vncviewer session either; we launched it from the CLI to see the output.

The screen simply stops updating until we move the mouse into the vncviewer window, and then it starts up again.

I let it sit all night last week. It stopped updating about an hour after I left work and didn't start again until I moved the mouse over the vncviewer window. Total time it was "frozen" was about 13 hours!

joeg1484 commented 1 month ago

Another piece of info I forgot to mention before....

The x0vncserver system (Onsite VNC Server/Client) is displaying a remote desktop using vncviewer, itself... So this server is connected to another remote system (Remote VNC Server) via VNC.

Here is kind of how it looks...

Remote Viewing System using vncviewer to ---> Onsite VNC Server/Client (Via x0vncserver) and is connected to (using vncviewer) ---> Remote VNC Server using vncserver.

The Onsite VNC Server/Client and Remote VNC Server connected using vncviewer -> vncserver never have issues with screen lag or freezing... Just from the Remote Viewing System ---> Onsite VNC Server/Client using x0vncserver.

Not sure how relevant this is because we just want to share the Onsite VNC server1's desktop using x0vncserver to the Remote Viewing System, so it shouldn't matter what that system is displaying on its screen - should it?

Sorry for the confusion.

CendioOssman commented 1 month ago

Okay, so if I understand you correctly, the only application running on the system with x0vncserver is just vncviewer? And that is the machine you quoted the Xorg configuration from?

Are both server systems running GNOME? And are both systems RHEL 8?

You mentioned on #1835 that you had a warning about missing DAMAGE. Is that still the case?

joeg1484 commented 1 month ago

> Okay, so if I understand you correctly, the only application running on the system with x0vncserver is just vncviewer? And that is the machine you quoted the Xorg configuration from?
>
> Are both server systems running GNOME? And are both systems RHEL 8?
>
> You mentioned on #1835 that you had a warning about missing DAMAGE. Is that still the case?

Hello again!

Yes, that is all true. The only application on the system running x0vncserver is vncviewer, under GNOME, and all the systems involved are RHEL 8.

I have enabled DAMAGE on the x0vncserver system following both the Xorg documentation and the NVIDIA documentation, but that doesn't seem to work: when I launch vncviewer, it still says DAMAGE is not detected.
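One way to see whether the X server itself advertises DAMAGE is to read the extension list that `xdpyinfo` prints. The sketch below parses that list; `list_extensions` and `run_xdpyinfo` are hypothetical helper names, and the sample text is illustrative rather than taken from this system. Note that this only shows what the X server offers. It does not show whether the x0vncserver binary was built with DAMAGE support.

```python
import subprocess


def run_xdpyinfo(display=None):
    """Run xdpyinfo (must be installed) and return its text output."""
    cmd = ["xdpyinfo"]
    if display:
        cmd += ["-display", display]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def list_extensions(xdpyinfo_text):
    """Return the extension names from the indented block that follows
    the 'number of extensions:' line in xdpyinfo output."""
    names = []
    in_block = False
    for line in xdpyinfo_text.splitlines():
        if line.startswith("number of extensions:"):
            in_block = True
            continue
        if in_block:
            if not line.startswith((" ", "\t")):
                break  # the indented extension block has ended
            # Drop the "(opcode: ...)" suffix printed with -queryExtensions.
            name = line.split("(opcode", 1)[0].strip()
            if name:
                names.append(name)
    return names


# Illustrative output only, shortened to three extensions.
sample = (
    "name of display:    :0\n"
    "number of extensions:    3\n"
    "    BIG-REQUESTS\n"
    "    DAMAGE\n"
    "    Generic Event Extension\n"
    "default screen number:    0\n"
)
print("DAMAGE advertised:", "DAMAGE" in list_extensions(sample))
```

On a live system the same check would be `"DAMAGE" in list_extensions(run_xdpyinfo(":0"))`, assuming the display is `:0`.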

I also tried lowering the polling interval (PollingCycle) from the default 30 to 25, then to 20, to see if that helped, but it doesn't. I also lowered the FPS from the default of 60 to 30.

My understanding is that PollingCycle is a time in milliseconds, so lowering it should make x0vncserver poll more often, subject to MaxProcessorUsage, which I left alone. We are not seeing an increase in CPU usage on the process when we do this, however, so I didn't bother raising the CPU options. The CPU usage of the process hovers around 10%, and the system is a Dell PowerEdge R7525 with 128 logical CPUs (with HT enabled), so 64 cores across 2 CPUs.
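The arithmetic behind that reading, as a small sketch: if PollingCycle is the number of milliseconds per polling cycle, the nominal polling rate is 1000 divided by the cycle length. The function name `polls_per_second` is made up for illustration, and the result is only an upper bound, since the actual interval may be stretched to respect MaxProcessorUsage.

```python
def polls_per_second(cycle_ms):
    """Nominal polling rate for a given cycle length in milliseconds."""
    if cycle_ms <= 0:
        raise ValueError("cycle length must be positive")
    return 1000.0 / cycle_ms


# The three values tried in this thread.
for ms in (30, 25, 20):
    print(f"{ms} ms cycle -> about {polls_per_second(ms):.1f} polls/s")
```

So going from 30 to 20 raises the nominal rate from roughly 33 to 50 polls per second, which would not by itself explain updates stopping entirely.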

Thanks!

CendioOssman commented 1 month ago

Since the middle system doesn't seem to require much performance, could you try uninstalling the Nvidia drivers? Let's see if that gets DAMAGE going again.

joeg1484 commented 1 month ago

> Since the middle system doesn't seem to require much performance, could you try uninstalling the Nvidia drivers? Let's see if that gets DAMAGE going again.

Yeah I think that should work... Should I simply remove the driver in the device section of the xorg.conf or actually remove the nvidia driver from the system - nvidia uninstall?

Thanks!

CendioOssman commented 1 month ago

A complete uninstallation is likely required to get rid of the kernel and OpenGL drivers.

joeg1484 commented 1 month ago

Hi @CendioOssman. I am unable to remove the NVIDIA driver at this time. There has been some attention on the system recently, and it needs to stay up for now. I will most likely be able to do it early next week.

However, I did upgrade the TigerVNC package from 1.12.0-6 to 1.13.1-8 from the RHEL 8 repos, and I don't see that message about missing DAMAGE anymore when I start x0vncserver.

There was a BZ report and some mention of libXdamage, libXfixes, and libXrandr in version 1.12.0-7, so I thought maybe a newer version might help.

VNC 1.12.0-7 fixes: 2022-05-31 Jan Grulich jgrulich@redhat.com - 1.12.0-5

- BR: libXdamage, libXfixes, libXrandr
Resolves: bz#2088733

I am unable to log in to view that BZ report for some reason, despite having a RH account LOL, but I'm going to roll with this for now and see how things go.

If necessary, I think I will be free to do the nvidia removal next week.

Joe

joeg1484 commented 1 month ago

Hi @CendioOssman,

Well, it looks like the updated packages fixed the issue, so this can be closed. Thanks for all the help!

Joe