vmware / open-vm-tools

Official repository of VMware open-vm-tools project
http://sourceforge.net/projects/open-vm-tools/

GNOME desktop freezing on VMware Workstation Pro for Windows host and Fedora guest (likely due to 3D acceleration) #176

Open hermidalc opened 7 years ago

hermidalc commented 7 years ago

Hello VMware Development Team,

I have a new Windows laptop running the latest Windows 10 (10.0.15063) and VMware Workstation 12.5.7 build-5813279. It uses hybrid graphics (integrated Intel and Nvidia GPUs via Nvidia Optimus) with default settings on a 4K display. VMware runs on the Intel HD 630 GPU (verified by the Nvidia GPU activity monitor never showing the VMware process). In the BIOS I've set the Intel GPU to the maximum dedicated video memory possible (512 MB). I created a new guest running the latest Fedora 25, with 2 GB max guest memory for graphics. open-vm-tools and open-vm-tools-desktop (10.1.5-4.fc25) installed automatically and successfully.

The GNOME desktop will randomly freeze, and I cannot do anything or regain control of it without a hard reboot of the VM. After a lot of troubleshooting to figure out what is causing the freezing, it appears to be due to 3D acceleration, and only in full screen mode. The freezing occurs in both Xorg and Wayland desktop sessions. If I turn off 3D acceleration or don't run the VM in full screen (4K) then it doesn't freeze, but this is very inconvenient as GNOME doesn't run well without 3D acceleration.

$ glxinfo | grep OpenGL
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on SVGA3D; build: RELEASE; LLVM;
OpenGL core profile version string: 3.3 (Core Profile) Mesa 17.0.5
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 17.0.5
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 17.0.5
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:
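
For anyone comparing guests, the two fields most worth pulling out of this output are the renderer string and the Mesa version. Below is a minimal Python sketch, assuming glxinfo's usual one-field-per-line output. `parse_glxinfo` and `mesa_version` are hypothetical helper names for illustration, not part of glxinfo or open-vm-tools.

```python
import re


def parse_glxinfo(text: str) -> dict:
    """Collect 'OpenGL <key>: <value>' lines from `glxinfo | grep OpenGL` output."""
    info = {}
    for line in text.splitlines():
        # Non-greedy key so that colons inside the value (e.g. "build: RELEASE") stay in the value.
        match = re.match(r"\s*OpenGL (.+?):\s*(.*)$", line)
        if match:
            info[match.group(1)] = match.group(2).strip()
    return info


def mesa_version(info: dict):
    """Return the Mesa version found in the 'version string' field, or None."""
    match = re.search(r"Mesa (\d+(?:\.\d+)+)", info.get("version string", ""))
    return match.group(1) if match else None


# Sample lines copied from the report above.
sample = """\
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on SVGA3D; build: RELEASE; LLVM;
OpenGL version string: 3.0 Mesa 17.0.5
"""

info = parse_glxinfo(sample)
print(info["renderer string"])
print(mesa_version(info))
```

A renderer string containing SVGA3D suggests the guest is using the VMware virtual GPU; with software rendering, Mesa typically reports llvmpipe instead.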

MaZZly commented 7 years ago

Similar build and same problem here: Windows 10 Pro, 1703, Build: 15063.483 Fedora 26.

Running full-screen (4k monitor) seems to bring freeze randomly. Sometimes I can work for an hour in the VM, but most times it freezes after several minutes.

When running windowed mode it doesn't seem to happen.

I haven't tried disabling the 3D acceleration.

MaZZly commented 7 years ago

After testing with 3D acceleration off yesterday I can confirm that no crashes happen during this time. Painfully slow to work with though..

hermidalc commented 7 years ago

I can confirm the same. I've tested identical Fedora 25 and 26 guest installs, using Xorg (not Wayland), with both open-vm-tools and the proprietary VMware Tools. The GNOME desktop freezes randomly after a few minutes of use unless I turn off VMware 3D acceleration, which honestly isn't a solution: GNOME needs 3D acceleration to be usable.

I also tested Fedora 24 with the same setup and it works and does not freeze.

josepmc commented 6 years ago

This also occurs on Ubuntu 17.04. A workaround is to modify /usr/share/xsessions/gnome.desktop and set:

Exec=env LIBGL_ALWAYS_SOFTWARE=1 gnome-session --session=gnome
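
The edit to gnome.desktop can also be scripted. Below is a minimal Python sketch, assuming the session file has a plain `Exec=` line. `force_software_gl` is a hypothetical helper name, and the sample file content is illustrative rather than copied from a real install. The function only transforms text; writing the result back to /usr/share/xsessions/gnome.desktop (as root) is left out on purpose.

```python
def force_software_gl(desktop_text: str) -> str:
    """Prefix the Exec= command with `env LIBGL_ALWAYS_SOFTWARE=1` unless already present."""
    prefix = "env LIBGL_ALWAYS_SOFTWARE=1 "
    out = []
    for line in desktop_text.splitlines():
        # Only the Exec= key is touched; TryExec= and other keys are left alone.
        if line.startswith("Exec=") and "LIBGL_ALWAYS_SOFTWARE" not in line:
            line = "Exec=" + prefix + line[len("Exec="):]
        out.append(line)
    return "\n".join(out) + "\n"


# Illustrative session file content (an assumption, not taken from the thread).
sample = """\
[Desktop Entry]
Name=GNOME
Exec=gnome-session --session=gnome
TryExec=gnome-session
"""

print(force_software_gl(sample))
```

Running the function twice gives the same result as running it once, so it should be safe to re-apply after a package update replaces the file.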

hermidalc commented 6 years ago

Thanks for the additional workaround @josepmc, may I ask though is this any different or better performing than my workaround where you disable VMware 3D hardware acceleration for the VM? Turning off hardware acceleration in GNOME makes it almost unusable.

I hope the open-vm-tools team can figure out why, in Fedora 25 and newer using Xorg (not Wayland) and I assume Ubuntu 17.04 and newer, there is this desktop freezing issue that makes it impossible to run these guest OSes in VMware.

josepmc commented 6 years ago

This does disable it system wide. However you can enable it for other applications by setting the environment variable to 0 for them.

This makes the system go smoother, but it's by far not the desired solution.

oliverkurth commented 6 years ago

Thanks for reporting. I believe this is an issue with Workstation (I may be wrong), and filed an internal bug.

hermidalc commented 6 years ago

I can also confirm that on the latest VMware Workstation 14 Pro (14.0.0 build-6661328) this major problem still exists. I created a brand new Fedora 26 guest OS using my exact same procedure mentioned above and after using the GNOME desktop for a few minutes in 4k resolution everything freezes.

thomashvmw commented 6 years ago

Hi! I tried to reproduce with an older hybrid graphics laptop but never saw the problem. Not running 4K, though.

Could anyone seeing this problem try one of the following suggestions to help pinpoint the problem:

1) In the laptop BIOS setup, under Video, turn off Nvidia Optimus and then relaunch Windows 10 and retry?

2) In the Nvidia control settings, select Nvidia graphics as the system-wide default instead of auto?

Thomas

hermidalc commented 6 years ago

Hi - sorry, I should've mentioned this: when troubleshooting I did turn off hybrid graphics in the BIOS and play around with the Nvidia control settings, and it doesn't matter; the GNOME desktop will still freeze.

thomashvmw commented 6 years ago

Could anyone post a vmware.log from a failing VM?

Is it possible to ssh to a failing VM and get the output of "dmesg"?

josepmc commented 6 years ago

It does happen under Intel graphics machines too.

hermidalc commented 6 years ago

@thomashvmw I've had a support ticket open with VMware regarding this issue and they had me generate all this extra logging of the failing VM, the logs don't seem to show anything pointing to the source of the problem. They had me make this /etc/vmware-tools/tools.conf file:

log = true

# Enable tools service logging to a file
vmtoolsd.level = debug
vmtoolsd.handler = file
vmtoolsd.data = /tmp/vmtoold.log

# Enable 'vmsvc' service logging to a file
vmsvc.level = debug
vmsvc.handler = file
vmsvc.data = /tmp/vmsvc.log

# Enable VMwareResolutionSet.exe logging to a file
vmresset.level = debug
vmresset.handler = file
vmresset.data = /tmp/vmresset.log

# Enable new "vmusr" service logging to a file
vmusr.level = debug
vmusr.handler = file
vmusr.data = /tmp/vmusr.${USER}.log

# Enable the 'vmvss' snapshot service logging to a file
vmvss.level = debug
vmvss.handler = file
vmvss.data = /tmp/vmvss.log

And give them the logs. About dmesg, I can generate a dmesg output, apologies since I don't know much about it, should I generate it after the guest OS GNOME locks up or anytime?

hermidalc commented 6 years ago

Ok I started up the new Fedora 26 guest OS I made and after a couple minutes as usual GNOME froze, I sshed into it and ran dmesg, here's the output

dmesg.log

Seems like a clue here, line 1715:

[   25.181322] vmtoolsd[2141]: segfault at 14c0 ip 00000000000014c0 sp 00007ffe6c707218 error 14 in vmtoolsd[55b98b6cd000+ac000]

thomashvmw commented 6 years ago

Hi. No, the vmtoolsd error is unrelated. While it's bad, it shouldn't cause a freeze. It would also be helpful if you could find and post the vmware.log file which is located in the vm's folder on the host.

Thanks, Thomas

hermidalc commented 6 years ago

I restarted the VM and ran dmesg again while GNOME is working and unfortunately that message is the last line in the dmesg log after startup:

[   16.373174] vmtoolsd[2152]: segfault at 14c0 ip 00000000000014c0 sp 00007ffde077e368 error 14 in vmtoolsd[5615b79b9000+ac000]

So not sure really if it's a clue anymore
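
To compare segfault lines across boots without eyeballing a long dmesg log, something like the following Python sketch may help. It assumes the kernel message format seen in the two lines quoted in this thread; `SEGFAULT` and `find_segfaults` are hypothetical names for illustration.

```python
import re

# Pattern for the user-space segfault message as it appears in the dmesg lines above.
SEGFAULT = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+(?P<proc>\S+?)\[(?P<pid>\d+)\]: "
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+)"
)


def find_segfaults(dmesg_text: str) -> list:
    """Return one dict per segfault line, with all fields kept as strings."""
    return [m.groupdict() for m in SEGFAULT.finditer(dmesg_text)]


# The two lines reported in this thread, from two separate boots.
sample = """\
[   25.181322] vmtoolsd[2141]: segfault at 14c0 ip 00000000000014c0 sp 00007ffe6c707218 error 14 in vmtoolsd[55b98b6cd000+ac000]
[   16.373174] vmtoolsd[2152]: segfault at 14c0 ip 00000000000014c0 sp 00007ffde077e368 error 14 in vmtoolsd[5615b79b9000+ac000]
"""

for hit in find_segfaults(sample):
    print(hit["time"], hit["proc"], hit["pid"], hit["addr"], hit["ip"])
```

In both boots the faulting address and the instruction pointer are identical, and the message shows up shortly after startup whether or not the desktop later freezes, which seems consistent with the vmtoolsd crash being a separate problem from the lockup.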

hermidalc commented 6 years ago

Here you go, used the GNOME desktop on restarted VM and let it freeze again

vmware.log

mumblyOMOD commented 6 years ago

I have the same issue. My host is a Windows 10 desktop (ver 10.0.15063) with both an AMD HD 6950 and an AMD RX580 GPU. My guest system is a fully updated Manjaro. I've been running without 3D acceleration for over a year. I'm not sure if anything's changed, but last time I had 3D acceleration enabled the entire GNOME UI would sporadically crash, but I could still get into a terminal using keyboard shortcuts (e.g. Ctrl+Alt+F2). I'm happy to provide any of my configuration files or logs if they'd be of any help.

abecher22 commented 6 years ago

Same problem here. Win10, Workstation 14 Pro, RHEL7 guest. Also, turning off 3D acceleration causes multi-monitor mode to display black screens.

hermidalc commented 6 years ago

@abecher22 I think yours is a different issue. Here we are describing a major issue where GNOME, on guest Linux OSes Fedora 25 and Ubuntu 17.04 and newer, will always freeze if you have VMware 3D acceleration turned on.

By default 3D acceleration should always be on. Most modern Linux graphical desktops need it to function properly, which is why it's a major issue for me and others: you effectively cannot use VMware Workstation with any recent guest OS.

My workaround is to simply use Fedora 24 as the guest OS, but eventually that will be a problem because for one reason or another I will need to upgrade. Turning off 3D acceleration is not a workaround; the desktop is really unusable if you do that.

I wholeheartedly believe that if the VMware Workstation development team or the open-vm-tools development team took a Windows 10 computer with VMware Workstation 12.5.x or 14.x and a 4K monitor and installed a Fedora 26 guest OS with 3D acceleration turned on, they would see this issue within 10-15 minutes of use. The GNOME desktop will always eventually completely freeze.

The reason I ask to use a 4K display is that this configuration might be what is causing VMware Workstation to have problems. I haven't yet seen anyone say this exact issue happens on HD. @mumblyOMOD is yours 4K?

abecher22 commented 6 years ago

@hermidalc I am also experiencing the GNOME freeze. I am using 2 4k monitors. I was just pointing out that when I disable 3D acceleration (which avoids the freeze), I cannot use my dual monitors. I should have made that clear.

mumblyOMOD commented 6 years ago

@hermidalc I generally use a dual monitor setup with both screens at 1920x1200. I think the issue happened to me when only using one monitor, but it's been so long since I've had 3D acceleration enabled that I'd need to test it again to be completely sure.

@abecher22 It might not be of any help to you, but with 3D acceleration disabled I am able to use two of my 1920x1200 monitors without any issues.

abecher22 commented 6 years ago

@mumblyOMOD I cannot use my dual 4k monitors with 3D acceleration turned off. GNOME simply "blacks out". I can use one of the monitors, but not both. With 3D acceleration enabled I can use both 4k monitors but then I have to deal with periodic freezing.

hermidalc commented 6 years ago

Thanks for the follow up and clarification @abecher22 and @mumblyOMOD. More info and data points for the VMware Workstation team to show this is a real problem.

I'm going to hazard a guess and say that this issue has something to do with a flaw in VMware Workstation and its 3D acceleration when using recent Linux guest OSes on single 4K or dual HD monitors. Could it be that there isn't enough memory somewhere? My guess is that a single 4K display or dual HD displays require much more memory for 3D acceleration. We are all setting our video memory to the max 2 GB in the settings.

thomashvmw commented 6 years ago

Hi all. Unfortunately we have not been able to reproduce this in-house, so we do need some more help to pinpoint which component is at fault. If you have the possibility to run any (0-2) of these tests, it would be helpful.

0) If someone has a VM that frequently locks up (let's call it "Locking"):

a) Locate the "Locking" VM folder on the host.

b) Edit the "Locking.vmx" file by adding the following two lines at the end:

mks.enableSoftwareRenderer=1
mks.enableDX11Renderer=0

This will still have the VM think that 3D is available, but will direct all VM 3D operations to Workstation's own software 3D renderer rather than to the host's DX11 subsystem, so things will be a bit slow...

c) Check whether the VM still locks up and, regardless of the result, please post the "vmware.log" file present after the run in the "Locking" VM folder on the host.

1) If someone has a VM that frequently locks up, and finds an error in the "dmesg" guest log or the Xorg log (typically /var/log/Xorg.0.log) in the guest, that would be of great help.

2) If someone has a VM that frequently locks up and can try the exact same VM using a Linux Player or Workstation setup to determine whether there is a lockup also on linux hosts, that would be of great help.

Thanks, Thomas
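
For anyone who wants to script the .vmx change from test 0 instead of editing by hand, here is a minimal Python sketch. It assumes the `key=value` form exactly as written in the comment above; `set_vmx_options` is a hypothetical helper name, and the sample `displayName` line is illustrative. The function works on text only, so reading and writing "Locking.vmx" (with the VM powered off) is left to the caller.

```python
def set_vmx_options(vmx_text: str, options: dict) -> str:
    """Return vmx_text with each option set once, replacing any existing lines for that key."""
    written = set()
    out = []
    for line in vmx_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in options:
            # Replace the first occurrence and drop later duplicates of the same key.
            if key not in written:
                out.append(f"{key}={options[key]}")
                written.add(key)
            continue
        out.append(line)
    # Append any options the file did not already contain.
    for key, value in options.items():
        if key not in written:
            out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


# The two settings from test 0: software renderer on, DX11 renderer off.
test0 = {"mks.enableSoftwareRenderer": 1, "mks.enableDX11Renderer": 0}

# Illustrative existing file content (an assumption, not from the thread).
sample = 'displayName = "Locking"\n'

print(set_vmx_options(sample, test0))
```

Replacing existing keys rather than blindly appending avoids leaving two conflicting lines in the file when switching between renderer tests.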

thomashvmw commented 6 years ago

And another simple test that would also be useful if you're running on a Nvidia system (Optimus disabled if multi-GPUs):

  1. Exactly the same as 0., but replace mks.enableSoftwareRenderer=1 with mks.enableGLRenderer=1. This will direct the VM's 3D operations to the host's Nvidia OpenGL driver.

Thanks, Thomas

mumblyOMOD commented 6 years ago

@thomashvmw I'll try turning on 3D acceleration today and will provide logs for #1 sometime in the next 8 hours or so. After it exhibits the Locking behavior I'll tweak my vmx file according to #0 and will provide an update once I've tested it for a good 8 hours or so (enough time that I know my machine should have exhibited the Locking behavior at least once).

hermidalc commented 6 years ago

@thomashvmw sorry to ask for clarification on your earlier comment, were you able to reproduce the issue in-house or not? Sorry it's unclear from "Unfortunately we have been able to reproduce this in-house"

hermidalc commented 6 years ago

I followed the instructions for test 0 above on my test Fedora 26 guest OS. It locked up within 15 seconds after booting, doing what I typically do to test it. I always open up a couple of desktop windows, i.e. a terminal and Firefox, and then test GNOME's desktop functionality like alternating between the Activities view and the desktop, and switching between the desktop and windows. All of these test GNOME's basic 3D capabilities.

dmesg.log vmware.log

thomashvmw commented 6 years ago

@hermidalc, No we have not been able to reproduce internally, unfortunately. I've updated the previous comment

thomashvmw commented 6 years ago

@hermidalc, Looks like your fedora VM is not fully updated. What happens if you do a full update:

sudo dnf update

Even if you don't get rid of the locking, do you get rid of the vmtoolsd segfault in the dmesg.log? Thanks, Thomas

hermidalc commented 6 years ago

@thomashvmw sorry need clarification about my setup.

When I originally purchased and installed VMware Workstation for Windows earlier this summer and saw this GNOME locking issue with Fedora 25 and newer guest OSes, I opened a ticket with VMware and they told me to first install their proprietary VMware Tools and not open-vm-tools. We kept troubleshooting and they couldn't figure out what was causing the locking, which occurs in both VMware Tools and open-vm-tools, though I haven't checked any newer version of open-vm-tools since the summer.

Fully updating my test Fedora 26 guest OS doesn't get rid of the segfault issue on latest proprietary VMware Tools on Workstation 14.0.

thomashvmw commented 6 years ago

@hermidalc, Thanks for the clarification. I don't think this is vm-tools related. More likely a 3D driver issue. Probably the segfault in the log pointed people to vm-tools.

The segfault should be fixed in latest fedora open-vm-tools, but not yet in a released version of proprietary tools, but as mentioned, the segfault fix will probably not fix the lockup.

mumblyOMOD commented 6 years ago

I'm not sure what's changed on my end, but I've had 3D acceleration enabled for 3 days and haven't had my VM lock up at all.

hermidalc commented 6 years ago

To everyone - @MaZZly @mumblyOMOD @abecher22 @josepmc @thomashvmw

I've built a new Fedora 27 guest OS in the exact same manner as the 25 and 26 guest OSes where GNOME locks up but am using the latest open-vm-tools this time (instead of proprietary VMware tools, although as mentioned used both with Fedora 25 and 26 and didn't make a difference regarding locking)

I am currently working with 3D acceleration for over an hour or more with no desktop freezing whatsoever (fingers crossed).

I recommend to others to install the latest Fedora 27 and use open-vm-tools and tell me if the GNOME desktop is freezing or not. My initial guess is that Fedora 27 comes with a new version of GNOME 3.26, so maybe there was something updated that fixes our issue. The reason I think this is because Fedora 24 never locks up, 25 and 26 did, and now 27 seems to work fine.

hermidalc commented 6 years ago

@thomashvmw I also do not see any tools segfault message in dmesg using Fedora 27 guest OS and latest open-vm-tools-10.1.10-3.fc27.x86_64

thomashvmw commented 6 years ago

Hmm. Strange. Perhaps there was an automatic kernel update that fixed this. We've had a bunch of kernel fixes in the pipeline, but none that we know would cause these symptoms. We've tested extensively on a fully updated fedora 26 without being able to repro.

Anyway, If you are running a Wayland-enabled distro, please make sure you're using your distro's latest open-vm-tools. Don't try to use bundled proprietary vm-tools until version 10.2.x is out. There's no gnome-shell/Wayland support yet in bundled vm-tools.

hermidalc commented 6 years ago

@thomashvmw thanks for the reply, I was going to ask about Wayland vs Xorg. I've always been disabling Wayland and using Xorg for GNOME in all Fedora guest OSes I've had 24 - 27, especially due to the locking issue I first tried to stay with the older and more tested Xorg.

What's your recommendation? Is Xorg more stable/robust with VMware Workstation and open-vm-tools or is integration with Wayland just as good now?

hermidalc commented 6 years ago

I was just thinking from what you said, it could've also been the new Linux kernel 4.13.12-300. That is the only other thing that I updated during the time between when GNOME was always locking up and when it stopped.

fs-aikito commented 6 years ago

I'm also hitting this issue with fresh Debian Stretch installation.

Version info:

Guest:

Host:

I'm going to try upgrading kernel to stretch-backports version to see if that helps

edit: Still got freeze with backports kernel 4.14+88~bpo9+1 (4.14.0-0.bpo.2-amd64)

dingotaz commented 6 years ago

Hello.

Fedora 27 fully patched starts up with the wrong screen size (puts scroll-bars on vmware). If you resize smaller it works (sort of). If you resize bigger the whole thing locks up - it doesn't resize and you can't type. The system hasn't frozen as a guest restart works properly and rdp session (through xrdp) still continues to function.

Turning off 3D graphics acceleration fixes this issue and Fedora works fine. I had no issues with 3D turned on for the past half dozen or so releases of Fedora.

kernel-4.14.16-300.fc27.x86_64
open-vm-tools-10.2.0-3.fc27.x86_64
gnome-shell-3.26.2-4.fc27.x86_64

thomashvmw commented 6 years ago

@dingotaz, Just updated my Fedora 27 VM. Appears to work just fine. Could you post your Xorg log as well as a dmesg log? Thanks, Thomas.

dingotaz commented 6 years ago

@thomashvmw thanks for getting back to me.

Xorg.0.log Xorg.1.log dmesg.log

Try that. I suspect Xorg.0.log will be with 3D on and Xorg.1.log will be with 3D acceleration turned off. I basically shut down the machine, turned 3D back on, and started up. Workstation had scroll bars and was stuck on the login background picture. When I maximized it, the terminal windows suddenly showed (I have auto-login enabled, set to run gnome-terminal). However, keyboard input would not be accepted. I then logged in as root through RDP to get the log files.

Now this VM is an upgrade from 26 from 25 from 24. Probably did a fresh install around Fedora 24. At some stage I installed the mesa video driver manually to fix a similar issue, so not sure if that's still hanging around. Not sure how to check that.

thomashvmw commented 6 years ago

Hi! I can't see anything unusual in the logs, except mesa changing hardware surface behind the back of Xorg, which is not good and can definitely explain rendering problems. Could you also post the output of glxinfo, which will show the mesa version. What happens if you run Wayland instead of xorg?

thomashvmw commented 6 years ago

Also, could you boot up an older kernel and see if that changes things? /Thomas

dingotaz commented 6 years ago

@thomashvmw older kernel made no difference. I'll have to remember how to go back to wayland and get back to you. It broke the software I have to use so I turned it off.

glxinfo output file attached, looks like mesa is V17.2.4 mesa.txt

dingotaz commented 6 years ago

Hi @thomashvmw

Enabling wayland made no difference. I did a fresh install of fedora 27 and so far it's working fine. I'll slowly transfer my stuff across see what happens, if it breaks again.

Thanks

mikelars2 commented 6 years ago

Confirmed this happens on Workstation 14.1.1 build-7528167, Ubuntu 18.04 with GNOME, Nvidia GTX 1080, Ryzen 1700X, and Windows 10, 64-bit (Build 16299) 10.0.16299

RyanEwen commented 6 years ago

Seeing the same issue with Ubuntu 18.04 Budgie on 14.1.2 build-8497320.

Every 1-3 days I will see graphical artifacts/weirdness and then immediately CTRL+S my open VS Code instance. This seems to kill the VM and then I can't power it down or do anything with it.

rleigh-codelibre commented 6 years ago

I've seen this issue with an Ubuntu 18.04 host (AMD FX-8350 / AMD Radeon R9 390) and more recently (AMD Ryzen 2700X / AMD Radeon RX 580). Guests are Windows (7/8.1/10) and other Linux distributions. I've filed a ticket (18759077303) and, like others above, found that the logs contain zero helpful diagnostic hints. I also have a 4K display, but I think I was suffering from similar issues with Workstation 12 well before upgrading to 4K.

I've tested with KDE plasma, XFCE and i3. The freezes occur in all three environments. With KDE, the compositor / kwin seem to freeze and when it recovers it often thinks a drag event was initiated by dragging a taskbar window; maybe from trying to switch windows during the freeze. This is triggered if vmware workstation is on another virtual desktop, and ceases as soon as all running VMs are stopped. Can sometimes freeze the mouse, but often the cursor moves even while all input (mouse buttons, keyboard) appear to be ignored. Drawing in the guest seems to be a trigger in part; freezes occur much more if there's continuous redrawing, e.g. a progress bar updating, or moving a window, or closing a window, forcing the area to be repainted. But not always; sometimes drawing works without freezing things up.

Like others, I have tried disabling 3D acceleration, reinstalling VMware tools, but nothing has had any meaningful effect.

One other thing which seems to trigger it more frequently: fire up the VM, then switch to another desktop. Same with shutting down.

Another point to consider: I see the freezes frequently when on another desktop with a mail client/web browser/editor/terminal in it. I'm unsure why there would be a lockup of the entire desktop when the compositor has nothing to paint related to vmware on these desktops. Is it purely locking up the X server's painting, or is it also grabbing input focus as well?

Like others, I'm very happy to test and report back logs, but right now dmesg, system logs, Xorg.log and the vmware logs contain zero information. I could potentially try to record a video so you can see the dynamics of the freezing events.