Open hermidalc opened 2 years ago
NVMe storage issues are not related to open-vm-tools; Please open a service/support request with VMware support to diagnose the issue further.
VMware support is generally a big waste of time for general software issues, because they will think it’s a problem with your specific computer or setup and not a general issue with the software.
They also don’t offer expedited support to power users who know it’s not a problem with their computer or setup. I didn’t have this issue historically on the same computer, same NVMe drive, and same Windows install. I’ve seen this issue posted by many other users online. That’s why I don’t bother with VMware Support: I don’t want to spend cycles proving to them that it’s an issue on their side, not mine.
We have informed the product team responsible for this issue.
As this is not related to open-vm-tools, please engage with the VMware Workstation community (https://communities.vmware.com/t5/VMware-Workstation/ct-p/3019-home) or VMware support service for further updates on this issue (the recommendation here).
General guess, based on the messages seen: I/O is slow or stuck, or an interrupt was missed. The abort succeeds and appears to clear the condition for a short time. All things pointing to storage issues.
As for the information likely to be needed by support:
Guest OS:
Host OS:
VMware Workstation:
As an aside, the VMTN for Workstation has/had a similar issue thread, might be of help to you: https://communities.vmware.com/t5/VMware-Workstation-Pro/Workstation-Pro-16-NVMe-controller/m-p/2822786#M168257
Exactly, I already saw this and multiple other threads online from users having the same problem. That's why I don't want to deal with VMware support: they will make me waste time and cycles by first assuming it's specific to my box and setup, when clearly it's not. It's a general VMware issue.
Summary from the issue thread above - something is generally going wrong with the VMware Workstation 16 virtual NVMe adapter and it needs to be fixed.
Hello. I am not using VMware, but I have this issue on ubuntu with the latest mainline kernels.
I am using 5.19 and have had this issue from 5.19.1 through 5.19.11, which is the most recent version and the one I currently run, from the Ubuntu 22.04 mainline builds.
nvme nvme0: Abort status: 0x0
nvme nvme0: I/O 14 QID 2 timeout, aborting
nvme nvme0: Abort status: 0x0
nvme nvme0: I/O 62 QID 2 timeout, aborting
and so on...
I have had these issues since I changed from an AMD 2400G on an AM4 board with an AGESA version before 1.0.0.6 (X370 chipset) to an AMD 5600G on an AM4 board with AGESA 1.2.0.7 (A520 chipset). The NVMe drive is the same.
I also get an error message at bootup from the NVMe drive:
Device: /dev/nvme0, number of Error Log entries increased from 203 to 206
This counter rises by 1 on every power-off (203 to 206 comes from an image backup I did with 3 manual power-offs, so it seems every power-off counts as one error).
The NVME drive never breached the high temp count and shows zero errors and very little wear on SMART tests, since I use this drive for a simple "daily use" multimedia system.
When the "nvme nvme0: Abort status: 0x0" / "nvme nvme0: I/O 14 QID 2 timeout, aborting" errors occur, the system hangs for a while, no I/O operations get processed for up to 30 seconds.
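To get a rough picture of how often these timeouts occur and which queues they hit, the messages can be counted from the kernel log. The following is only a minimal sketch: it assumes the message format is exactly the one quoted in this thread, and the function names (`parse_timeouts`, `timeouts_per_queue`) are made up for illustration.

```python
import re
import sys
from collections import Counter

# Matches kernel messages of the form quoted in this thread, e.g.
# "nvme nvme0: I/O 14 QID 2 timeout, aborting".
# Assumption: other kernel versions may word the message differently.
TIMEOUT_RE = re.compile(
    r"nvme (?P<ctrl>nvme\d+): I/O (?P<tag>\d+) QID (?P<qid>\d+) timeout, aborting"
)


def parse_timeouts(lines):
    """Return a list of (controller, tag, qid) tuples, one per timeout line."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            events.append(
                (match.group("ctrl"), int(match.group("tag")), int(match.group("qid")))
            )
    return events


def timeouts_per_queue(events):
    """Count timeouts per (controller, qid) pair."""
    return Counter((ctrl, qid) for ctrl, _tag, qid in events)


if __name__ == "__main__":
    # Read log lines from standard input and print one summary line per queue.
    found = parse_timeouts(sys.stdin)
    for (ctrl, qid), count in sorted(timeouts_per_queue(found).items()):
        print(f"{ctrl} QID {qid}: {count} timeout(s)")
```

Saved under a name of your choice (for example `nvme_timeouts.py`, a hypothetical filename), it could be fed kernel messages with `journalctl -k | python3 nvme_timeouts.py`.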
I found an entry from 2020, and it seems to still be an issue in late 2022. Is there a global fix, a workaround for users, or a planned kernel change? ( https://github.com/clearlinux/distribution/issues/2121 )
Thank you for any reply and/or ideas.
@RevAngel7 From your statement
I am not using VMware, but I have this issue on ubuntu with the latest mainline kernels.
it suggests that the problem(s) are not Workstation or open-vm-tools issues but something tied directly to the Linux kernel release and/or the NVME driver. Thanks for the 2020 bug reference.
Has the problem been raised with the Linux vendor(s)?
I totally get why my report is out of place. I really do. And since I consider myself more of a user than a tech-savvy person, I also understand the reluctance to consider my comment a real issue.
This bug is still open, if I am reading it right.
The same issue on https://github.com/vmware/open-vm-tools/issues/579 , also unsolved.
The same issue on https://github.com/clearlinux/distribution/issues/2121 , also unsolved.
And there is my issue on ubuntu.
Three different kernels, linux brands, same issue. I thought bringing the people together who actually have the tech knowledge to get behind this issue might be helpful (but that's just me).
FYI this bug still occurs on a Fedora 36 guest with Linux kernel 5.19, so I'm not sure if it's a kernel bug.
I don't even know where to report this for ubuntu, to be totally honest. Like I said, I am a user who just stumbled over his ubuntu logs and googled them, found the entries here and Clear Linux distro. Do you have any suggestion what to do to help solving this riddle?
edit** Found the ubuntu launch pad for reporting a bug, trying to fill out a (hopefully not completely incompetent) bug report there.
Posted it on https://bugs.launchpad.net/launchpad/+bug/1991291 fyi
Launchpad is the bug tracking system used by Ubuntu. Also a search of "Ubuntu how to file a problem report" led me to https://help.ubuntu.com/community/ReportingBugs. This provides some additional info about bug reporting tools that make it easy to capture crash dumps and system dumps for upload, if needed.
Actually I'm probably wrong, I've searched for this issue across Google and see it mentioned in multiple Linux dists without mentioning VMware. So yes could be a Linux kernel issue. Others have also mentioned still present in 5.19
Sorry for necroposting, I personally experience this issue on stock Arch on physical hardware, also since ~5.19, so +1 for a linux kernel bug from me. (it just happened to me on 6.0.8, the issue seems to still be present nowadays)
UPDATE 28.11.: I upgraded to 6.0.9 10 days ago and I haven't encountered the issue once since then
UPDATE 6.2.2023: this is still an issue and occurs regularly (like once a week)
No problem, I really guess the issue source is unknown or sporadic. You mentioned 6.0.8, what distro are you using? I am on Ubuntu mainline, and the issue left with the first 6.0 kernel I installed, 6.0.3. And it stayed away until now, on 6.0.8.
So we are using the same kernel version, my issue gone, yours still there, hmm. Weird.
This is still an issue in kernel 6.1 - see https://bugzilla.kernel.org/show_bug.cgi?id=216809
Describe the bug
I quite often see warning messages like
nvme nvme0: I/O 34 QID 13 timeout, aborting
in the journal, and they correlate with I/O appearing to hang a bit. Why is this happening?
My Fedora guest VM is running on an NVMe disk on the latest VMware Workstation 16.2.3 with a Windows 10 host. I am using the latest firmware and drivers for the NVMe disk.
Reproduction steps
I can reproduce it on my setup by doing an rsync of very large files from one filesystem location to another on the Fedora guest VM.
Expected behavior
No nvme timeout warning messages
Additional context
No response
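For anyone trying to reproduce the hang described under Reproduction steps without rsync, a large-file copy can be timed chunk by chunk to see where writes stall. This is a hedged sketch, not the reporter's method: the function name `copy_and_time_chunks`, the chunk size, and the stall threshold are all assumptions chosen for illustration, and a slow chunk by itself does not prove an NVMe timeout occurred.

```python
import os
import time


def copy_and_time_chunks(src, dst, chunk_size=64 * 1024 * 1024, stall_seconds=5.0):
    """Copy src to dst in chunks, flushing each chunk to disk.

    Returns a list of (chunk_index, seconds) for every chunk whose write
    plus fsync took at least stall_seconds.
    """
    stalls = []
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        index = 0
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            start = time.monotonic()
            fout.write(chunk)
            fout.flush()
            # Force the data to the device so the timing reflects real I/O,
            # not just a write into the page cache.
            os.fsync(fout.fileno())
            elapsed = time.monotonic() - start
            if elapsed >= stall_seconds:
                stalls.append((index, elapsed))
            index += 1
    return stalls
```

Any stalls reported can then be compared against the timestamps of the timeout warnings in the journal.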