QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/

Computer doesn't recover from suspend state #3705

Open fnerdman opened 6 years ago

fnerdman commented 6 years ago

Qubes OS version:

R4.0 rc5

Affected component(s):

Resume after suspend


Steps to reproduce the behavior:

About one in five times that I suspend my laptop, it fails to resume: the screen stays black and the fans spin up to maximum.

Expected behavior:

Resume after suspend should work without sys-usb started.

Actual behavior:

Computer fails to recover with screen staying black and fans spinning up to max.

General notes:

A few times I was able to get the computer to recover, e.g. by plugging in the power cord or holding down the power button for a few seconds. Below you can see the syslog from the last time it happened. Be aware that I had tlp activated, along with a script that modifies Intel P-states and turbo mode depending on whether the charger is plugged in. The problem persists without them; however, I have not yet been able to recover without tlp and the script enabled. I don't think these things are related, though, since the CPU soft lockup happens much earlier in the wakeup process than the scripts run. Once the computer had successfully recovered from such a lockup, I wasn't able to reproduce the problem until the next reboot.

Syslog after recovery


Related issues:

#3689

toserk commented 6 years ago

I have the same issue on an HP ProBook 450 G5 (i7-8550U). I noticed that I get a bunch of ACPI (Method not supported…) errors on Qubes boot. So I installed Windows and the official HP drivers to check whether this is a hardware problem, and I got the ACPI.sys problem described in this post: https://h30434.www3.hp.com/t5/Business-Notebooks/ProBook-450-with-high-CPU-usage/td-p/6520063 That error behaves very similarly, except that the system does not freeze. It occurs with some probability after suspend/resume, and the fan runs at maximum speed when it occurs. Maybe this information will help determine what the problem is.

fnerdman commented 6 years ago

@toserk can you dump your dmesg here so I can compare it with mine? Do you experience problems similar to those in #3689?

marmarek commented 6 years ago

This might be a BIOS bug. Are there any BIOS updates available for this machine?

toserk commented 6 years ago

I have the latest available versions of the BIOS, the USB 3.1 controller firmware, and the Intel Management Engine firmware. This is the dmesg dump covering suspend/resume cycles up to the system freeze: dmesg1.txt

This may be a coincidence, but every time I tried to check #3689 I got a freeze on the first suspend/resume.

fnerdman commented 6 years ago

Comparing with my dmesg, I cannot find similar ACPI errors in my log: dmesg.txt

However there is one line which is similar:

ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x21, should be 0x57 (20170728/tbprint-211)

Could a wrong value in the Fixed ACPI Description Table be responsible for this behavior? I've extracted the FACP table and disassembled it: facp.dsl.txt. Inside there are actually values for the "sleep status register" and "sleep control register". I might be able to fix the FACP bug, but I might also try some different BIOS versions for this laptop and see whether the behavior stays the same.

mirrorway commented 6 years ago

My corebooted Thinkpad recently stopped resuming from suspend. Instead, it would restart.

I was able to work around this by reverting the recent microcode update: dnf remove microcode_ctl, and then removing ucode=scan from the Xen command line. I confirmed that the microcode was reverted by running cat /proc/cpuinfo | grep microcode before and after.
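
The before-and-after check mentioned above can be sketched as a small shell function. This is only an illustration: `microcode_revisions` is a hypothetical helper name, and it assumes the usual "microcode : 0x..." line format that Linux prints in /proc/cpuinfo.

```shell
#!/bin/sh
# Print the distinct microcode revisions found in a cpuinfo-style file.
# Defaults to /proc/cpuinfo; another path can be passed for comparison or testing.
# (microcode_revisions is a hypothetical helper name, not a Qubes tool.)
microcode_revisions() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Lines look like "microcode<TAB>: 0x80"; keep only the value, de-duplicated.
    awk -F': *' '/^microcode/ { print $2 }' "$cpuinfo" | sort -u
}
```

Running `microcode_revisions` once before the removal and once after the next boot should show a different revision if the revert took effect.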

More people might have this issue the next time they restart their laptops, when they load the new microcode...

fnerdman commented 6 years ago

Sorry, I can't reproduce that. I've run it with the stock Kaby Lake R microcode 0x70 and the updated microcode 0x80. The problem persists.

marmarek commented 6 years ago

@lead4good what xen-hypervisor package do you have? 4.8.3-4 has problems with suspend; try downgrading to 4.8.3-3.

fnerdman commented 6 years ago

@marmarek I've got 4.8.3-3 installed (4.0 rc5 without any testing repo updates).

evilaliv3 commented 6 years ago

I can report the same issue on a Thinkpad T480 with a freshly installed Qubes 4.

I managed to get suspend to work by removing the USB3 controller from sys-usb. Obviously, with the USB3 controller removed I lose the ability to attach USB devices, so this is just a short-term workaround that lets suspend/resume work correctly; a proper fix still needs to be identified.

fnerdman commented 6 years ago

@toserk does your laptop have discrete graphics? An nvidia mx150 by chance?

toserk commented 6 years ago

@lead4good No, only Intel igpu (HD 620)

maertsen commented 6 years ago

I can confirm the observations made by @evilaliv3 (Thinkpad T480, fresh installation, USB3-controller removed fixes issue).

I have experimented with unloading xhci_pci within sys-usb, as a random guess, but this does not make any difference.

Update: it seems #3689 has more details concerning the T480, I will take my comments there. Sorry for the noise.

evilaliv3 commented 6 years ago

I finally managed to get this working on my laptop (Thinkpad T480)

It required configuring the USB3 controllers to behave as USB2 controllers.

Details on how to achieve this are described in: https://www.systutorials.com/241533/how-to-force-a-usb-3-0-port-to-work-in-usb-2-0-mode-in-linux/

Specifically on Thinkpad T480 this is achievable by issuing:

    setpci -H1 -d 8086:7020 d0.l=0
    setpci -H1 -d 8086:9d2f d0.l=0

The commands should be executed inside the sys-usb domain. You could first test them, and if the fix works you may add them to /etc/rw/rc.local so that they are executed automatically at every boot of the domain.
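
A startup fragment along these lines might look like the sketch below. It is an illustration under assumptions, not a verified fix: `force_usb2` and the `SETPCI` override are hypothetical names, and the device IDs are the ones reported above for the T480, so other machines will need different values. Note that Qubes normally runs per-qube startup commands from /rw/config/rc.local; the path quoted above is left as the author wrote it.

```shell
#!/bin/sh
# Sketch of a startup fragment for sys-usb, based on the two commands quoted above.
# SETPCI is a hypothetical override: set SETPCI=echo to print the commands
# instead of writing to PCI configuration space.
SETPCI="${SETPCI:-setpci}"

# force_usb2 is a hypothetical helper name.
force_usb2() {
    # Device IDs as reported above for the Thinkpad T480 (assumption for other hardware).
    for dev in 8086:7020 8086:9d2f; do
        # Write 0 to the 32-bit register at offset 0xd0, exactly as in the original commands.
        $SETPCI -H1 -d "$dev" d0.l=0
    done
}
```

Calling `force_usb2` at the end of the startup file would apply both writes at each boot of the qube; running it once with `SETPCI=echo` first shows what would be executed.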

\cc @maertsen @Scinawa

fnerdman commented 6 years ago

As of 4.8.3-8 the problem persists on my hardware.

maertsen commented 6 years ago

I've just tried the workaround suggested by @evilaliv3, both with xen-hypervisor at 4.8.3-3 and 4.8.3-7. In both cases, the system freezes after wakeup, though I get to type some characters in the xscreensaver password prompt. After the freeze, the fan speeds up.

@evilaliv3, can you state your version of xen-hypervisor and other packages you deem relevant? I see mention of fixes in 4.8.3-8 in #3689, which may or may not be related. I'm waiting for 4.8.3-8 to land in qubes-dom0-current, though can test if required.

evilaliv3 commented 6 years ago

I'm with 4.8.3-7

As I wrote, I can still confirm that the workaround above fixed the situation for me; the issue has not happened since.

maertsen commented 6 years ago

I have just retested with 4.8.3-8. Issue remains.

The workaround of shutting down sys-usb prior to suspend also still works. The USB 2 downgrade suggested by @evilaliv3 does not work for me.
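
The sys-usb workaround can be sketched as a dom0 wrapper. This is an assumption-laden sketch rather than a tested fix: `suspend_without_sys_usb` and the `RUN` variable are hypothetical names, the USB qube is assumed to be called sys-usb, and the script assumes that `systemctl suspend` returns only after the machine has resumed (if it returns earlier, sys-usb would be restarted too soon).

```shell
#!/bin/sh
# Sketch of a dom0 wrapper: stop sys-usb, suspend, then start sys-usb again.
# RUN is a hypothetical hook: set RUN=echo to print the commands instead of running them.
RUN="${RUN:-}"

# suspend_without_sys_usb is a hypothetical helper name.
suspend_without_sys_usb() {
    # --wait blocks until the qube has actually shut down.
    $RUN qvm-shutdown --wait sys-usb || return 1
    # Assumption: this call returns only once the machine has resumed.
    $RUN systemctl suspend
    status=$?
    # Bring sys-usb back even if the suspend call reported an error.
    $RUN qvm-start sys-usb
    return "$status"
}
```

With `RUN=echo` the function only prints the three commands, which is a safe way to check the sequence before relying on it.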

I am interested to hear pointers on how to further debug this issue.

maertsen commented 6 years ago

@evilaliv3 I just noticed that the setpci command does not appear to have any effect for 8086:9d2f as verified by lspci -xxxx. It remains at value 02 for d0. Is that different for you?
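
One way to check whether a write took effect is to read the same register back with setpci, which prints the current value when no `=value` is given. The sketch below assumes a single matching device and hexadecimal output; `read_d0`, `d0_is_cleared`, and the `SETPCI` override are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch: read back the 32-bit register at offset 0xd0 for one PCI device.
# SETPCI is a hypothetical override so that the helpers can be exercised without hardware.
SETPCI="${SETPCI:-setpci}"

# Print the current value of register 0xd0 for the vendor:device ID in $1.
read_d0() {
    $SETPCI -H1 -d "$1" d0.l
}

# Succeed only if the value read back consists solely of zeros.
# Assumes exactly one device matches the given ID.
d0_is_cleared() {
    value=$(read_d0 "$1") || return 1
    case "$value" in
        ""|*[!0]*) return 1 ;;  # empty output, or some non-zero digit present
        *) return 0 ;;
    esac
}
```

For example, `d0_is_cleared 8086:9d2f || echo "write did not stick"` would flag the situation described above, where the register keeps its old value.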

evilaliv3 commented 5 years ago

@maertsen: have you found any permanent solution?

Do you have a working script that shuts down sys-usb before suspend and starts it again after the system wakes up?

I just found out that when the system does not wake up, it is possible to make it resume by plugging in and unplugging the power plug; strange but true. \cc @marmarek

b-m-f commented 5 years ago

Same issue for me on a T460s. The fans don't speed up, but I am not able to resume from a suspended state. The power LED blinks, but pressing it has no effect.

maertsen commented 4 years ago

> @maertsen: have you found any permanent solution?

Unfortunately not. I use the workaround you suggested in https://github.com/QubesOS/qubes-issues/issues/3705#issuecomment-389804267. I think this only affects the USB-C ports on the left side, which I do not use except for charging. It's not much of a fix though.

b-m-f commented 4 years ago

I fixed this by changing to TPM 1.2 in the BIOS settings. More details

maertsen commented 4 years ago

> I fixed this by changing to TPM 1.2 in the BIOS settings. More details

That blog is down for me; any chance you can copy the relevant bits here?

b-m-f commented 4 years ago

@maertsen the blog is back online. This is the most important part I guess

# Not resuming from Suspend

This was one of the main problems I had. The solution to this was hiding in the mailing list: the TPM.

By default the BIOS is set to Intel PTT TPM 2.0, which does not give the desired Intel TXT support.
So I went ahead and changed it to Discrete TPM 1.2 and voila, the laptop wakes up from suspend again.
Be aware that changing this setting will delete all keys on the security chip.

andrewdavidwong commented 1 year ago

Is this still a problem in 4.1?

Scinawa commented 1 year ago

I don’t use qubes anymore :(

evilaliv3 commented 1 year ago

> Ale: I don’t use qubes anymore :(

that's sad bro'!