intel / Intel-Linux-Processor-Microcode-Data-Files

Failed to resume from S3 if in battery mode #34

Closed acelan closed 3 years ago

acelan commented 4 years ago

The system hangs while resuming from S3 without AC. https://bugs.launchpad.net/ubuntu/+source/intel-microcode/+bug/1883336

SKU: Dell Precision 7730
BIOS: 1.12.1 (11/11/2019)
Kernel: Ubuntu 4.15.0-1081-oem & mainline 5.7.0

  1. Boot the system (with AC or on battery only).
  2. Enter S3 and unplug AC if it is plugged in: the system can't be woken up.
  3. Enter S3 and plug in AC: the system can be woken up with the power button.

The issue can't be reproduced with intel-microcode=3.20180312.0~ubuntu18.04.1 from the bionic repo. Upgrading the package to intel-microcode=3.20200609.0ubuntu0.18.04.1, or to the previous version, leads to the resume hang.

May 26 03:28:10 u-Precision-7730 kernel: [ 0.000000] microcode: microcode updated early to revision 0xca, date = 2019-10-03
May 26 03:28:10 u-Precision-7730 kernel: [ 1.565815] microcode: sig=0x906ea, pf=0x20, revision=0xca
May 26 03:28:10 u-Precision-7730 kernel: [ 1.566291] microcode: Microcode Update Driver: v2.2.
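For anyone comparing logs across machines, the signature and revision can be pulled out of the kernel messages mechanically. This is only a minimal Python sketch, assuming the message format shown in the log above; the function name `parse_microcode_line` is made up for illustration and is not part of any tool.

```python
import re

# Matches the summary line printed by the kernel microcode driver,
# e.g. "microcode: sig=0x906ea, pf=0x20, revision=0xca".
# Assumption: the format is exactly the one seen in the log above.
_SUMMARY = re.compile(
    r"microcode: sig=(0x[0-9a-fA-F]+), pf=(0x[0-9a-fA-F]+), "
    r"revision=(0x[0-9a-fA-F]+)"
)


def parse_microcode_line(line):
    """Return (sig, pf, revision) as ints, or None if the line does not match."""
    match = _SUMMARY.search(line)
    if match is None:
        return None
    return tuple(int(group, 16) for group in match.groups())


if __name__ == "__main__":
    sample = (
        "May 26 03:28:10 u-Precision-7730 kernel: [ 1.565815] "
        "microcode: sig=0x906ea, pf=0x20, revision=0xca"
    )
    print(parse_microcode_line(sample))
```

Feeding it each line of `dmesg` or the journal and keeping the non-None results would give the revision actually running on each machine, which is easier to compare than package versions.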

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Xeon(R) E-2186M CPU @ 2.90GHz
Stepping: 10
CPU MHz: 800.139
CPU max MHz: 4800.0000
CPU min MHz: 800.0000
BogoMIPS: 5808.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 12288K
NUMA node0 CPU(s): 0-11
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d
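The `sig` value in the kernel log and the family/model/stepping fields reported by lscpu describe the same thing, and the files in this repository's intel-ucode directory are named family-model-stepping in hex. A small Python sketch of that mapping, assuming the standard CPUID leaf 1 field layout; the helper names `decode_signature` and `ucode_file_name` are hypothetical:

```python
def decode_signature(sig):
    """Split a CPUID signature into (family, model, stepping)."""
    stepping = sig & 0xF
    model = (sig >> 4) & 0xF
    family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF
    # The extended model field is used for families 6 and 15;
    # the extended family field is added only for family 15.
    if family in (0x6, 0xF):
        model |= ext_model << 4
    if family == 0xF:
        family += ext_family
    return family, model, stepping


def ucode_file_name(sig):
    """Format a signature as family-model-stepping in two-digit lowercase hex."""
    family, model, stepping = decode_signature(sig)
    return f"{family:02x}-{model:02x}-{stepping:02x}"


if __name__ == "__main__":
    print(decode_signature(0x906EA))
    print(ucode_file_name(0x906EA))
```

For the signature 0x906ea in the log above this gives family 6, model 158, stepping 10, which agrees with the lscpu output, and the name 06-9e-0a.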

stevebeattie commented 4 years ago

Acelan, just to confirm, this failure to resume occurs with both the 0x00ca from the 20191115 release as well as the 0x00d6 version from the 20200609 release, correct?

Thanks.

zfil commented 4 years ago

I have a similar configuration, and the problem occurs with both 20191115 and 20200609. With 20191115 the problem is even worse: I can't boot in battery mode at all. It hangs at "loading initial ramdisk".

acelan commented 4 years ago

@stevebeattie, right, this issue can be reproduced with both versions of the microcode.

mbidewell commented 4 years ago

I have a Dell Precision 7530 and have hit this issue as well. For me, firmware 20191115 works but 20200609 does not.

bionade24 commented 4 years ago

I have a Dell Precision 7540 with an i9-9980HK, and my only problem is that I can't boot without AC plugged in. @zfil Have you already opened another issue for that?

new0ne commented 4 years ago

I am facing the exact same issues on the same machine, a Precision 7540, also with an i9-9980HK.

bionade24 commented 4 years ago

As it only affects Dell machines, we should all complain to Dell support and link this bug so that they fix it soon, especially if you paid for support.

hmh commented 4 years ago

The standard vendor fix will be to update the firmware, in which case the OS stops updating the microcode, and the bug is not triggered.

The underlying issue is not going to be addressed by Dell, regardless of where it lies (Dell firmware, Intel microcode, Linux S3 resume path or SMP/AP bringup).
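The reasoning about why a firmware update sidesteps the bug can be sketched in a few lines of Python. This is a simplified model, assuming the OS loader only applies a microcode revision that is strictly newer than the one the firmware already loaded; the function name and the example values are illustrative only (the revision numbers are borrowed from this thread, not measured firmware contents).

```python
def os_would_update(firmware_revision, package_revision):
    """True if an OS-side load would replace what the firmware already applied.

    Simplified assumption: the loader applies only strictly newer revisions.
    """
    return package_revision > firmware_revision


if __name__ == "__main__":
    # Hypothetical: firmware ships an older revision than the OS package.
    print(os_would_update(0xCA, 0xD6))
    # Hypothetical: firmware already carries the same revision as the package.
    print(os_would_update(0xD6, 0xD6))
```

In the second case the OS never performs its own update, so whatever goes wrong on the OS-driven update path is not exercised, even though the underlying cause is still there.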

qyra commented 3 years ago

I've had this issue as well on my Dell Precision 7530. Interestingly, neither of my coworkers, who have the same model with identical specs, has ever had this problem, so whatever the cause is, it might be related to a bad hardware batch or something similar.

Initially, when I was running Ubuntu 18.04, I had intel-microcode 3.20200609.0ubuntu0.18.04.1 and could never resume from suspend on battery. I followed the answer here and installed intel-microcode=3.20180312.0~ubuntu18.04.1 instead, which fixed things.

After upgrading to Ubuntu 20.04 I could not install that version any more; the only versions I see are 3.20200609.0ubuntu0.20.04.2 and 3.20191115.1ubuntu3. The newer of these broke sleep (again), so I installed 3.20191115.1ubuntu3 from the 20.04 packages, and this worked fine.

I'll be bugging Dell support about this because I'd really like this to get fixed before the next LTS release. Let me know if there's anything I can test.

esyr-rh commented 3 years ago

The 06-8e-0a microcode has been updated to revision 0xe0 in the microcode-20201110 release. Does the newer microcode revision help?

tschwinge commented 3 years ago

As I reported at https://www.dell.com/community/Precision-Mobile-Workstations/Precision-7540-will-only-boot-Linux-if-charger-is-connected/m-p/7739866/highlight/true#M4605 regarding my Dell Precision 7530:

It seems as with the most recent BIOS Revision 1.14.4 (13 Nov 2020) in combination with most recent intel-microcode package version 3.20201110.0ubuntu0.18.04.2, the issue is resolved: system boots fine, no workarounds or package downgrades necessary anymore. :-)

esyr-rh commented 3 years ago

On Tue, Nov 17, 2020 at 12:40:29AM -0800, Thomas Schwinge wrote:

> It seems as with the most recent BIOS Revision 1.14.4 (13 Nov 2020) in combination with most recent intel-microcode package version 3.20201110.0ubuntu0.18.04.2, the issue is resolved: system boots fine, no workarounds or package downgrades necessary anymore. :-)

Using the latest firmware revision may render the OS microcode package version irrelevant, as the firmware may already contain the latest microcode revision available, eliminating the need for an OS-driven microcode update (which may be the culprit in the initial report).