system76 / firmware-open

System76 Open Firmware

darp8 with `2023-08-18_a8dd6c2` won't come out of suspend #469

Closed uSpike closed 8 months ago

uSpike commented 1 year ago

I cannot bring the system out of suspend. I can put the system into suspend, and the power light starts flashing. If I press a keyboard key, the power light stops flashing and turns steady green, but the screen doesn't turn on and the system seems completely unresponsive.

This seems to happen whether the system is charging or not.

Expected behavior

The system should come out of suspend back to the desktop.

Actual behavior

Repeating myself here

Additional info

It seems that Linux is never re-entered. Here's the output from journalctl -xe -b-1:

Sep 01 08:51:05 system76-pc systemd[1]: Starting System Suspend...
░░ Subject: A start job for unit systemd-suspend.service has begun execution
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░ 
░░ A start job for unit systemd-suspend.service has begun execution.
░░ 
░░ The job identifier is 4061.
Sep 01 08:51:05 system76-pc systemd[1]: grub-common.service: Deactivated successfully.
░░ Subject: Unit succeeded
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░ 
░░ The unit grub-common.service has successfully entered the 'dead' state.
Sep 01 08:51:05 system76-pc systemd[1]: Finished Record successful boot for GRUB.
░░ Subject: A start job for unit grub-common.service has finished successfully
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░ 
░░ A start job for unit grub-common.service has finished successfully.
░░ 
░░ The job identifier is 4153.
Sep 01 08:51:05 system76-pc systemd[1]: Starting GRUB failed boot detection...
░░ Subject: A start job for unit grub-initrd-fallback.service has begun execution
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░ 
░░ A start job for unit grub-initrd-fallback.service has begun execution.
░░ 
░░ The job identifier is 4065.
Sep 01 08:51:05 system76-pc systemd[1]: grub-initrd-fallback.service: Deactivated successfully.
░░ Subject: Unit succeeded
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░ 
░░ The unit grub-initrd-fallback.service has successfully entered the 'dead' state.
Sep 01 08:51:05 system76-pc systemd[1]: Finished GRUB failed boot detection.
░░ Subject: A start job for unit grub-initrd-fallback.service has finished successfully
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░ 
░░ A start job for unit grub-initrd-fallback.service has finished successfully.
░░ 
░░ The job identifier is 4065.
Sep 01 08:51:05 system76-pc rfkill[7559]: block set for type bluetooth

It doesn't seem to have any messages indicating that it's coming out of suspend.
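
The check described above (a suspend is logged, but nothing after it indicates a resume) can be sketched in Python. This is only an illustration: the function name is hypothetical, the `Starting System Suspend` text comes from the excerpt above, and the two resume markers are assumptions about what a successful resume would log.

```python
import re

# The systemd line is taken from the journal excerpt above. The two resume
# markers are assumptions about what a successful resume logs.
SUSPEND_START = re.compile(r"Starting System Suspend")
RESUME_MARKERS = (
    re.compile(r"PM: suspend exit"),
    re.compile(r"Finished System Suspend"),
)


def resumed_after_last_suspend(journal_lines):
    """Return True or False for the last suspend attempt, None if there was none."""
    resumed = None
    for line in journal_lines:
        if SUSPEND_START.search(line):
            resumed = False  # a new attempt starts; look for a resume after it
        elif resumed is False and any(p.search(line) for p in RESUME_MARKERS):
            resumed = True
    return resumed
```

Fed the lines of the previous boot's journal, a result of False would match the symptom reported here: the journal ends shortly after the suspend starts.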

crawfxrd commented 1 year ago

What drives are you using?

uSpike commented 1 year ago

WD Blue SN570 250GB

Here's my full lshw if that's helpful

lshw.txt

crawfxrd commented 1 year ago

Does it happen every time?

I'm not able to reproduce this on Pop!_OS with a WD SN550, WD SN570, or Samsung 980 PRO.

uSpike commented 1 year ago

Unfortunately yes, 3 times now in a row. I'd rather not keep trying since I have to hard-reboot each time.

I'll try a Pop!_OS live CD and see if that reproduces it.

uSpike commented 1 year ago

A Pop!_OS live CD exhibits the same behavior, for better or worse. I tried once.

uSpike commented 1 year ago

Also to be clear I had no problem with suspend before upgrading to this new firmware/EC.

Samuraid commented 1 year ago

Having the same problems and symptoms with a darp8. Suspend worked on 2022-11-21_b337ac6. Just upgraded to 2023-08-18_a8dd6c2 tonight and same problems: can enter suspend, green flashing light. Pressing a key causes the light to go steady green, but the screen is completely blank/black with no response. Even after waiting, nothing changes. Holding the power button down to force it off and doing a clean boot is the only option forward so far.

No external devices are connected. Being on AC power or battery, same behavior either way. Tried turning off WiFi and Bluetooth radios, doesn't seem to make a difference. System updates are current as of right now.

Attached you'll find output for lshw and journalctl starting at the point the suspend was initiated.

journalctl.txt lshw.txt

vtrenton commented 1 year ago

Seeing the same problem on my DARP8. The laptop gets very hot, and when I open the lid it never wakes up. I have to force it off and boot it again. It started right after the upgrade to the latest firmware.

aaronscode commented 1 year ago

Seeing the same problem on my Darter Pro 8 after updating firmware as well, as described by others above.

ahoneybun commented 1 year ago

> I am having the same issue on a Galago Pro running Ubuntu 22.04, but mine is intermittent, with no identifiable pattern. (1) Sometimes it wakes up simply from a mouse click; (2) other times, I have to press the power button a couple of times for it to respond (however, the unresponsive mouse issue happens intermittently, which requires a reboot); (3) other times, it does not respond at all and I have to restart the computer with the power button.

This issue is only about the darp8 and this firmware update, feel free to make a new report for your model.

crawfxrd commented 1 year ago

So far, the only suspend issue I've seen is fstrim failing to stop.

Freezing user space processes failed
[  277.737656] Freezing user space processes
[  297.741372] Freezing user space processes failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
[  297.741534] task:fstrim          state:D stack:0     pid:4867  ppid:1      flags:0x00004006
[  297.741543] Call Trace:
[  297.741546]  <TASK>
[  297.741551]  __schedule+0x2cc/0x750
[  297.741564]  schedule+0x63/0x110
[  297.741570]  schedule_timeout+0x95/0x170
[  297.741576]  ? __pfx_process_timeout+0x10/0x10
[  297.741585]  io_schedule_timeout+0x51/0x80
[  297.741591]  wait_for_completion_io_timeout+0x81/0x150
[  297.741599]  submit_bio_wait+0x81/0xd0
[  297.741606]  blkdev_issue_discard+0x94/0xf0
[  297.741617]  ext4_issue_discard.constprop.0+0x83/0xf0
[  297.741624]  ext4_try_to_trim_range+0x1fc/0x3c0
[  297.741631]  ext4_trim_all_free+0xeb/0x200
[  297.741638]  ext4_trim_fs+0x2aa/0x330
[  297.741646]  __ext4_ioctl+0x613/0x1120
[  297.741651]  ? do_filp_open+0xaf/0x170
[  297.741657]  ext4_ioctl+0xe/0x20
[  297.741661]  __x64_sys_ioctl+0x9d/0xe0
[  297.741668]  do_syscall_64+0x58/0x90
[  297.741676]  ? putname+0x5d/0x80
[  297.741683]  ? do_sys_openat2+0xab/0x180
[  297.741692]  ? exit_to_user_mode_prepare+0x30/0xb0
[  297.741697]  ? syscall_exit_to_user_mode+0x29/0x50
[  297.741701]  ? do_syscall_64+0x67/0x90
[  297.741708]  ? syscall_exit_to_user_mode+0x29/0x50
[  297.741711]  ? do_syscall_64+0x67/0x90
[  297.741717]  entry_SYSCALL_64_after_hwframe+0x77/0xe1
[  297.741722] RIP: 0033:0x7f309591aaff
[  297.741729] RSP: 002b:00007ffd11690680 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  297.741734] RAX: ffffffffffffffda RBX: 00007ffd116907d0 RCX: 00007f309591aaff
[  297.741737] RDX: 00007ffd116906f0 RSI: 00000000c0185879 RDI: 0000000000000003
[  297.741740] RBP: 00005625119fece0 R08: 00005625119fece0 R09: 0000000000000000
[  297.741742] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
[  297.741745] R13: 00005625119ffd20 R14: 00005625119fecc0 R15: 00005625119fecc0
[  297.741749]  </TASK>
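
As a rough aid for reading traces like the one above, the following Python sketch pulls out which tasks refused to freeze. The function name is hypothetical, and the regular expressions are written against the message format shown in this trace only; other kernel versions may word these lines differently.

```python
import re

# Patterns modelled on the dmesg lines quoted above (an assumption about format).
FAILED = re.compile(
    r"Freezing user space processes failed after ([\d.]+) seconds "
    r"\((\d+) tasks refusing to freeze"
)
TASK = re.compile(r"task:(\S+)\s+state:(\S)\s.*?pid:(\d+)")


def tasks_refusing_to_freeze(dmesg_text):
    """Return (seconds_waited, [(task, state, pid), ...]) or None if no failure."""
    failure = FAILED.search(dmesg_text)
    if failure is None:
        return None
    # Task descriptions follow the failure line, one "task:" line per task.
    tail = dmesg_text[failure.end():]
    tasks = [(t.group(1), t.group(2), int(t.group(3))) for t in TASK.finditer(tail)]
    return float(failure.group(1)), tasks
```

On the trace above this would report a single task, fstrim, in uninterruptible sleep (state D).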

crawfxrd commented 1 year ago

Removing quiet and adding no_console_suspend to the boot options may provide additional info.
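
To make the suggested change concrete, here is a small Python sketch that rewrites a kernel command line string accordingly. The function name is hypothetical, and how the result is made persistent depends on the bootloader and distribution, which this sketch does not cover.

```python
def debug_suspend_cmdline(cmdline):
    """Drop 'quiet' and append 'no_console_suspend' (once) to a kernel command line."""
    params = [p for p in cmdline.split() if p != "quiet"]
    if "no_console_suspend" not in params:
        params.append("no_console_suspend")
    return " ".join(params)
```

After rebooting, the contents of /proc/cmdline show whether the running kernel actually received the new options.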

crawfxrd commented 1 year ago
aibistin commented 1 year ago

I'm having the same issue with a Darter Pro, Driver 20.04.79.
This started happening after installing the latest firmware.

uSpike commented 1 year ago

> Removing quiet and adding no_console_suspend to the boot options may provide additional info.

Unfortunately this offered no additional output on the screen when waking from suspend. It seems that the system is completely unresponsive in this state, even the Function keys for changing the keyboard backlight don't work.

jacobgkau commented 1 year ago

Just to clarify, for anyone that is experiencing this issue and would like to roll back to previous firmware in the meantime, you can do so by:

  1. Downloading the "previous release" ZIP file linked above.
    • The sha256sum is 286d2cbf97518e73438bd93993cedabb330b71c6b44969443e4ef353dc1a2cd5; you can check that your download was not corrupted by opening a terminal, running e.g. sha256sum ~/Downloads/2022-11-21_b337ac6.zip, and checking that your output matches.
  2. Extracting the contents to an empty, FAT32-formatted USB flash drive.
  3. Booting from that drive and running through the firmware updater (it will look similar to the over-the-air firmware update process).
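
For anyone who prefers a script to running sha256sum by hand, the checksum step can be sketched in Python with the standard hashlib module. The expected digest is the one quoted in step 1; the function names are hypothetical.

```python
import hashlib

# Digest quoted above for the 2022-11-21_b337ac6 ZIP.
EXPECTED_SHA256 = "286d2cbf97518e73438bd93993cedabb330b71c6b44969443e4ef353dc1a2cd5"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA-256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

Calling matches_expected on the downloaded ZIP should return True; if it returns False, download the file again before flashing.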

We are still investigating the issue with a high priority and we apologize for the inconvenience. Tim may continue asking for more information, and we appreciate all the details that have been provided so far.

leviport commented 1 year ago

We have reverted this firmware release for the darp8. I still haven't been able to recreate this bug, but we will get to the bottom of it before re-releasing this firmware. Meanwhile if anyone has any other info they can share that might give us more clues about recreating the bug, I would definitely appreciate it.

vtrenton commented 1 year ago

Hey System76 Team, first of all, thanks for taking this seriously and being so quick to respond! I frequently close the lid on my laptop when I'm finished, and I haven't been able to reproduce it with 100% consistency either. It seems to happen only sometimes, but it began right after the latest firmware upgrade. I managed to catch a video of what it looks like for me and uploaded it to YouTube, if this helps at all: https://youtu.be/SxpgQpQU8dc

leviport commented 1 year ago

Thanks for the video! It looks like both lights on the right side were solid green before the lid was opened, am I seeing that right? That probably means the machine didn't even make it all the way into suspend, since the back LED isn't blinking.

vtrenton commented 1 year ago

Good callout! I didn't even notice that myself. What is interesting is that the system looks like it is trying to reach the suspend state but never actually gets there. Since I mostly suspend my laptop instead of powering it down, it's easy for me to pull logs from previous boots with journalctl -b -1. This doesn't give me a lot of information, other than that I can see the system "Starting Suspend".

These are the last messages in the journal file:

Sep 05 00:58:34 athena systemd[1]: Reached target Sleep.
Sep 05 00:58:34 athena systemd[1]: Starting System Suspend...
Sep 05 00:58:34 athena rfkill[33551]: block set for type bluetooth

Hope this information helps some!

JRHaven commented 1 year ago

My darp8 is doing this consistently with the latest firmware. The light starts flashing after suspending, but after hitting the power button, it goes solid green and the display doesn't wake up. I also recorded a video of me demonstrating the entire process, if it shows you anything you haven't seen. It happens to me 100% of the time I suspend my system, and it's really ruining my workflow. But yes, thank you for everything you have already done, I'm reverting my firmware right away.

leviport commented 1 year ago

Thanks for the video! I have a hunch that the two different suspend (or lack thereof) behaviors are meaningful, but I'm still not sure how exactly.

MrPenguin07 commented 1 year ago

Just to be the contrarian;

Since updating my darp8 to 2023-08-18_a8dd6c2 my suspend finally IS working. By that I mean compared to previous firmware it isn't hot when I pull it out of my carry bag and it isn't chewing ~5% battery per hr anymore.

It's fixed one of my biggest gripes ... I may have an issue with keyboard backlight however still unsure if it's my config.

DrymarchonShaun commented 1 year ago

> It's fixed one of my biggest gripes ... I may have an issue with keyboard backlight however still unsure if it's my config.

Does it happen to be similar to #399?

MrPenguin07 commented 1 year ago

> > It's fixed one of my biggest gripes ... I may have an issue with keyboard backlight however still unsure if it's my config.
>
> Does it happen to be similar to #399?

Negative;

For me the keyboard backlight works fine via the Fn keys during POST, but shortly after the kernel loads the keys turn off and no longer respond at all. However, this only happens when booting one of my OSes; on Windows and Pop!_OS it works fine. The thing is, I didn't change anything on this particular OS: the /sys/class/leds/system76_acpi:kbd files are all there, and the Fn keys actually change their values (brightness, color), but the keys just never light up. The kernel module is definitely loaded, and older kernels that previously worked now don't either.

I want to blame my OS, but this literally started the day I upgraded the firmware. For me, the now-fixed (in my case) suspend and the disabling of the M.E. are of greater benefit than the keys.

Still investigating the cause before I open an issue here.

MrPenguin07 commented 1 year ago

FWIW relevant to this thread, I suspend via loginctl --no-ask-password suspend with the 2023-08-18_a8dd6c2 firmware and have had no issues resuming.

... and I quite enjoy taking it out of my bag and finding it's not hot to the touch and missing ~30% battery after a few hours like previously.

uSpike commented 1 year ago

> FWIW relevant to this thread, I suspend via loginctl --no-ask-password suspend with the 2023-08-18_a8dd6c2 firmware and have had no issues resuming.
>
> ... and quite enjoy taking it out of my bag and it's not hot to the touch and missing ~30% battery after few hours like previously.

I tried systemctl --no-ask-password suspend and I still got the same behavior as originally explained.

MrPenguin07 commented 1 year ago

> > FWIW relevant to this thread, I suspend via loginctl --no-ask-password suspend with the 2023-08-18_a8dd6c2 firmware and have had no issues resuming. ... and quite enjoy taking it out of my bag and it's not hot to the touch and missing ~30% battery after few hours like previously.
>
> I tried systemctl --no-ask-password suspend and I still got the same behavior as originally explained.

I rebooted into Pop!_OS and tried to replicate what you and others see. I thought it might be a systemd interaction, since I run OpenRC; however, once again suspend works perfectly well for me.

No issues with the keyboard backlight on Pop!_OS either, only that the color changes back to white upon resume. Not sure if that's normal, but it's no problem.

I hope the devs won't revert to the S3 "fix" we previously had here...

crawfxrd commented 10 months ago

I will need a unit with the issue. None of our units have it.

I can only assume this is another case like galp5 where newer batches were changed in some way without us knowing.

The old firmware is still available in my comment above.

crawfxrd commented 10 months ago

If someone with the issue is willing to flash and test a custom build, the things I would need tested are:

uSpike commented 10 months ago

I'm willing to flash a custom build

vtrenton commented 10 months ago

There was a firmware downgrade pushed to my machine, and as soon as it was applied the problem went away for me, so my issue is 'fixed' on 2022-11-21_b337ac6. I can see I now have a new 2023-09-08_42bf7a6 version available, but instead of applying that I'm willing to run a custom build as well, for science!

uSpike commented 10 months ago

I also test/debug embedded x86 computers for my day job so feel free to ask me to do whatever :)

crawfxrd commented 10 months ago

These are built from 2e4e34bf83ff.

Extract the ZIP to a FAT32 volume on a GPT partitioned USB drive to use. Have a second drive with the old firmware in case you need to flash back via USB.

vtrenton commented 10 months ago

Ok currently running the 'both' firmware.

$ cat /sys/class/dmi/id/bios_version
2023-10-20_2e4e34b-dirty

I was also able to reproduce the issue on 2023-09-08_42bf7a6.
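
When comparing test builds, it can help to split the version string reported in /sys/class/dmi/id/bios_version into its parts. The Python sketch below assumes the `DATE_COMMIT[-dirty]` shape seen in this thread; the function name is hypothetical, and reading "-dirty" as "built from a tree with local changes" is the usual git convention rather than something stated here.

```python
import re
from datetime import date

# Shape assumed from the versions quoted in this thread, e.g.
# 2022-11-21_b337ac6 or 2023-10-20_2e4e34b-dirty.
VERSION = re.compile(r"^(\d{4})-(\d{2})-(\d{2})_([0-9a-f]+)(-dirty)?$")


def parse_firmware_version(text):
    """Split a firmware version string into build date, commit prefix and dirty flag."""
    match = VERSION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised firmware version: {text!r}")
    year, month, day, commit, dirty = match.groups()
    return {
        "date": date(int(year), int(month), int(day)),
        "commit": commit,
        "dirty": dirty is not None,
    }
```

For the custom build above, the parsed commit can then be checked as a prefix of the commit the test builds were made from (2e4e34bf83ff).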

uSpike commented 10 months ago

I'll test this evening

vtrenton commented 10 months ago

Interesting... I was able to reproduce the issue with darp8-no-aer-rtd3.zip, but not with darp8-no-aer.zip; that firmware hasn't shown the issue yet. I'll give it a few more tries before moving on to darp8-no-rtd3.zip. Hope this helps narrow it down a bit.

vtrenton commented 10 months ago

@crawfxrd Just a heads up - I was able to reproduce the issue with all 3 firmware files provided.

andyrub18 commented 9 months ago

I'm currently using firmware 2023-09-08_42bf7a6. I just updated this Sunday and began to have this issue as well. Do you have a stable firmware I can roll back to until the issue is resolved, please?

leviport commented 9 months ago

@andyrub18 that was addressed further up in this thread: https://github.com/system76/firmware-open/issues/469#issuecomment-1708588635

andyrub18 commented 9 months ago

Thank you for your response

smithzvk commented 9 months ago

Had this issue with a8dd6c2, having it again with 42bf7a6. If there is anything I can do to help you diagnose or troubleshoot, please let me know.

crawfxrd commented 9 months ago

Might as well ask for lspci -nn output then.

If you have other drives to test with, try swapping them. If someone can reproduce it with only a Samsung 980 PRO installed, then I will be more convinced that it's not just AER/RTD3.
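
To compare drives across reports, the NVMe controllers can be picked out of `lspci -nn` output by PCI class code. The Python sketch below is illustrative only: the function name is hypothetical, and the line pattern is written against the `lspci -nn` format that appears in this thread.

```python
import re

# slot, class name, [class code], description, [vendor:device], optional revision
LINE = re.compile(
    r"^(?P<slot>\S+) (?P<class_name>.+?) \[(?P<cls>[0-9a-f]{4})\]: "
    r"(?P<desc>.+) \[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
    r"(?: \(rev [0-9a-f]+\))?$"
)

NVME_CLASS = "0108"  # PCI class: non-volatile memory controller


def nvme_controllers(lspci_nn_output):
    """Return (slot, description, 'vendor:device') for each NVMe controller."""
    found = []
    for line in lspci_nn_output.splitlines():
        match = LINE.match(line.strip())
        if match and match.group("cls") == NVME_CLASS:
            ids = f"{match.group('vendor')}:{match.group('device')}"
            found.append((match.group("slot"), match.group("desc"), ids))
    return found
```

The vendor:device pair is what distinguishes, for example, a Samsung controller from a WD one when the model strings are ambiguous.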

vtrenton commented 9 months ago

I think mine came with a 980 Pro and I added an extra one in as well. So I can confidently say I can reproduce this with a 980 Pro:

$ lsblk -d -o +MODEL | grep -v 'loop\|zram'
NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS                  MODEL
nvme1n1 259:0    0 931.5G  0 disk                              Samsung SSD 980 PRO 1TB
nvme0n1 259:1    0 931.5G  0 disk                              Samsung SSD 980 PRO 1TB

Also, here is the output of lspci -nn:

00:00.0 Host bridge [0600]: Intel Corporation Device [8086:4621] (rev 02)
00:02.0 VGA compatible controller [0300]: Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:46a6] (rev 0c)
00:06.0 PCI bridge [0604]: Intel Corporation 12th Gen Core Processor PCI Express x4 Controller #0 [8086:464d] (rev 02)
00:07.0 PCI bridge [0604]: Intel Corporation Alder Lake-P Thunderbolt 4 PCI Express Root Port #0 [8086:466e] (rev 02)
00:0a.0 Signal processing controller [1180]: Intel Corporation Platform Monitoring Technology [8086:467d] (rev 01)
00:0d.0 USB controller [0c03]: Intel Corporation Alder Lake-P Thunderbolt 4 USB Controller [8086:461e] (rev 02)
00:0d.2 USB controller [0c03]: Intel Corporation Alder Lake-P Thunderbolt 4 NHI #0 [8086:463e] (rev 02)
00:14.0 USB controller [0c03]: Intel Corporation Alder Lake PCH USB 3.2 xHCI Host Controller [8086:51ed] (rev 01)
00:14.2 RAM memory [0500]: Intel Corporation Alder Lake PCH Shared SRAM [8086:51ef] (rev 01)
00:14.3 Network controller [0280]: Intel Corporation Alder Lake-P PCH CNVi WiFi [8086:51f0] (rev 01)
00:15.0 Serial bus controller [0c80]: Intel Corporation Alder Lake PCH Serial IO I2C Controller #0 [8086:51e8] (rev 01)
00:15.1 Serial bus controller [0c80]: Intel Corporation Alder Lake PCH Serial IO I2C Controller #1 [8086:51e9] (rev 01)
00:16.0 Communication controller [0780]: Intel Corporation Alder Lake PCH HECI Controller [8086:51e0] (rev 01)
00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:51bd] (rev 01)
00:1c.7 PCI bridge [0604]: Intel Corporation Alder Lake PCH-P PCI Express Root Port #9 [8086:51bf] (rev 01)
00:1d.0 PCI bridge [0604]: Intel Corporation Device [8086:51b0] (rev 01)
00:1f.0 ISA bridge [0601]: Intel Corporation Alder Lake PCH eSPI Controller [8086:5182] (rev 01)
00:1f.3 Audio device [0403]: Intel Corporation Alder Lake PCH-P High Definition Audio Controller [8086:51c8] (rev 01)
00:1f.4 SMBus [0c05]: Intel Corporation Alder Lake PCH-P SMBus Host Controller [8086:51a3] (rev 01)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Alder Lake-P PCH SPI Controller [8086:51a4] (rev 01)
01:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO [144d:a80a]
2d:00.0 SD Host controller [0805]: O2 Micro, Inc. SD/MMC Card Reader Controller [1217:8621] (rev 01)
2e:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
2f:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO [144d:a80a]

Tinkr2347 commented 9 months ago

Also got the same problem on a Darp8 since the firmware upgrade. Screen doesn't wake up after lid re-open, have to hard reset.

crawfxrd commented 9 months ago

EC output from darp8 with debug logging enabled:

Suspend:

[11:13:43.146] peci_get_temp: response timeout
[11:13:43.281] VWCTRL1 4
[11:13:43.281] VWIDX7 17
[11:13:43.286] VW_HOST_RST_ACK = 11
[11:13:43.286] VWCTRL1 4
[11:13:43.286] VWIDX7 16
[11:13:43.290] VW_HOST_RST_ACK = 10
[11:13:43.331] VWCTRL1 1B
[11:13:43.331] VWIDX2 76
[11:13:43.331] VWIDX3 30
[11:13:43.331] ESPI PLTRST# 10
[11:13:43.335] VWIDX41 B3
[11:13:43.335] VWCTRL1 1B
[11:13:43.339] VWIDX2 76
[11:13:43.339] VWIDX3 30
[11:13:43.339] VWIDX41 B3
[11:13:43.343] POWER_STATE_S3

Resume:

[11:13:55.667] C0: Power switch press
[11:13:55.692] VWCTRL1 1B
[11:13:55.692] VWIDX2 76
[11:13:55.692] VWIDX3 30
[11:13:55.692] VWIDX41 B9
[11:13:55.696] VWCTRL1 1B
[11:13:55.696] VWIDX2 77
[11:13:55.696] VWIDX3 30
[11:13:55.700] VWIDX41 B9
[11:13:55.700] POWER_STATE_S0
[11:13:55.745] DE: Power switch release

There is no coreboot output on resume. Logs stop after update_power_state().

PLTRST# doesn't de-assert. Are we hanging the SoC?
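
The observation above can be checked mechanically against an EC debug log. The Python sketch below parses the timestamped lines and reports, for the events following the last power switch press, whether POWER_STATE_S0 was reached and whether any PLTRST# line was logged. It assumes that a PLTRST# change shows up as a line starting with "ESPI PLTRST#", as in the suspend log; it does not interpret the value that follows. All function names are hypothetical.

```python
import re

ENTRY = re.compile(r"^\[(\d{2}:\d{2}:\d{2}\.\d{3})\]\s+(.*)$")


def parse_ec_log(text):
    """Return (timestamp, message) tuples for lines shaped like '[hh:mm:ss.mmm] msg'."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entries.append((match.group(1), match.group(2).strip()))
    return entries


def resume_summary(text):
    """Summarise what the EC logged after the last power switch press, if any."""
    entries = parse_ec_log(text)
    presses = [i for i, (_, msg) in enumerate(entries) if "Power switch press" in msg]
    if not presses:
        return None
    after = [msg for _, msg in entries[presses[-1]:]]
    return {
        "reached_s0": "POWER_STATE_S0" in after,
        # Assumption: a PLTRST# change is logged as a line starting with this text.
        "pltrst_logged": any(msg.startswith("ESPI PLTRST#") for msg in after),
    }
```

For the resume log above, the summary would show S0 reached with no PLTRST# line, which is the combination being questioned.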

DrymarchonShaun commented 9 months ago

Having built and flashed 2024-01-10_6c402c3 my darp8 will now resume after suspending.

EDIT: I think what I originally described is a separate issue, my darp8 is entering s0ix, but not reaching the lowest system power state.
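
Since both S3 and s0ix come up in this thread, it may help to confirm which suspend mode the kernel is set to use. On Linux, /sys/power/mem_sleep lists the available modes with the active one in brackets, where "deep" is S3 suspend-to-RAM and "s2idle" is suspend-to-idle. A minimal Python sketch for reading that format, with a hypothetical function name:

```python
def parse_mem_sleep(contents):
    """Parse /sys/power/mem_sleep, e.g. 's2idle [deep]' -> ('deep', ['s2idle', 'deep'])."""
    active = None
    available = []
    for mode in contents.split():
        if mode.startswith("[") and mode.endswith("]"):
            mode = mode[1:-1]  # the bracketed entry is the active mode
            active = mode
        available.append(mode)
    return active, available
```

Passing it the text read from /sys/power/mem_sleep returns the active mode first, followed by every mode the platform offers.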

andyrub18 commented 8 months ago

> Having built and flashed 2024-01-10_6c402c3 my darp8 will now resume after suspending.
>
> EDIT: I think what I originally described is a separate issue, my darp8 is entering s0ix, but not reaching the lowest system power state.

Where can I find this version?

DrymarchonShaun commented 8 months ago

I built it using the scripts included in this repo. I'd upload it but I'm not sure if the system76 team would want someone random uploading compiled binaries.

leviport commented 8 months ago

> I'd upload it but I'm not sure if the system76 team would want someone random uploading compiled binaries.

Thanks, that's probably for the best.

The version should be released today, so the update should be available in the System76 firmware updater soon enough. That is in Settings > Firmware in Pop, or in the Firmware Manager application in Ubuntu with the System76 repo added.