pop-os / linux

Pop!_OS fork of https://launchpad.net/ubuntu/+source/linux

Linux 6.7.2 #295

Closed mmstick closed 3 months ago

leviport commented 4 months ago

Here's where I'm at currently. Please check off anything I haven't gotten to yet.

Kernel Releases Tests

leviport commented 4 months ago

All tests on Dev One are also complete. My Dev One is happy with 6.7.2

XV-02 commented 4 months ago

I am seeing failures to resume from suspend on Pang12 with the 6.7.2 kernel. The journalctl logs show the system entering s2idle suspend as the last message. I want to check with other drives, as I was using a WD550 to test, and those have had known issues that affected suspend.

The failures aren't consistent, either, so I want to look into it a bit more.

XV-02 commented 4 months ago

Okay, so, things I have seen:

Doesn't appear to be drive specific, which is kinda a plus, I suppose.

I have yet to have the Pang12 fail on the first suspend/resume. It has reliably failed to resume after no more than the third attempted suspend/resume cycle. It doesn't appear to matter how suspend is triggered: fwts, systemctl suspend, autosuspend, changing the power-button behaviour, and closing the lid all had the same results.

The other aspect is that I am sometimes seeing logs cut off in the middle of entering suspend. That makes me think that it's less of a resume issue than a suspend issue.

I'm going to try a slightly newer kernel (6.7.4) at this juncture and see if the issue persists.
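The repeated suspend/resume check described above could be scripted roughly as follows. This is only a sketch, not the procedure actually used in this thread: it assumes `rtcwake` from util-linux is installed, that it is run as root, and that its `freeze` mode corresponds to s2idle on the machine under test. The function names, the log path, and the 30-second wake timer are all hypothetical. Because a failed resume hangs the machine, the script cannot report the failure itself; instead it syncs a line to disk before every suspend so the last attempted cycle can be read back after a forced reboot.

```python
import os
import subprocess


def rtcwake_command(seconds):
    """Build an rtcwake call that suspends to idle and wakes after `seconds`."""
    # "freeze" is rtcwake's name for suspend-to-idle (assumed to match s2idle).
    return ["rtcwake", "-m", "freeze", "-s", str(seconds)]


def _log(log, message):
    """Write one line and force it to disk so it survives a hard hang."""
    log.write(message + "\n")
    log.flush()
    os.fsync(log.fileno())


def run_cycles(cycles, log_path, seconds=30, runner=subprocess.run):
    """Attempt `cycles` suspend/resume cycles; return how many completed."""
    completed = 0
    with open(log_path, "a") as log:
        for n in range(1, cycles + 1):
            _log(log, f"cycle {n}: suspending")
            result = runner(rtcwake_command(seconds))
            if result.returncode != 0:
                _log(log, f"cycle {n}: rtcwake exited with {result.returncode}")
                break
            _log(log, f"cycle {n}: resumed")
            completed = n
    return completed
```

A call such as `run_cycles(5, "/var/tmp/suspend-cycles.log")` would attempt five cycles. If the log ends on a "suspending" line with no matching "resumed" line after a reboot, that cycle is the one that hung.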

XV-02 commented 4 months ago

I tested the Ubuntu builds of 6.7.3 and 6.8.rc3, and built and checked 6.7.4. All of them exhibit the same failure to suspend/resume within 3 attempts.

I'm not certain where in the process the failure is occurring. The journalctl logs are inconclusive, usually ending after reporting that the system has reached the s2idle sleep state. However, some of them cut out before that point, which makes me think that it may be tied more to the suspend side of the process than the resume side. dmesg is not recoverable, and the Pangolin is not an open EC/firmware system, so I can't leverage those tools as far as I am aware.

Most often, I'm seeing a failure during my second attempt to suspend and resume on a given boot, though I have had failures that appear the same on both my first and third attempts. I have never had three successful consecutive attempts to suspend and resume on a single boot.

The current 6.6 series kernel doesn't exhibit these issues.

I'm going to look at Ubuntu's pre-6.7.2 builds in the 6.7 series and see if any of those exhibit the issue, to try to narrow it down. After that, I'll look at post-6.6.10 builds in the 6.6 series to try to shrink the range of changes that could have introduced this issue.

XV-02 commented 4 months ago

Hosanna!

It looks like whatever is causing our Pang12 pains was introduced between 6.7.rc5 and 6.7.rc6.

RC5 has no issues entering suspend and resuming many times in a row. RC6, however, fails after no more than 2 or 3 cycles. Hopefully this will help to narrow down the root cause.
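With a known-good build (RC5) and a known-bad build (RC6), the remaining search is a bisection over the changes in between, which is what `git bisect` automates on a kernel tree. The sketch below only illustrates that logic and is not something run in this thread. `first_bad` and `is_good` are hypothetical names, and `is_good` stands in for "boot this build and see whether it survives the suspend/resume test". Since the failure here is intermittent and has shown up as late as the third cycle, a real check would have to pass several consecutive cycles before calling a build good.

```python
def first_bad(candidates, is_good):
    """Return the first bad entry in `candidates`.

    `candidates` is ordered oldest to newest. The first entry is assumed
    to be known good and the last entry known bad, so neither is retested.
    """
    lo, hi = 0, len(candidates) - 1  # lo: newest known good, hi: oldest known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(candidates[mid]):
            lo = mid
        else:
            hi = mid
    return candidates[hi]
```

Each test halves the remaining range, so the number of builds to boot grows with the logarithm of the number of changes between the two release candidates, not with their count.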

leviport commented 4 months ago

I think this also made the initramfs files larger? My Dev One now has a full ESP (it still has the original install with the 500 MB ESP) even though I switched the initramfs compression method from zstd to xz. That's something we should probably be careful of, as I'm sure plenty of users still have 500 MB ESPs.
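A quick way to see how close an ESP is to full is to compare its free space against the size of the files about to be copied onto it. This is a minimal sketch, assuming the ESP is mounted at `/boot/efi`; the mount point, the function names, and the `reserve_bytes` margin are illustrative assumptions and not part of any Pop!_OS tooling.

```python
import shutil


def esp_free_bytes(esp_path="/boot/efi"):
    """Free space, in bytes, on the filesystem that holds `esp_path`."""
    return shutil.disk_usage(esp_path).free


def fits_on_esp(needed_bytes, esp_path="/boot/efi", reserve_bytes=0):
    """True if `needed_bytes` fit while leaving `reserve_bytes` unused."""
    return esp_free_bytes(esp_path) - reserve_bytes >= needed_bytes
```

For example, `fits_on_esp(os.path.getsize(path_to_initramfs))` (after `import os`) would report whether a given initramfs image still fits, where `path_to_initramfs` is whichever image is about to be installed.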

leviport commented 4 months ago

Pang13 also appears to be affected by the suspend-breaking bug.

Pang14 is unaffected.

leviport commented 4 months ago

On Pang13, kernel 6.8.0-rc1 restores correct suspend behavior.

jackpot51 commented 4 months ago

This will need to pick #298