tomsom / yoga-linux

Run Linux on the Lenovo Yoga 7 14 (14ARB7) with AMD Ryzen 6800U (Rembrandt).
https://github.com/tomsom/yoga-linux/wiki

Samsung SSD removed when resuming from s2idle #9

Closed 0x9fff00 closed 11 months ago

0x9fff00 commented 1 year ago

Documenting this in case anyone else has this issue:

The Samsung PM9B1 512G SSD found in some units reports eui as 0001000200030004 when resuming from s2idle, causing the device to be removed with this error in dmesg:

nvme nvme0: identifiers changed for nsid 1

I’ve submitted an initial patch for this here: https://lore.kernel.org/all/20221116171727.4083-1-git@augustwikerfors.se/T/

Kernel Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=217649

Edit: see https://lore.kernel.org/all/20230731185103.18436-1-mario.limonciello@amd.com/ for a more general patch rejected due to concerns about data corruption for other devices, use the quirk patch instead
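
For anyone who wants to check their own unit, here is a minimal Python sketch of the comparison described above. It assumes the namespace EUI-64 is exposed in a sysfs file such as /sys/block/nvme0n1/eui and may be printed as space-separated hex bytes; both the path and the format are assumptions about your kernel, and `normalize_eui`, `is_bogus_eui` and `read_eui` are hypothetical helper names. It only tests whether a value equals the placeholder quoted in this issue.

```python
from pathlib import Path

# Placeholder EUI-64 the PM9B1 reports after resuming from s2idle (from this issue).
BOGUS_EUI = "0001000200030004"


def normalize_eui(raw: str) -> str:
    """Keep only hex digits so '00 01 ...' and '0001...' compare equal."""
    return "".join(ch for ch in raw.lower() if ch in "0123456789abcdef")


def is_bogus_eui(raw: str) -> bool:
    """Return True if the given EUI string matches the known placeholder value."""
    return normalize_eui(raw) == BOGUS_EUI


def read_eui(path: str = "/sys/block/nvme0n1/eui") -> str:
    """Read the EUI from sysfs. The default path is an assumption; adjust as needed."""
    return Path(path).read_text()
```

Calling `is_bogus_eui(read_eui())` once before suspend and once after resume should show whether the identifier changed to the placeholder, provided the device is still present after resume.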

stuarthayhurst commented 1 year ago

Amazing, I was trying to troubleshoot this myself. It seemed similar to this issue, but the kernel options that worked for them didn't work for me.

Okazakee commented 1 year ago

Any news on the fw update?

0x9fff00 commented 1 year ago

Any news on the fw update?

Kanchan Joshi at Samsung responded here: https://lore.kernel.org/all/20221206055928.GB24451@test-zns/

Took more time than I wanted. Firmware team mentioned that issue existed in this firmware. This is fixed in new firmware, but bit of travel time is involved when official release from OEM (Lenovo) comes out.

Hope the information is sufficient, and quirk can go in.

If required, Acked-by: Kanchan Joshi <joshi.k@samsung.com>

Okazakee commented 1 year ago

Good news, I'm waiting for this to finally switch to Fedora. I don't remember: can I update the firmware later from GNOME or some 3rd party app, or do I just have to wait for Windows Update?

stuarthayhurst commented 1 year ago

I contacted Lenovo, they said their technical team is aware of the firmware and working on an update. Not sure if that'll be available through fwupd, or if it'll only be available through their Windows tools, but I'll update this when I get the update.

Okazakee commented 1 year ago

I finally switched totally to fedora, the fact that after all this time it is still broken kills my whole experience... Hope to see news asap.

stuarthayhurst commented 1 year ago

Been checking daily, no update yet. If they release something, it'll be on one of these 3 pages: the LVFS PM9B1 firmware updates, the Lenovo Yoga 7 Gen 7 Drivers & Downloads page, or Lenovo Support.

Previously, they've released an update to their NVMe firmware update tool and added it to the drivers and downloads page for the affected device. Since this laptop isn't officially supported on Linux, that's the most likely route, but we may see an update on the LVFS if they have any affected ThinkPads.

EDIT: For some previous firmware releases, the update was done through Windows Update, so that's another option, but nothing yet either

Okazakee commented 1 year ago

I mean, I could just run an SD card with Windows just to update if it gets released in WU. It's just a shame to see how much stuff like this gets scheduled at the lowest priority, even for capitalistic companies...

stuarthayhurst commented 1 year ago

I went for something similar, I've kept mine as a dual boot with the minimum size for Windows to still be able to install updates. I'd rather run 100% Linux, but Windows is still needed for BIOS and firmware updates sadly.

Okazakee commented 1 year ago

I built the 6.1.7 kernel and booted in debug mode, and it behaved just the same as before the kernel switch. I followed this guide; the only thing I did differently was saving the patch with a .txt extension instead of .patch.

stuarthayhurst commented 1 year ago

What did you use to apply the patch to the kernel source?

Okazakee commented 1 year ago

What did you use to apply the patch to the kernel source?

I'm building it again with .patch, i was very tired at the time, it was just explained to be ".patch". I will let you know what happens in an hour or so

Okazakee commented 1 year ago

It built, but nothing changed at all

tomorrow i will try to build with this other guide

stuarthayhurst commented 1 year ago

Strange, if you're definitely applying the patch to a fresh build of the kernel, definitely booting the custom kernel and definitely have a Samsung PM9B1 SSD, I can't see why it wouldn't work

Okazakee commented 1 year ago

Strange, if you're definitely applying the patch to a fresh build of the kernel, definitely booting the custom kernel and definitely have a Samsung PM9B1 SSD, I can't see why it wouldn't work

edit: I just checked, it seems like my samsung ssd is another one @stuarthayhurst : model SAMSUNG MZAL4512HBLU-00BL2 (7L1QHXC7) serial n S67MNF1T654690

stuarthayhurst commented 1 year ago

Yep that's the correct SSD, its product name is PM9B1, but there's very little from Samsung published on it

Okazakee commented 1 year ago

Yep that's the correct SSD, its product name is PM9B1, but there's very little from Samsung published on it

Even trying to install the .cab from dell says no supported device

stuarthayhurst commented 1 year ago

Same for my SSD, it's got Dell's vendor ID on that firmware file or something along those lines, so the 2 SSDs don't appear the same to LVFS. Since it's an OEM SSD, we need Lenovo to publish the firmware themselves sadly

Okazakee commented 1 year ago

Same for my SSD, it's got Dell's vendor ID on that firmware file or something along those lines, so the 2 SSDs don't appear the same to LVFS. Since it's an OEM SSD, we need Lenovo to publish the firmware themselves sadly

I swear if not for the warranty I would already have swapped that with another ssd

Okazakee commented 1 year ago

New bios update, no ssd fw upgrade 💀

stuarthayhurst commented 1 year ago

I asked for an update from tech support, as they still hadn't published the release after months. They said this:

Thank you for contacting Lenovo technical support.

Please note that once the update is released, it will be available on our Lenovo Support site. Thank you for your patience in the meantime.

For any further questions, do not hesitate to contact us.

So no update on a time scale, but at least we know where it'll be uploaded :shrug:

Okazakee commented 1 year ago

I honestly think they won't. There is not a single update related to SSD, "memory" or "storage". It is more plausible that it comes through MS updates, or embedded in a BIOS update or something like that.

stuarthayhurst commented 1 year ago

There's probably nothing related to storage, as they haven't uploaded any drivers related to it yet, and it'll appear when they push a driver / update in that category. I can't say I'm impressed with the service, I'm not sure when to cut my losses and just replace the SSD and suck up any warranty problems.

Okazakee commented 1 year ago

There's probably nothing related to storage, as they haven't uploaded any drivers related to it yet, and it'll appear when they push a driver / update in that category. I can't say I'm impressed with the service, I'm not sure when to cut my losses and just replace the SSD and suck up any warranty problems.

I hope to hold out until this summer. I'm just grateful the laptop's battery life is awesome with the AMD P-State implementation.

Okazakee commented 1 year ago

No news i guess, right? Ridiculous

stuarthayhurst commented 1 year ago

I'm stuck on this one - the FW team reached out to Samsung to see if there were fixes that we should be picking up and Samsung reported back that there are no Linux issues reported against this part :(

I'll try reaching out to the Samsung person on the upstream mailing list...because this is going nowhere fast right now.

Taken from https://github.com/fwupd/firmware-lenovo/issues/308, hopefully this leads somewhere

stuarthayhurst commented 1 year ago

1) Fix issue that display SSD number. 2) Fix issue that the ssd will lost after doing crisis.

New BIOS, found this in the changelog.. I'm busy right now, but I'll trial this later, hopefully that's what we needed.

Okazakee commented 1 year ago
  1. Fix issue that display SSD number.
  2. Fix issue that the ssd will lost after doing crisis.

New BIOS, found this in the changelog.. I'm busy right now, but I'll trial this later, hopefully that's what we needed.

installing it now

edit: Nothing to see, same issues as before. Now it shows a black screen for some time and then (sometimes) shows unmounting errors.

SeekingGoodTech commented 1 year ago

I was able to get suspend/resume working with the patch that 0x9fff00 provided. I wonder if the Linux kernel policy to not include this quirk is appropriate given there has been no firmware update issued.

Okazakee commented 1 year ago

I was able to get suspend/resume working with the patch that 0x9fff00 provided. I wonder if the Linux kernel policy to not include this quirk is appropriate given there has been no firmware update issued.

are you on fedora? if so, can you tell me how to correctly patch my kernel?

SeekingGoodTech commented 1 year ago

are you on fedora? if so, can you tell me how to correctly patch my kernel?

I am on Ubuntu. I followed the instructions to build a kernel from here: https://davidaugustat.com/linux/how-to-compile-linux-kernel-on-ubuntu I imagine there would be similar instructions somewhere for fedora.

In my case, I downloaded the source for 6.2.11 from kernel.org. I applied the patch directly to the downloaded kernel source. Before building the kernel, I edited the following file:

linux-6.2.11/drivers/nvme/host/pci.c

I used gedit, but you can use whatever editor you're comfortable with. To add the quirk, change lines 3450 and 3451 to read:

	{ PCI_DEVICE(0x144d, 0xa80b),   /* Samsung PM9B1 256G and 512G */
		.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES | NVME_QUIRK_BOGUS_NID, },

then made the kernel per the instructions in the above link. After installing the kernel and rebooting, I now have suspend and resume working. I also needed to disable Secure Boot in BIOS to allow this new kernel to boot.
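
Before spending time on a build, it may help to confirm the edit actually landed in the source tree. Below is a rough Python sketch that checks whether the text of pci.c contains an entry for PCI ID 0x144d:0xa80b with NVME_QUIRK_BOGUS_NID set. The function name `has_bogus_nid_quirk` is hypothetical, and the regular expression assumes the entry is formatted like the one quoted above; it is not a C parser.

```python
import re

# Matches the device-table entry for the PM9B1 and captures everything up to
# the closing brace, so the quirk flags can be inspected.
ENTRY_RE = re.compile(
    r"PCI_DEVICE\(\s*0x144d\s*,\s*0xa80b\s*\)(?P<body>[^}]*)\}",
    re.IGNORECASE,
)


def has_bogus_nid_quirk(source: str) -> bool:
    """Return True if the PM9B1 entry exists and lists NVME_QUIRK_BOGUS_NID."""
    match = ENTRY_RE.search(source)
    return bool(match) and "NVME_QUIRK_BOGUS_NID" in match.group("body")
```

To use it, read the file first, for example `has_bogus_nid_quirk(open("linux-6.2.11/drivers/nvme/host/pci.c").read())`, and rebuild only if it returns True.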

shiftyphil commented 1 year ago

Some progress according to MarkRHPearson on Lenovo forums.

https://forums.lenovo.com/t5/Other-Linux-Discussions/Firmware-Update-for-Samsung-SSD-PM9B1-in-Yoga-7-Gen-7-14ARB7/m-p/5196929?page=2#5984302

It will take time to get this implemented, qualified and rolled out so guidance is expect to see the FW updates show up in July.

Not sure why this didn't get communicated from Samsung to Lenovo back in December, but at least we're getting there now.

Okazakee commented 1 year ago

I cannot believe my eyes: not only a confirmation of the issue, but an ETA. I'm going to cry now, thx for the news

Okazakee commented 1 year ago

Any news on implementation? I am just waiting for new bios hoping for the best

stuarthayhurst commented 1 year ago

It's a firmware update for the SSD, which is unlikely to come via BIOS. Lenovo have an SSD firmware update tool here, which I'd assume is most likely (Windows only). Since the SSD has also shown up in a Linux supported model, it might appear on the LVFS, here. I've been checking those 2 and the generic support download page for our laptop every few days, I'll update as soon as I've got something.

stuarthayhurst commented 1 year ago

Hm, another new firmware for Dell laptops with the PM9B1 SSD, so that's 3 that Dell have released in the time Lenovo have released 0.

realquaker commented 1 year ago

I bought a new Samsung Evo 1TB. It works fine, no issues. I then sold the buggy OEM SSD to a Windows user.

Okazakee commented 1 year ago

I bought new Samsung Evo 1TB. It works fine, no issues. And then sold buggy OEM SSD to Windows user.

Do you know if it can fit 80mm ssds?

stuarthayhurst commented 1 year ago

From https://psref.lenovo.com/syspool/Sys/PDF/Yoga/Yoga_7_14ARB7/Yoga_7_14ARB7_Spec.pdf, it has a 2280 slot. I took mine apart to test an AX210 card, and can confirm it has one. Since we have PM9B1s, there's a metal insert to make the short SSD fit in a 2280.

0x9fff00 commented 1 year ago

This was brought up on the kernel mailing list again due to a Bugzilla report. Also, someone else submitted the quirk on 11 June but without mentioning suspend, I've asked if it's the same issue.

Okazakee commented 1 year ago

Do you still have the same error? I need a screenshot of the error to get a free SSD replacement from Lenovo. My problem is that on Tumbleweed the error does not appear; it just stops working or outputs other errors.

stuarthayhurst commented 1 year ago

If the system doesn't totally die, swap to another TTY and try your luck with journalctl or dmesg.

EDIT: You might have better luck if you open the TTY first to something like dmesg -w, then trigger the crash
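
If scrolling through the raw output is awkward, a small filter can pick out the relevant message. This is only a sketch in Python: `find_nsid_changes` is a hypothetical helper, and the pattern is based solely on the error line quoted at the top of this issue, so other kernels may word it differently.

```python
import re
from typing import Iterable, Iterator, Tuple

# Matches the message quoted in this issue, e.g.
# "nvme nvme0: identifiers changed for nsid 1"
NSID_CHANGED = re.compile(r"nvme (nvme\d+): identifiers changed for nsid (\d+)")


def find_nsid_changes(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (controller, nsid) for every matching kernel log line."""
    for line in lines:
        match = NSID_CHANGED.search(line)
        if match:
            yield match.group(1), int(match.group(2))
```

One way to use it is to pipe `dmesg -w` into a script that loops over `find_nsid_changes(sys.stdin)` and prints each hit, then trigger the suspend/resume cycle from another TTY.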

Okazakee commented 1 year ago

If the system doesn't totally die, swap to another TTY and try your luck with journalctl or dmesg.

EDIT: You might have better luck if you open the TTY first to something like dmesg -w, then trigger the crash

It worked with the dmesg watch, thank you very much.

EDIT: I suggest you all do this, it's a free replacement. An IBM technician will replace the SSD at my house, so I will just clone the current partitions to an img using Clonezilla and then restore the img to the new SSD.

stuarthayhurst commented 1 year ago

EDIT: I suggest you all to do this, it's a free replacement, an IBM technician will replace the SSD in my house, so i will just clone the current partitions in an img using clonezilla and then restore the img to the new ssd.

I've sent my laptop in for repairs twice already (dead USB board and dodgy trackpad), I really can't be arsed to send it back again. TBH I'll probably just throw a 1 or 2 TB, higher spec SSD in next time I see a good sale.

Just a heads up, this laptop seems very reliant on EFI vars to boot the right OS. After resetting the laptop to send in for repairs and then dding the original image back on, it fails to boot. Livebooting, mounting the partitions, loading and mounting the efivars, chrooting in and reinstalling GRUB fixes this reliably.

Strange, since USB sticks seem to boot fine. Perhaps it just skips scanning the internal drive for bootable partitions?

shiftyphil commented 1 year ago

Just a heads up, this laptop seems very reliant on EFI vars to boot the right OS.

GRUB seems to not install to the default location (EFI\boot\bootx64.efi) these days. Might be a distro decision or a GRUB decision, possibly based on whether Windows Boot Manager is there already.

Anything that wipes the EFI variables leads to an annoying recovery process to restore them, or you can copy GRUB (or something like fallback.efi that can automatically set the EFI variables) to the default location.
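
As a rough illustration of the "copy to the default location" option, here is a Python sketch. Everything about the layout is an assumption: `copy_to_fallback` is a hypothetical helper, the ESP mount point and the loader's path differ between distros, and some loaders need companion files next to them to work from the fallback path. It refuses to overwrite an existing fallback loader.

```python
import shutil
from pathlib import Path


def copy_to_fallback(esp_root: str, loader: str) -> Path:
    """Copy `loader` (relative to the ESP root) to EFI/BOOT/BOOTX64.EFI.

    Raises instead of overwriting an existing fallback loader.
    """
    esp = Path(esp_root)
    source = esp / loader
    target = esp / "EFI" / "BOOT" / "BOOTX64.EFI"
    if not source.is_file():
        raise FileNotFoundError(f"loader not found: {source}")
    if target.exists():
        raise FileExistsError(f"fallback loader already present: {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return target
```

A call might look like `copy_to_fallback("/boot/efi", "EFI/fedora/grubx64.efi")`, where both arguments are examples rather than paths confirmed for this laptop. Back up the ESP first.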

superm1 commented 1 year ago

Does this patch help?

0001-nvme-Don-t-fail-if-NSIDs-change-on-resume.patch

0x9fff00 commented 1 year ago

@superm1 With that patch applied I still get the identifiers changed for nsid 1 message but the SSD now works after resume. Thanks!

superm1 commented 1 year ago

@superm1 With that patch applied I still get the identifiers changed for nsid 1 message but the SSD now works after resume. Thanks!

Yup fully expected. I'll send some variation of this upstream for discussion soon and CC you when I do.

superm1 commented 1 year ago

OK here's the version I think I can take for discussion. Should functionally work the same and show the error once after the first suspend/resume cycle. Can you please test it as well?

v2.txt

0x9fff00 commented 1 year ago

@superm1 Yes that patch works too. On the first resume I get...

nvme nvme0: identifiers changed for nsid 1
nvme nvme0: use of /dev/disk/by-id/ may cause data corruption

...and on later resumes I instead get

nvme nvme0: Ignoring bogus Namespace Identifiers