desultory / ugrd

A minimalistic initramfs generator, designed for FDE
GNU General Public License v2.0

[help] hibernation support #82

Open julie-de-ville opened 2 days ago

julie-de-ville commented 2 days ago

Is hibernation resume support included automatically? I don't see a module for it, and my system supports hibernation, but it will not resume after suspending to disk. I am using Gentoo with Linux 6.6.52.

desultory commented 2 days ago

It does not currently have hibernation support, but this shouldn't be very hard to add.

Have you tried using the "resume=" kernel command line arg? I have not tested this but assumed it could be used to resume off of unencrypted swap.

julie-de-ville commented 2 days ago

That would be great, I found it much better than dracut. I have the resume parameter in grub.

desultory commented 2 days ago

Do you have encrypted swap? I think it shouldn't be too hard to add support for simple resuming, I'm just not sure why the builtin kernel parameter doesn't work alone. Maybe it ignores that option if an initrd is used?

desultory commented 2 days ago

https://wiki.gentoo.org/wiki/Custom_Initramfs/Hibernation

I've hesitated to add support in the initrd as there are many things which can go wrong. I think passing the supplied info to /sys/power/resume could be enough.
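A minimal sketch of the idea above, written in Python purely for illustration (it is not the code from ugrd): read resume= from the kernel command line and hand the device to /sys/power/resume as major:minor. The function names are hypothetical, and the sketch assumes resume= is already a plain device path.

```python
import os
import shlex


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {key: value}; bare flags map to ''."""
    params = {}
    for token in shlex.split(cmdline):
        # Split on the first '=' only, so 'root=UUID=abc' keeps 'UUID=abc'.
        key, _, value = token.partition("=")
        params[key] = value
    return params


def device_number(path: str) -> str:
    """Return 'major:minor' for the device node at path."""
    rdev = os.stat(path).st_rdev
    return f"{os.major(rdev)}:{os.minor(rdev)}"


def try_resume(cmdline: str, sysfs_path: str = "/sys/power/resume") -> bool:
    """Attempt a resume if resume= was passed.

    Returns False when no resume= is present, True when an attempt was made.
    If the write returns at all, the system did not resume.
    """
    resume = parse_cmdline(cmdline).get("resume")
    if not resume:
        return False
    with open(sysfs_path, "w") as f:
        f.write(device_number(resume))
    return True
```

In an initramfs this would be called with the contents of /proc/cmdline before the root filesystem is mounted.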

desultory commented 2 days ago

https://github.com/desultory/ugrd/commit/0175133404709c34686c811012d900aca4107077 I'm not sure about this, but I think it may be a reasonable start. resume= expects a device path, and I'm not sure if it makes sense to resolve a UUID.
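If resolving identifiers did turn out to be worthwhile, one possible approach is to map the tag to the symlink directories udev normally creates under /dev/disk. This is only a sketch: the helper name is made up, it is not what the linked commit does, and it assumes those symlinks exist in the initramfs environment.

```python
from pathlib import Path

# Conventional udev symlink directories (assumed to be populated).
TAG_DIRS = {
    "UUID": "by-uuid",
    "PARTUUID": "by-partuuid",
    "LABEL": "by-label",
    "PARTLABEL": "by-partlabel",
}


def resume_source_path(spec: str, disk_root: str = "/dev/disk") -> Path:
    """Map a resume= value to a path; plain device paths pass through unchanged."""
    tag, sep, value = spec.partition("=")
    if sep and tag in TAG_DIRS:
        return Path(disk_root) / TAG_DIRS[tag] / value
    return Path(spec)
```

For example, resume_source_path("PARTUUID=1234-abcd") gives /dev/disk/by-partuuid/1234-abcd, while resume_source_path("/dev/sda2") is returned as-is.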

desultory commented 2 days ago

Right now, it enters a fail state if a resume partition is passed and it fails to resume, which means it won't boot normally. I'm not sure how much weight to give those warnings about data loss. If the system hibernated and you reboot without considering the saved state, there could be serious data loss, similar to a hard shutdown. I think most systems just attempt to resume from swap if possible, but continue if not. I'm going to check/test a bit more.

desultory commented 2 days ago

> That would be great, I found it much better than dracut. I have the resume parameter in grub.

Did you manually add the parameter? I think for the sake of safety, I will be forcing resume attempts if resume= is set. It's potentially very dangerous to start a system fresh if it expects to return from a hibernation state.

julie-de-ville commented 2 days ago

Yes, I manually added it to grub. I booted twice in that manner, but I haven't attempted resuming again since I have narrowed it down to initramfs. That is a good idea.

desultory commented 2 days ago

> Yes, I manually added it to grub. I booted twice in that manner, but I haven't attempted resuming again since I have narrowed it down to initramfs. That is a good idea.

Yeah, it's probably not safe to resume right now. You could have a bit of data loss each time, as it expects to later resume from the current RAM state.

As far as I know, there is no way to know if a system should resume at boot time, other than the passed kernel cmdline parameters. It's safest to prevent booting if that was passed but cannot be performed.


I've gotten some help looking into this, and I think it is probably safe to boot if it can't resume, and the device is found. If the resume source device cannot be found, something is wrong and booting will stop (in the current form the resume module takes)

julie-de-ville commented 2 days ago

As for path/uuid, I have my resume parameter set to the mapped, decrypted luks partition, at /dev/mapper/gentoo-root. Btw, I am using btrfs on an encrypted luks partition, and for S4 I suspend to a swapfile on the root subvolume, if that helps. Also, I had to set the resume_offset as a parameter as well, since I am using a swapfile.

desultory commented 2 days ago

> As for path/uuid, I have my resume parameter set to the mapped, decrypted luks partition, at /dev/mapper/gentoo-root. Btw, I am using btrfs on an encrypted luks partition, and for S4 I suspend to a swapfile on the root subvolume, if that helps. Also, I had to set the resume_offset as a parameter as well, since I am using a swapfile.

resume should be set to your swap partition. The support I just added only handles plain swap; I may add support for encrypted swap too.

As it is, it will boot normally if it can't resume using the provided resume path (using a partuuid is best). If it cannot find the source device, it will enter a fail state.
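The policy described here can be summed up in a small decision function. This is an illustrative Python sketch with hypothetical names, not the module's actual code; it only models what happens once a resume attempt has returned without resuming.

```python
from enum import Enum


class BootAction(Enum):
    CONTINUE = "continue normal boot"
    FAIL = "stop and enter the fail state"


def after_resume_attempt(resume_arg: str, device_found: bool) -> BootAction:
    """Decide what to do when the system did not resume."""
    if not resume_arg:
        # No resume= on the command line: nothing was requested.
        return BootAction.CONTINUE
    if not device_found:
        # resume= was passed but the source device is missing: something is wrong.
        return BootAction.FAIL
    # Device exists but held no usable image: treat as a fresh boot.
    return BootAction.CONTINUE
```

The only case that stops the boot is a resume= value pointing at a device that cannot be found.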

julie-de-ville commented 2 days ago

Oh I see, my swap is encrypted so I don't think I will be able to test it safely, though I would really like to get it working, so if there is anything I can do to help lmk. I tried to get it working with dracut by adding the crypt and resume modules, and including /etc/crypttab, but it hung on a black screen with a spinning wheel.

desultory commented 2 days ago

> Oh I see, my swap is encrypted so I don't think I will be able to test it safely, though I would really like to get it working, so if there is anything I can do to help lmk. I tried to get it working with dracut by adding the crypt and resume modules, and including /etc/crypttab, but it hung on a black screen with a spinning wheel.

Using encrypted swap is somewhat complex. I'd have to add a new method specifically for opening it, which can run first. The really tricky part is that it would likely need to run on every boot. When you're booting fresh, that is just wasted time because there is nothing to resume from.

You mentioned /dev/mapper/gentoo-root as being your resume device. Is that your root partition? Do you have a separate partition that is luks encrypted, or are you using lvm? Resuming from swap files is especially difficult because the file offset on the disk must be set.

> As for path/uuid, I have my resume parameter set to the mapped, decrypted luks partition, at /dev/mapper/gentoo-root. Btw, I am using btrfs on an encrypted luks partition, and for S4 I suspend to a swapfile on the root subvolume, if that helps. Also, I had to set the resume_offset as a parameter as well, since I am using a swapfile.

Are you sure the resume_offset you found is correct? That is my only guess as to why dracut may fail, unless it doesn't properly support hibernation from luks devices. Did it ever ask for the key for your root device? I really hesitate to even open luks devices because I'm not sure whether opening them has any chance of writing anything. If you touch storage devices at all between hibernation and resuming, that may be harmful.
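For reference, resume_offset is counted in pages from the start of the resume= device, not in bytes. A rough Python sketch of the conversion follows; the helper names are hypothetical, and it assumes you already have the swap file's physical byte offset from a filesystem-specific tool (on btrfs, for example, something like btrfs inspect-internal map-swapfile).

```python
import os


def resume_offset_pages(physical_offset_bytes: int, page_size: int = 0) -> int:
    """Convert a physical byte offset into the page units resume_offset expects."""
    if page_size <= 0:
        page_size = os.sysconf("SC_PAGE_SIZE")  # page size of the running system
    if physical_offset_bytes % page_size:
        raise ValueError("swap file offset is not page aligned")
    return physical_offset_bytes // page_size


def resume_args(device: str, physical_offset_bytes: int, page_size: int = 0) -> str:
    """Build the resume= and resume_offset= kernel arguments for a swap file."""
    offset = resume_offset_pages(physical_offset_bytes, page_size)
    return f"resume={device} resume_offset={offset}"
```

An offset that was recorded in bytes, or computed with the wrong page size, would point the kernel at the wrong place on the device.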

desultory commented 2 days ago

This isn't really the best UX, but maybe you could take advantage of the fact that it fails, and then choose whether to manually run "crypt_init", which runs the cryptsetup unlock procedure, or tell it to ignore resuming. Then you can exit the recovery shell, and on the second pass it will see the resume source and use that.