NixOS / nixpkgs

NixOS: Handle lvm cached volumes properly #15516

Open viric opened 8 years ago

viric commented 8 years ago

There is some work to do to get proper activation of lvm cached volumes in initrd. I currently have a rootfs cached lv, so I care.

When activating the LV (vgchange -a y in our stage 1), LVM wants to call /usr/sbin/cache_check (from thin_provisioning_tools), and refuses to activate the volume if it cannot. I tested this in stage 1, but it may affect stage 2 as well.

Currently I implemented a quick hack in this commit: https://github.com/viric/nixpkgs/commit/497d5e0b1fd5be2af39b91a3f2fe1df7b5d9c85d

But there may be other options:

A new NixOS module would help, too: one that adds the modules "dm-cache", "dm-cache-smq", "dm-cache-mq", and "dm-cache-cleaner" to stage 1.

What do you think? How to proceed? @aszlig @domenkozar @edolstra

marcinfalkiewicz commented 8 years ago

If we are at it, you could expand it to thin* and era* tools (for dm-thin-pool and dm-era modules), both provided by thin_provisioning_tools.

viric commented 7 years ago

The #14394 fix broke my setup and I could not boot NixOS 17.03. I used packageOverrides on lvm2 to get it working again.

1pakch commented 6 years ago

As @dwe11er pointed out, the same issue occurs when vgchange -a y stumbles upon thin pools in stage 1 and wants to call thin_check from the same package.

1pakch commented 6 years ago

I think a clean solution would be to create a config option that holds the list of executables, and their aliases, needed in stage 1. The build script would copy them into the image (or, optionally, link them from $targetRoot), making them accessible from stage 1.

Judging by the filesystem support modules, this is a common pattern that is asking to be factored out. It would also allow users to easily add the programs they want in stage 1 (e.g. for troubleshooting).
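
A minimal sketch of what such an option could look like, assuming it is layered on top of the existing boot.initrd.extraUtilsCommands hook and its copy_bin_and_libs helper. The option name boot.initrd.extraExecutables and its source/aliases fields are hypothetical and do not exist in nixpkgs. It also does nothing about store paths hardcoded inside other binaries such as lvm.

```nix
{ config, lib, ... }:
let
  # Hypothetical option; the name is made up for illustration.
  cfg = config.boot.initrd.extraExecutables;
in
{
  options.boot.initrd.extraExecutables = lib.mkOption {
    type = lib.types.listOf (lib.types.submodule {
      options = {
        source = lib.mkOption {
          type = lib.types.path;
          description = "Executable to copy into the stage 1 image.";
        };
        aliases = lib.mkOption {
          type = lib.types.listOf lib.types.str;
          default = [ ];
          description = "Extra names under which the executable is reachable.";
        };
      };
    });
    default = [ ];
    description = "Executables (and their aliases) to make available in stage 1.";
  };

  # Copy each executable with its libraries, then symlink every alias to it.
  config.boot.initrd.extraUtilsCommands = lib.concatMapStrings (exe: ''
    copy_bin_and_libs ${exe.source}
    ${lib.concatMapStrings (alias: ''
      ln -sf "$(basename ${exe.source})" "$out/bin/${alias}"
    '') exe.aliases}
  '') cfg;
}
```

With something like this, a user could list an entry whose source is "${pkgs.thin-provisioning-tools}/bin/pdata_tools" and whose aliases are [ "cache_check" "thin_check" ], instead of writing the copy commands by hand.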

Ekleog commented 6 years ago

Currently having what I believe is the same issue: adding

boot.initrd.kernelModules = [ "dm-thin-pool" ];

triggers

stage-1-init: /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-thin-provisioning-tools-0.6.3/bin/thin_check: execvp failed: No such file or directory

I guess that's due to this nuke-refs.

I wonder if adding -e ${thin-provisioning-tools} or equivalent to the nuke-refs call would solve the issue, though unfortunately I won't have much time to test it for the time being.

isomarcte commented 6 years ago

I am also having the same issue as @Ekleog. Are there any workarounds for this?

Ekleog commented 6 years ago

As I don't have an “essential for boot” filesystem on a thin volume, my workaround so far (until I or someone else has time to look into this properly) has been to add options = [ "nofail" ]; to my fileSystems."/fubar" and to run vgchange -ay on each boot to bring the volumes up. I guess it'd be possible to run vgchange -ay from a systemd service, but I avoided doing that in order to make sure I don't forget to look into this sometime :)
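
For anyone who does want the automated variant, here is a rough sketch of that workaround, assuming a oneshot systemd unit is acceptable. The unit name lvm-activate-late is made up, and this only activates the volumes; mounting /fubar afterwards may still need a separate step.

```nix
{ lib, pkgs, ... }:
{
  # Do not block boot if the thin volume is not available yet.
  fileSystems."/fubar".options = [ "nofail" ];

  # Hypothetical unit name; runs "vgchange -ay" once after stage 2 has started.
  systemd.services.lvm-activate-late = {
    description = "Activate LVM volumes that stage 1 could not bring up";
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      Type = "oneshot";
      ExecStart = "${lib.getBin pkgs.lvm2}/bin/vgchange -ay";
    };
  };
}
```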

SnVIZQ commented 6 years ago

I use a trick posted on the #nixos IRC channel quite some time ago. The post linked to a configuration.nix snippet that puts the necessary tools into the initrd image so that lvmcache(7) works:

  boot.initrd.extraUtilsCommands = ''
    # Put thin-provisioning-tools into extra-utils and patch lvm accordingly.
    # NOTE: this works only because the thin-provisioning-tools store path,
    # including the version, is longer than the extra-utils path; the
    # difference is filled with zero bytes. If it were the other way around,
    # it might not work, because whatever follows the full path to the tool
    # would be overwritten. There seems to be another (documentation) string
    # right after the full path, which might not be that important. Anyway,
    # I did not spend time figuring out how to avoid the patching in case it
    # cannot be done the proper way.
    for BIN in ${pkgs.thin-provisioning-tools}/bin/*; do
      copy_bin_and_libs $BIN
      SRC="(?<all>/[a-zA-Z0-9/]+/[0-9a-z]{32}-[0-9a-z-.]+(?<exe>/bin/$(basename $BIN)))"
      REP="\"$out\" . \$+{exe} . \"\\x0\" x (length(\$+{all}) - length(\"$out\" . \$+{exe}))"
      PRP="s,$SRC,$REP,ge"
      ${pkgs.perl}/bin/perl -p -i -e "$PRP" $out/bin/lvm
    done
  ''; 
  boot.initrd.extraUtilsCommandsTest = ''
    # The thin-provisioning-tools use pdata_tools binary as a link target of
    # supported utils so it is enough to check only one, the others should
    # "just" work...
    $out/bin/pdata_tools cache_check -V
  '';
  boot.initrd.availableKernelModules = [ "xhci_pci" "ahci" "nvme" "usb_storage" "usbhid" "sd_mod" "dm_persistent_data" "dm_bio_prison" "dm_bufio" "libcrc32c" "crc32c_generic" "dm_cache_smq" ];
  boot.initrd.kernelModules = [ "dm_cache" ];

YorikSar commented 5 years ago

I've implemented a proper addition of these tools to initrd at #46541. Whoever is interested in this, please review.

tbsmoest commented 4 years ago

What is the status here?

arttuys commented 4 years ago

Anything new so far?

I ran into this same issue and implemented an ad-hoc patch on my system using YorikSar's workaround here. The workaround has worked fine so far, but ideally I'd use an official solution, so as to avoid accidentally breaking it during an upgrade.

stale[bot] commented 3 years ago

Hello, I'm a bot and I thank you in the name of the community for opening this issue.

To help our human contributors focus on the most-relevant reports, I check up on old issues to see if they're still relevant. This issue has had no activity for 180 days, and so I marked it as stale, but you can rest assured it will never be closed by a non-human.

Thesola10 commented 2 years ago

Not stale, still can't boot my lvm-cache system unattended

cizra commented 2 years ago

@Thesola10 I can - here's what I needed to add:

# get rid of scary warning about missing cache_check
services.lvm.boot.thin.enable = true;

# if you don't have enough kernel modules, you'll get this error message:
# cache: Error creating cache's policy
boot.initrd.kernelModules = [ "dm-cache" "dm-cache-smq" "dm-cache-mq" "dm-cache-cleaner" ];
boot.kernelModules = [ "kvm-amd" "dm-cache" "dm-cache-smq" "dm-persistent-data" "dm-bio-prison" "dm-clone" "dm-crypt" "dm-writecache" "dm-mirror" "dm-snapshot" ];

# You need to specify preLVM = false, as it defaults to true
boot.initrd.luks.devices."decrypted".preLVM = false;

Thesola10 commented 2 years ago

@cizra yep, eventually found the relevant modules on a different question about Ubuntu, thanks for the heads-up though.

I still think there should be config options for these module sets (boot.lvm.enableCache maybe?) -- after all, isn't it part of the point of NixOS?
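
Something along these lines is what I have in mind; a minimal sketch only, where boot.lvm.enableCache is a hypothetical option that does not exist in nixpkgs, and the module list is simply the one from @cizra's comment above.

```nix
{ config, lib, ... }:
let
  cfg = config.boot.lvm;
in
{
  # Hypothetical option; not part of nixpkgs.
  options.boot.lvm.enableCache =
    lib.mkEnableOption "dm-cache support for LVM cached volumes in stage 1";

  config = lib.mkIf cfg.enableCache {
    # Ship cache_check and friends in the initrd.
    services.lvm.boot.thin.enable = lib.mkDefault true;

    # Kernel modules needed to activate a cached LV early in boot.
    boot.initrd.kernelModules = [
      "dm-cache"
      "dm-cache-smq"
      "dm-cache-mq"
      "dm-cache-cleaner"
    ];
  };
}
```

A configuration would then only need boot.lvm.enableCache = true; rather than a hand-maintained module list.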