lvmteam / lvm2

Mirror of upstream LVM2 repository
https://gitlab.com/lvmteam/lvm2
GNU General Public License v2.0

lv not available after reboot #29

Closed kevinwd closed 3 months ago

kevinwd commented 4 years ago

LV not available after reboot. How do I solve this problem?

After boot, I can run "vgchange -ay vg0" manually to work around it. Is there any way to solve this automatically?

lvm version:

  LVM version:     2.03.10(2)-git (2020-03-26)
  Library version: 1.02.173-git (2020-03-26)
  Driver version:  4.35.0
  Configuration:   ./configure

cdegroot commented 3 years ago

FWIW, I had the same issue today. I split my root into separate root, home, and var volumes (with different RAID options), and var never comes up automatically while home is fine. A vgchange fixes it, but for now - until I have time for a deep dive - I can only boot via a rescue boot followed by that manual step.

Ubuntu 20.04, lvm versions:

$ dpkg -l | grep -i lvm2
ii  liblvm2cmd2.03:amd64  2.03.07-1ubuntu1  amd64  LVM2 command library
ii  lvm2                  2.03.07-1ubuntu1  amd64  Linux Logical Volume Manager

johnnybubonic commented 3 years ago

Affected here as well, on Arch with version 2.03.11.

I cannot boot without manual intervention.

I run a single PV (an md device, which is assembled fine during boot), a single VG, and four LVs:

# pvdisplay
  --- Physical volume ---
  PV Name               /dev/md126
  VG Name               vg_md_data
  PV Size               <8.19 TiB / not usable 6.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              2146137
  Free PE               0
  Allocated PE          2146137
  PV UUID               39P9ip-784q-cbhx-x4Bd-jAUn-2aVS-OY8nkA
# vgdisplay
  --- Volume group ---
  VG Name               vg_md_data
  System ID             
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  7
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                4
  Open LV               4
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               <8.19 TiB
  PE Size               4.00 MiB
  Total PE              2146137
  Alloc PE / Size       2146137 / <8.19 TiB
  Free  PE / Size       0 / 0   
  VG UUID               njDiqo-D6qA-GxWl-rkVR-DzLd-5M8M-l21cvm
# lvdisplay
  --- Logical volume ---
  LV Path                /dev/vg_md_data/data_lv_root
  LV Name                data_lv_root
  VG Name                vg_md_data
  LV UUID                goMGj1-Fxvi-0ub2-KOT2-mHe6-UnYj-fkKsQ1
  LV Write Access        read/write
  LV Creation host, time [REDACTED.FQDN.TLD], 2021-01-05 23:07:04 -0500
  LV Status              available
  # open                 1
  LV Size                10.00 GiB
  Current LE             2560
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     6144
  Block device           254:0

  --- Logical volume ---
  LV Path                /dev/vg_md_data/data_lv_home
  LV Name                data_lv_home
  VG Name                vg_md_data
  LV UUID                xpY3Jh-oDlC-toOQ-qI3B-u7r3-9Yfp-wrZCXn
  LV Write Access        read/write
  LV Creation host, time [REDACTED.FQDN.TLD], 2021-01-05 23:07:29 -0500
  LV Status              available
  # open                 1
  LV Size                3.01 TiB
  Current LE             789504
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     6144
  Block device           254:1

  --- Logical volume ---
  LV Path                /dev/vg_md_data/data_lv_var
  LV Name                data_lv_var
  VG Name                vg_md_data
  LV UUID                AT2OuK-HtFm-rB1n-ZrFF-afok-0GSw-nobY22
  LV Write Access        read/write
  LV Creation host, time [REDACTED.FQDN.TLD], 2021-01-05 23:07:45 -0500
  LV Status              available
  # open                 1
  LV Size                515.00 GiB
  Current LE             131840
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     6144
  Block device           254:2

  --- Logical volume ---
  LV Path                /dev/vg_md_data/data_lv_opt
  LV Name                data_lv_opt
  VG Name                vg_md_data
  LV UUID                5stzDn-fRuK-04ib-71wN-EH5E-Hgdr-TmDeaS
  LV Write Access        read/write
  LV Creation host, time [REDACTED.FQDN.TLD], 2021-01-05 23:08:30 -0500
  LV Status              available
  # open                 1
  LV Size                4.66 TiB
  Current LE             1222233
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     6144
  Block device           254:3

I always have a failed unit on boot, lvm2-pvscan@9:126.service, for seemingly no reason:

# systemctl status lvm2-pvscan@9\:126.service 
● lvm2-pvscan@9:126.service - LVM event activation on device 9:126
     Loaded: loaded (/usr/lib/systemd/system/lvm2-pvscan@.service; static)
     Active: failed (Result: signal) since Sat 2021-03-27 18:24:49 EDT; 51min ago
       Docs: man:pvscan(8)
   Main PID: 505 (code=killed, signal=TERM)

Mar 27 18:24:48 archlinux systemd[1]: Starting LVM event activation on device 9:126...
Mar 27 18:24:48 archlinux lvm[505]:   Logging initialised at Sat Mar 27 22:24:48 2021
Mar 27 18:24:48 archlinux lvm[505]:   Set umask from 0022 to 0077
Mar 27 18:24:48 archlinux lvm[505]: pvscan  Creating directory "/run/lock/lvm"
Mar 27 18:24:48 archlinux lvm[505]: pvscan  pvscan[505] PV /dev/md126 online, VG vg_md_data is complete.
Mar 27 18:24:48 archlinux lvm[505]: pvscan  pvscan[505] VG vg_md_data run autoactivation.
Mar 27 18:24:48 archlinux lvm[505]: pvscan  PVID 39P9ip-784q-cbhx-x4Bd-jAUn-2aVS-OY8nkA read from /dev/md126 last written to /dev/md127.
Mar 27 18:24:48 archlinux lvm[505]: pvscan  pvscan[505] VG vg_md_data not using quick activation.
# journalctl --boot -u lvm2-pvscan@9\:126.service
-- Journal begins at Fri 2020-10-30 19:55:00 EDT, ends at Sat 2021-03-27 19:18:13 EDT. --
Mar 27 18:24:48 archlinux systemd[1]: Starting LVM event activation on device 9:126...
Mar 27 18:24:48 archlinux lvm[505]:   Logging initialised at Sat Mar 27 22:24:48 2021
Mar 27 18:24:48 archlinux lvm[505]:   Set umask from 0022 to 0077
Mar 27 18:24:48 archlinux lvm[505]: pvscan  Creating directory "/run/lock/lvm"
Mar 27 18:24:48 archlinux lvm[505]: pvscan  pvscan[505] PV /dev/md126 online, VG vg_md_data is complete.
Mar 27 18:24:48 archlinux lvm[505]: pvscan  pvscan[505] VG vg_md_data run autoactivation.
Mar 27 18:24:48 archlinux lvm[505]: pvscan  PVID 39P9ip-784q-cbhx-x4Bd-jAUn-2aVS-OY8nkA read from /dev/md126 last written to /dev/md127.
Mar 27 18:24:48 archlinux lvm[505]: pvscan  pvscan[505] VG vg_md_data not using quick activation.

As shown, no apparent failures or errors, but the service is still marked as failed (presumably a timeout?).
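
One way to check whether a timeout (or something else) killed the unit, as a rough sketch using standard systemd unit properties:

systemctl show lvm2-pvscan@9\:126.service -p Result -p ExecMainStatus -p TimeoutStartUSec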

Interestingly, that is the correct maj:min:

# lsblk /dev/md126
NAME                      MAJ:MIN RM  SIZE RO TYPE   MOUNTPOINT
md126                       9:126  0  8.2T  0 raid10 
├─vg_md_data-data_lv_root 254:0    0   10G  0 lvm    /root
├─vg_md_data-data_lv_home 254:1    0    3T  0 lvm    /home
├─vg_md_data-data_lv_var  254:2    0  515G  0 lvm    /var
└─vg_md_data-data_lv_opt  254:3    0  4.7T  0 lvm    /opt

Any ideas, LVM team? This is driving me insane. I can't even get vgchange -a y to work on boot (it finds my VG and LVs fine, but it never creates mappings for them in either /dev/mapper/ or /dev/<VG_name>/ - normally it does so in both) unless I explicitly assign my VG to auto_activation_volume_list. This was all working fine a little less than a month ago.
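
For reference, a rough sketch of that auto_activation_volume_list workaround as it sits in my /etc/lvm/lvm.conf (activation section; the VG name is mine):

activation {
    # only this VG is listed, so only it gets autoactivated
    auto_activation_volume_list = [ "vg_md_data" ]
}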

fraimondo commented 3 years ago

Same here; it happens randomly on boot. Sometimes everything is active, sometimes it is not. I still don't know how to solve it.

lvm version
  LVM version:     2.03.07(2) (2019-11-30)
  Library version: 1.02.167 (2019-11-30)
  Driver version:  4.42.0

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.2 LTS
Release:    20.04
Codename:   focal

RAID5: /dev/md126
VG: vgdata
LV: tp_data_pool (thin pool)
LV: home_athena (on top of the thin pool), holding a LUKS-encrypted file system

During boot, I can see the following messages:

Jun 02 22:59:44 kronos lvm[2130]:   pvscan[2130] PV /dev/md126 online, VG vgdata is complete.
Jun 02 22:59:44 kronos lvm[2130]:   pvscan[2130] VG vgdata skip autoactivation.

Then this:

Jun 02 22:59:44 kronos systemd[1]: Finished LVM event activation on device 253:0.
Jun 02 22:59:44 kronos systemd[1]: Mounted /boot/efi.
Jun 02 22:59:44 kronos systemd[1]: Finished LVM event activation on device 9:126.
Jun 02 23:00:14 kronos systemd[1]: systemd-fsckd.service: Succeeded.
Jun 02 23:01:10 kronos systemd[1]: dev-disk-by\x2duuid-5773b7da\x2d5b0f\x2d4347\x2db718\x2d377912a6209a.device: Job dev-disk-by\x2duuid-5773b7da\x2d5b0f\x2d4347\x2db718\x2d377912a6209a.device/start timed out.

The disk that times out corresponds to the UUID of the unencrypted file system: /dev/vgdata/home_athena

So the problem is that LVM is not activating the thin-pool and LVs.

After an unsuccessful boot, I see the thin-pool and LV inactive:

lvdisplay 
  --- Logical volume ---
  LV Name                tp_data_pool
  VG Name                vgdata
  LV UUID                3Q4KuX-RoZn-nZ9n-Irk8-PG6Q-ChBu-QV86Fi
  LV Write Access        read/write
  LV Creation host, time kronos, 2021-06-01 23:35:13 +0200
  LV Pool metadata       tp_data_pool_tmeta
  LV Pool data           tp_data_pool_tdata
  LV Status              NOT available
  LV Size                27.65 TiB
  Current LE             7249279
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto

  --- Logical volume ---
  LV Path                /dev/vgdata/home_athena
  LV Name                home_athena
  VG Name                vgdata
  LV UUID                3AwKjA-juIX-lGOB-uqqa-LO4i-5G0v-EMydMe
  LV Write Access        read/write
  LV Creation host, time kronos, 2021-06-01 23:53:14 +0200
  LV Pool name           tp_data_pool
  LV Status              NOT available
  LV Size                100.00 GiB
  Current LE             25600
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto

  --- Logical volume ---
  LV Path                /dev/vgxubuntu/root
  LV Name                root
  VG Name                vgxubuntu
  LV UUID                hseqPL-yxW0-3NBe-8SUc-zaX5-g7CI-LUUgmT
  LV Write Access        read/write
  LV Creation host, time xubuntu, 2021-05-29 13:05:46 +0200
  LV Status              available
  # open                 1
  LV Size                929.32 GiB
  Current LE             237907
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1

  --- Logical volume ---
  LV Path                /dev/vgxubuntu/swap_1
  LV Name                swap_1
  VG Name                vgxubuntu
  LV UUID                TSy0SB-N0r8-TRrP-9BZE-7B6k-GeJ5-NGMXsg
  LV Write Access        read/write
  LV Creation host, time xubuntu, 2021-05-29 13:05:46 +0200
  LV Status              available
  # open                 2
  LV Size                976.00 MiB
  Current LE             244
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2

If I run lvchange -a y vgdata/tp_data_pool, this activates the thin pool:

lvdisplay 
  --- Logical volume ---
  LV Name                tp_data_pool
  VG Name                vgdata
  LV UUID                3Q4KuX-RoZn-nZ9n-Irk8-PG6Q-ChBu-QV86Fi
  LV Write Access        read/write (activated read only)
  LV Creation host, time kronos, 2021-06-01 23:35:13 +0200
  LV Pool metadata       tp_data_pool_tmeta
  LV Pool data           tp_data_pool_tdata
  LV Status              available
  # open                 1
  LV Size                27.65 TiB
  Allocated pool data    0.01%
  Allocated metadata     10.42%
  Current LE             7249279
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     1024
  Block device           253:5

  --- Logical volume ---
  LV Path                /dev/vgdata/home_athena
  LV Name                home_athena
  VG Name                vgdata
  LV UUID                3AwKjA-juIX-lGOB-uqqa-LO4i-5G0v-EMydMe
  LV Write Access        read/write
  LV Creation host, time kronos, 2021-06-01 23:53:14 +0200
  LV Pool name           tp_data_pool
  LV Status              NOT available
  LV Size                100.00 GiB
  Current LE             25600
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto

Then I need to activate the other LV: lvchange -a y vgdata/home_athena

lvdisplay
  --- Logical volume ---
  LV Name                tp_data_pool
  VG Name                vgdata
  LV UUID                3Q4KuX-RoZn-nZ9n-Irk8-PG6Q-ChBu-QV86Fi
  LV Write Access        read/write (activated read only)
  LV Creation host, time kronos, 2021-06-01 23:35:13 +0200
  LV Pool metadata       tp_data_pool_tmeta
  LV Pool data           tp_data_pool_tdata
  LV Status              available
  # open                 2
  LV Size                27.65 TiB
  Allocated pool data    0.01%
  Allocated metadata     10.42%
  Current LE             7249279
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     1024
  Block device           253:5

  --- Logical volume ---
  LV Path                /dev/vgdata/home_athena
  LV Name                home_athena
  VG Name                vgdata
  LV UUID                3AwKjA-juIX-lGOB-uqqa-LO4i-5G0v-EMydMe
  LV Write Access        read/write
  LV Creation host, time kronos, 2021-06-01 23:53:14 +0200
  LV Pool name           tp_data_pool
  LV Status              available
  # open                 1
  LV Size                100.00 GiB
  Mapped size            3.30%
  Current LE             25600
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     1024
  Block device           253:7

and finally handle the encryption: cryptdisks_start home_athena.

Possible solution: set event_activation = 0 in lvm.conf. At least after changing that parameter it has worked, twice.
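
For reference, a rough sketch of that change (on my system the setting lives in the global section of /etc/lvm/lvm.conf):

global {
    # disable event-based autoactivation; activation then happens at
    # fixed points during boot instead
    event_activation = 0
}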

teigland commented 3 years ago

In each of the cases above, please collect the debug logging from the lvm commands. In the lvm.conf log{} section, set level=7 and file="/tmp/lvm.log" and send or post the log file for us to analyze. Or, add -vvvv (four v's) to the commands and collect the output. Thanks.
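
For reference, a minimal sketch of those settings in /etc/lvm/lvm.conf (only the options mentioned above; everything else left at defaults):

log {
    level = 7
    file = "/tmp/lvm.log"
}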

fraimondo commented 3 years ago

Is the output of lvmdump good enough? I can try rolling back the fix and running that command before applying it again.


teigland commented 3 years ago

Is the output of lvmdump good enough? I can try rolling back the fix and running that command before applying it again.

No, we need debugging from the pvscan commands that are run by the lvm2-pvscan services.

fraimondo commented 3 years ago

I managed to generate log files for three different boot scenarios:

  1. with event_activation = 0: lvm_noevent.log

  2. with event_activation = 1 and working properly: lvm_event_working.log

  3. with event_activation = 1 and not working: lvm_event_broken.log

Let me know if I can test something else for you. This system is going into production on Monday, and I won't be able to test any more after that.

teigland commented 3 years ago

lvm_event_broken.log

It appears that on your system the /run/lvm/ files may be persistent across boots, specifically the files in /run/lvm/pvs_online/ and /run/lvm/vgs_online/. For event-based autoactivation, pvscan requires that /run/lvm be cleared by reboot.
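
A quick way to check this, assuming a typical systemd layout where /run is a tmpfs that is recreated on every boot:

findmnt /run                                     # should be a tmpfs mount
ls -l /run/lvm/pvs_online/ /run/lvm/vgs_online/  # timestamps older than the current boot suggest stale files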

pharhp commented 2 years ago

Having a similar issue. I had to replace the DIMMs in my server and then struggled to upgrade Ubuntu. At present, one of the LVs is not active after a reboot.

Here is a boot log as requested above. lvm.log

Thank You!

fpc7063 commented 10 months ago

Same situation here on Debian Bookworm.

LVM version: 2.03.16(2) (2022-05-18)
Library version: 1.02.185 (2022-05-18)
Driver version: 4.47.0
Configuration: ./configure --build=x86_64-linux-gnu --prefix=/usr --includedir=${prefix}/include --mandir=${prefix}/share/man --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var --disable-option-checking --disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu --runstatedir=/run --disable-maintainer-mode --disable-dependency-tracking --libdir=/lib/x86_64-linux-gnu --sbindir=/sbin --with-usrlibdir=/usr/lib/x86_64-linux-gnu --with-optimisation=-O2 --with-cache=internal --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --with-default-pid-dir=/run --with-default-run-dir=/run/lvm --with-default-locking-dir=/run/lock/lvm --with-thin=internal --with-thin-check=/usr/sbin/thin_check --with-thin-dump=/usr/sbin/thin_dump --with-thin-repair=/usr/sbin/thin_repair --with-udev-prefix=/ --enable-applib --enable-blkid_wiping --enable-cmdlib --enable-dmeventd --enable-editline --enable-lvmlockd-dlm --enable-lvmlockd-sanlock --enable-lvmpolld --enable-notify-dbus --enable-pkgconfig --enable-udev_rules --enable-udev_sync --disable-readline

journalctl

Jan 18 16:12:35 localhost.localdomain systemd[1]: Listening on lvm2-lvmpolld.socket - LVM2 poll daemon socket.
Jan 18 16:12:35 localhost.localdomain systemd[1]: Starting lvm2-monitor.service - Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
Jan 18 16:12:35 localhost.localdomain lvm[479]: PV /dev/vdb3 online, VG os-vg is complete.
Jan 18 16:12:35 localhost.localdomain lvm[479]: VG os-vg finished
Jan 18 16:12:35 localhost.localdomain lvm[485]: PV /dev/vda3 online, VG os-vg is complete.
Jan 18 16:12:35 localhost.localdomain lvm[485]: VG os-vg finished

LVM SETUP

root@live:~# pvs
  PV         VG    Fmt  Attr PSize   PFree
  /dev/vda3  os-vg lvm2 a--  <15.00g 1.82g
  /dev/vdb3  os-vg lvm2 a--  <15.00g 1.82g
root@live:~# vgs
  VG    #PV #LV #SN Attr   VSize  VFree 
  os-vg   2   1   0 wz--n- 29.99g <3.65g
root@live:~# lvs
  LV    VG    Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  os-lv os-vg rwi---r--- 13.00g

Log for pvscan

When I run update-grub I get some strange output

Generating grub configuration file ...
error: unknown node 'os-lv_rimage_0'.     (x12)
Found linux image: /boot/vmlinuz-6.1.0-17-amd64
Found initrd image: /boot/initrd.img-6.1.0-17-amd64
error: unknown node 'os-lv_rimage_0'.     (x12)
Warning: os-prober will not be executed to detect other bootable partitions.
Systems on them will not be added to the GRUB boot configuration.
Check GRUB_DISABLE_OS_PROBER documentation entry.
Adding boot menu entry for UEFI Firmware Settings ...
done

I have a couple of scripts to replicate the issue in a VM. RAID0 works; RAID1 without --raidintegrity also works.

teigland commented 10 months ago

Jan 18 16:12:35 localhost.localdomain lvm[479]: PV /dev/vdb3 online, VG os-vg is complete.
Jan 18 16:12:35 localhost.localdomain lvm[479]: VG os-vg finished
Jan 18 16:12:35 localhost.localdomain lvm[485]: PV /dev/vda3 online, VG os-vg is complete.
Jan 18 16:12:35 localhost.localdomain lvm[485]: VG os-vg finished

That doesn't look right; maybe the /run/lvm files were not cleared as required (this was also mentioned earlier in this issue).

fpc7063 commented 10 months ago

Wouldn't that impact RAID1 and RAID0 without raidintegrity too? The chroot doesn't seem to have a /run/lvm directory when I first create it. Also, when booting from a live ISO, the LV created with --raidintegrity is inactive, while plain RAID1 and RAID0 work fine. vgchange os-vg --setautoactivation y returns "Volume Group autoactivation is already yes".

fpc7063 commented 10 months ago

There are no files in the ISO's /run/lvm, but the directory exists.

teigland commented 10 months ago

Left-over run files will affect the autoactivation of any VG (LV types shouldn't be relevant.) The "is complete" messages seem to indicate that incorrect temp files exist under /run/lvm/pvs_online/ and /run/lvm/vgs_online/.

Left-over run files could cause the VG to be autoactivated while it is still incomplete (some PVs not yet available). Handling that safely is an unsolved problem, which is why autoactivation always waits for the VG to be complete. If an incomplete VG with raid LVs is autoactivated anyway, the raid LVs may be activated in degraded mode shortly before all the PVs become available, which may explain some of the issues you're seeing.

So, to sort out this problem, you need to focus on the "PV ... online, VG ... complete" messages, and ensure that's happening correctly. Those are logged by the pvscan commands run from udev rules, and as mentioned earlier, they depend on /run/lvm/pvs_online and vgs_online temp files being cleared at each boot.

To debug the pvscans run by udev rules, you can enable udev debugging, and you can collect debug logging (-vvvv) from those specific pvscan commands.
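
For example (illustrative only; exact unit and rule file names vary between distros and versions):

udevadm control --log-priority=debug   # raise udev logging on the running daemon
journalctl -b -u systemd-udevd         # udev messages from the current boot
# debug output from the pvscan commands themselves can be captured with the
# lvm.conf log{} level/file settings mentioned earlier in this thread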

fpc7063 commented 10 months ago

As you can see, in the ISO there are no such left-over files (chroot). Since I'm using an ISO, there is no way it's persisting anything there. The root filesystem of my target installation is on the LVM RAID volume /dev/os-vg/os-lv, which doesn't mount. And as I said, if --raidintegrity is removed, or when using raid0, it works as intended.

I'm not familiar with udev debugging; would this be enough to provide the necessary info?

echo "udev_log=\"debug\"" >> /etc/udev/udev.conf
sed -i -e 's/# verbose = 0/verbose = 7/' /etc/lvm/lvm.conf

And would the log essentially be journalctl | grep lvm plus pvscan -vvvv?

bad-journalctl.log
bad-pvscan-vvvv.log
bad-vgchange-ay.log

Now without --raidintegrity:

good-journalctl.log
good-pvscan-vvvv.log
good-vgchange-ay.log

I'll try to take a better look at these logs tomorrow.

teigland commented 10 months ago

This doesn't look like a normal system boot, e.g. booting a standard RHEL install. If you're not doing that, then whatever you are doing is outside of what lvm is designed to do.

In a normal RHEL install, the root LV is activated in the initrd (see the lvm code in dracut). Then it switches to the root fs and runs the coldplug service, which generates new uevents for each of the disks; those trigger a "pvscan --cache ..." command (run from udev rules) for each PV. Those "pvscan --cache ..." commands create temp files under /run/lvm/pvs_online/ and /run/lvm/vgs_online/. Once all the PVs for the VG are online (based on the run files), the VG is autoactivated, which covers any LVs that were not already activated in the initrd. You can read more about it in https://man7.org/linux/man-pages/man7/lvmautoactivation.7.html

You're attaching standard pvscan and vgchange -ay commands that you've run. This is very different from the pvscan/vgchange commands that are run from udev rules, which are involved in a standard RHEL system boot.
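
If it helps, one way to see exactly which pvscan command the udev rules run on a given system (rule file names differ between distros and versions, so grep rather than guessing a path):

grep -r pvscan /lib/udev/rules.d/ /etc/udev/rules.d/ 2>/dev/null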

fpc7063 commented 10 months ago

Okay, once I tried troubleshooting it from the initramfs it all made sense. First of all, it isn't related to the /run/lvm files, nor to the race condition issue.

For the sake of anyone who lands here: the issue is that Debian doesn't load the dm_integrity module by default, which is why only RAID LVs created with --raidintegrity y were failing. After landing in the initramfs, lvm vgchange -ay returns:

/sbin/modprobe failed: 1
Can't process LV os-vg/os-lv_rimage_0: integrity target support missing from kernel?
0 logical volume(s) in volume group "os-vg" now active

In my case, activating the LVM integrity volumes for the root filesystem makes it necessary to load dm_integrity in the initramfs:

echo "dm_integrity" >> /etc/initramfs-tools/modules
update-initramfs -u
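
As an optional sanity check (assuming Debian's initramfs-tools; the module file should show up as dm-integrity.ko):

lsinitramfs /boot/initrd.img-$(uname -r) | grep -i integrity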

What is somewhat unintuitive is that vgchange -ay loads the module while the udev rules don't. This isn't a problem on other distros, where dm_integrity seems to be loaded by default (at least in their ISOs). Thanks for the help and patience @teigland

zkabelac commented 3 months ago

So, closing the issue. If there is still a bug, it's likely a bug in the initramfs/dracut tooling, which needs to add all the required dm modules to the ramdisk.