Open wuast94 opened 3 months ago
/var/log/ceph exists and contains the OSD logs:
[2024-03-13 13:19:12,115][ceph_volume.main][INFO ] Running command: ceph-volume lvm trigger 1-18b2426f-90d1-4992-847c-a52b7ef19dc7
[2024-03-13 13:19:12,120][ceph_volume.util.system][WARNING] Executable lvs not found on the host, will return lvs as-is
[2024-03-13 13:19:12,120][ceph_volume.process][INFO ] Running command: lvs --noheadings --readonly --separator=";" -a --units=b --nosuffix -S tags={ceph.osd_id=1,ceph.osd_fsid=18b2426f-90d1-4992-847c-a52b7ef19dc7} -o lv_tags,lv_path,lv_name,vg_name,lv_uuid,lv_size
[2024-03-13 13:19:12,151][ceph_volume.main][INFO ] Running command: ceph-volume lvm trigger 2-611efe97-8305-4a23-9559-33dd95bce599
[2024-03-13 13:19:12,154][ceph_volume.util.system][WARNING] Executable lvs not found on the host, will return lvs as-is
[2024-03-13 13:19:12,154][ceph_volume.process][INFO ] Running command: lvs --noheadings --readonly --separator=";" -a --units=b --nosuffix -S tags={ceph.osd_id=2,ceph.osd_fsid=611efe97-8305-4a23-9559-33dd95bce599} -o lv_tags,lv_path,lv_name,vg_name,lv_uuid,lv_size
[2024-03-13 13:19:12,192][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/ceph_volume/decorators.py", line 59, in newfunc
return f(*a, **kw)
^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ceph_volume/main.py", line 153, in main
terminal.dispatch(self.mapper, subcommand_args)
File "/usr/lib/python3/dist-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/main.py", line 46, in main
terminal.dispatch(self.mapper, self.argv)
File "/usr/lib/python3/dist-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3/dist-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/trigger.py", line 70, in main
Activate(['--auto-detect-objectstore', osd_id, osd_uuid]).main()
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/activate.py", line 281, in main
self.activate(args)
File "/usr/lib/python3/dist-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/activate.py", line 197, in activate
raise RuntimeError('could not find osd.%s with osd_fsid %s' %
RuntimeError: could not find osd.1 with osd_fsid 18b2426f-90d1-4992-847c-a52b7ef19dc7
[2024-03-13 13:19:12,220][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/ceph_volume/decorators.py", line 59, in newfunc
return f(*a, **kw)
^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ceph_volume/main.py", line 153, in main
terminal.dispatch(self.mapper, subcommand_args)
File "/usr/lib/python3/dist-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/main.py", line 46, in main
terminal.dispatch(self.mapper, self.argv)
File "/usr/lib/python3/dist-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3/dist-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/trigger.py", line 70, in main
Activate(['--auto-detect-objectstore', osd_id, osd_uuid]).main()
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/activate.py", line 281, in main
self.activate(args)
File "/usr/lib/python3/dist-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/activate.py", line 197, in activate
raise RuntimeError('could not find osd.%s with osd_fsid %s' %
RuntimeError: could not find osd.2 with osd_fsid 611efe97-8305-4a23-9559-33dd95bce599
[2024-03-13 13:19:12,365][ceph_volume.main][INFO ] Running command: ceph-volume lvm trigger 2-8331d767-af24-40da-bac0-ccbaf0fcda92
[2024-03-13 13:19:12,368][ceph_volume.util.system][WARNING] Executable lvs not found on the host, will return lvs as-is
[2024-03-13 13:19:12,368][ceph_volume.process][INFO ] Running command: lvs --noheadings --readonly --separator=";" -a --units=b --nosuffix -S tags={ceph.osd_id=2,ceph.osd_fsid=8331d767-af24-40da-bac0-ccbaf0fcda92} -o lv_tags,lv_path,lv_name,vg_name,lv_uuid,lv_size
[2024-03-13 13:19:12,428][ceph_volume.process][INFO ] stdout ceph.block_device=/dev/ceph-b9cc563f-5758-4ead-bbec-74c6aafb7099/osd-block-8331d767-af24-40da-bac0-ccbaf0fcda92,ceph.block_uuid=OUftbF-UGG7-RZfB-tgrn-2KtY-JJe4-5RT0jM,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=594dd1f3-8f66-4a84-bb9b-ab7b6437e739,ceph.cluster_name=ceph,ceph.crush_device_class=,ceph.encrypted=0,ceph.osd_fsid=8331d767-af24-40da-bac0-ccbaf0fcda92,ceph.osd_id=2,ceph.osdspec_affinity=,ceph.type=block,ceph.vdo=0";"/dev/ceph-b9cc563f-5758-4ead-bbec-74c6aafb7099/osd-block-8331d767-af24-40da-bac0-ccbaf0fcda92";"osd-block-8331d767-af24-40da-bac0-ccbaf0fcda92";"ceph-b9cc563f-5758-4ead-bbec-74c6aafb7099";"OUftbF-UGG7-RZfB-tgrn-2KtY-JJe4-5RT0jM";"2000364240896
[2024-03-13 13:19:12,428][ceph_volume.devices.lvm.activate][INFO ] auto detecting objectstore
[2024-03-13 13:19:12,432][ceph_volume.devices.lvm.activate][DEBUG ] Found block device (osd-block-8331d767-af24-40da-bac0-ccbaf0fcda92) with encryption: False
[2024-03-13 13:19:12,432][ceph_volume.devices.lvm.activate][DEBUG ] Found block device (osd-block-8331d767-af24-40da-bac0-ccbaf0fcda92) with encryption: False
[2024-03-13 13:19:12,432][ceph_volume.process][INFO ] Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-2
[2024-03-13 13:19:12,433][ceph_volume.process][INFO ] Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-b9cc563f-5758-4ead-bbec-74c6aafb7099/osd-block-8331d767-af24-40da-bac0-ccbaf0fcda92 --path /var/lib/ceph/osd/ceph-2 --no-mon-config
[2024-03-13 13:19:12,464][ceph_volume.process][INFO ] stderr failed to read label for /dev/ceph-b9cc563f-5758-4ead-bbec-74c6aafb7099/osd-block-8331d767-af24-40da-bac0-ccbaf0fcda92: (2) No such file or directory
2024-03-13T13:19:12.460+0100 7fb6a1a040 -1 bluestore(/dev/ceph-b9cc563f-5758-4ead-bbec-74c6aafb7099/osd-block-8331d767-af24-40da-bac0-ccbaf0fcda92) _read_bdev_label failed to open /dev/ceph-b9cc563f-5758-4ead-bbec-74c6aafb7099/osd-block-8331d767-af24-40da-bac0-ccbaf0fcda92: (2) No such file or directory
[2024-03-13 13:19:12,467][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/ceph_volume/decorators.py", line 59, in newfunc
return f(*a, **kw)
^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ceph_volume/main.py", line 153, in main
terminal.dispatch(self.mapper, subcommand_args)
File "/usr/lib/python3/dist-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/main.py", line 46, in main
terminal.dispatch(self.mapper, self.argv)
File "/usr/lib/python3/dist-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3/dist-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/trigger.py", line 70, in main
Activate(['--auto-detect-objectstore', osd_id, osd_uuid]).main()
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/activate.py", line 281, in main
self.activate(args)
File "/usr/lib/python3/dist-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/activate.py", line 205, in activate
return activate_bluestore(lvs, args.no_systemd)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ceph_volume/devices/lvm/activate.py", line 112, in activate_bluestore
process.run(prime_command)
File "/usr/lib/python3/dist-packages/ceph_volume/process.py", line 147, in run
raise RuntimeError(msg)
RuntimeError: command returned non-zero exit status: 1
More context:
lvs --version
LVM version: 2.03.16(2) (2022-05-18)
Library version: 1.02.185 (2022-05-18)
Driver version: 4.48.0
Configuration: ./configure --build=aarch64-linux-gnu --prefix=/usr --includedir=${prefix}/include --mandir=${prefix}/share/man --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var --disable-option-checking --disable-silent-rules --libdir=${prefix}/lib/aarch64-linux-gnu --runstatedir=/run --disable-maintainer-mode --disable-dependency-tracking --libdir=/lib/aarch64-linux-gnu --sbindir=/sbin --with-usrlibdir=/usr/lib/aarch64-linux-gnu --with-optimisation=-O2 --with-cache=internal --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --with-default-pid-dir=/run --with-default-run-dir=/run/lvm --with-default-locking-dir=/run/lock/lvm --with-thin=internal --with-thin-check=/usr/sbin/thin_check --with-thin-dump=/usr/sbin/thin_dump --with-thin-repair=/usr/sbin/thin_repair --with-udev-prefix=/ --enable-applib --enable-blkid_wiping --enable-cmdlib --enable-dmeventd --enable-editline --enable-lvmlockd-dlm --enable-lvmlockd-sanlock --enable-lvmpolld --enable-notify-dbus --enable-pkgconfig --enable-udev_rules --enable-udev_sync --disable-readline
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 1.8T 0 disk
nvme0n1 259:0 0 238.5G 0 disk
vgchange -ay
1 logical volume(s) in volume group "ceph-b9cc563f-5758-4ead-bbec-74c6aafb7099" now active
lsblk (after vgchange -ay)
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 1.8T 0 disk
└─ceph--b9cc563f--5758--4ead--bbec--74c6aafb7099-osd--block--8331d767--af24--40da--bac0--ccbaf0fcda92 254:0 0 1.8T 0 lvm
/var/lib/ceph/osd/ceph-2 is empty.
Found a workaround:
After a restart, executing
vgchange -ay
activates the logical volumes, and then all the automations take over.
If the restart was longer ago and the automations have already run into problems, running
ceph-volume lvm activate --all
afterwards brings the OSD back up again.
Adding
@reboot /usr/sbin/vgchange -ay >> /var/log/vgchange.log 2>&1
to my crontab fixes the issue for me.
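Instead of the @reboot crontab entry, the same workaround could also be expressed as a systemd oneshot unit ordered before the Ceph OSD units. This is only a sketch under assumptions I chose myself: the unit name, the ordering targets, and the install section are not taken from the Ceph packaging and may need adjusting to the distribution.

```ini
# /etc/systemd/system/lvm-activate-ceph.service (hypothetical unit name)
[Unit]
Description=Activate LVM logical volumes before Ceph OSDs start
# Ordering targets are assumptions; adjust to match your distribution's Ceph units.
Before=ceph-osd.target
After=local-fs.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/vgchange -ay
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

After systemctl daemon-reload and systemctl enable lvm-activate-ceph.service, this runs vgchange -ay once per boot like the crontab line does, but with explicit ordering relative to the OSD services instead of relying on cron timing.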
This workaround fixes my specific error, but I think something is off that also impacts hot-plugging etc.
I hope all my information helps to get this fixed 😊
Describe the bug
When I create a Ceph OSD it works without problems, but as soon as I reboot the node the OSD won't come back up.

To Reproduce
Steps to reproduce the behavior:
ENV (please complete the following information):
Additional context
systemctl status ceph-osd@*.service gives back nothing, and journalctl -xeu ceph-osd@2.service shows no entries either.
I double-checked that I am on the right host and using the right OSD number.
Output of the OSD install: