rook / rook

Storage Orchestration for Kubernetes
https://rook.io
Apache License 2.0

After the specified OSD is removed, the OSD deployment is still started (but fails). #14621

Open Xunop opened 1 month ago

Xunop commented 1 month ago

Is this a bug report or feature request?

Deviation from expected behavior:

After I follow the steps in the Rook documentation for removing an OSD, Rook still starts a deployment for the removed OSD. I removed only the OSD; I did not remove the disk from the server.

Expected behavior:

When I remove an OSD from the server but leave the disk in place, Rook should recognize that I no longer need this OSD and should not attempt to start its deployment.

How to reproduce it (minimal and precise):

  1. Follow the Rook quickstart guide to set up Rook.
  2. Add an OSD and, once its pod has started, remove it by following the OSD removal documentation.
  3. Check the pod status and observe that a deployment for the deleted OSD is still created (and then fails).

More information:

I spent some time going through the source code, and this behavior traces back to this code: https://github.com/rook/rook/blob/master/pkg/daemon/ceph/osd/volume.go#L107. Since I only removed an OSD and did not add a new one, the getAvailableDevices function skips the devices, so an empty device array is passed to the configureCVDevices function. That function retrieves OSD information with the Ceph command ceph-volume lvm|raw list, which still reports all OSDs because Ceph identifies them by their on-disk block signatures; as a result, it also returns OSDs that have already been deleted. Perhaps we should exclude the deleted OSDs from the list Ceph returns, or consider wiping the Ceph signature when an OSD is removed?
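
As an illustration of the first suggestion, here is a hypothetical sketch (not Rook's actual code) that filters the JSON produced by ceph-volume raw list so that entries for already-purged OSD IDs are dropped before any deployment is created. The rawOSD struct mirrors the keys visible in the prepare-pod log below; filterPurged and the purged-ID set are made-up names for illustration, and in practice the set of removed OSDs would have to come from the operator (for example, from the purge-OSD job).

// Hypothetical sketch, not Rook's actual code: drop `ceph-volume raw list`
// entries whose OSD IDs were already purged from the cluster, so the
// orchestrator does not recreate deployments for them.
package main

import (
	"encoding/json"
	"fmt"
)

// rawOSD mirrors the JSON keys emitted by `ceph-volume raw list --format json`.
type rawOSD struct {
	CephFSID string `json:"ceph_fsid"`
	Device   string `json:"device"`
	OSDID    int    `json:"osd_id"`
	OSDUUID  string `json:"osd_uuid"`
	Type     string `json:"type"`
}

// filterPurged keeps only the OSDs whose IDs are not in the purged set.
func filterPurged(raw map[string]rawOSD, purged map[int]bool) map[string]rawOSD {
	kept := map[string]rawOSD{}
	for uuid, osd := range raw {
		if purged[osd.OSDID] {
			continue // signature still on disk, but the OSD no longer exists in the cluster
		}
		kept[uuid] = osd
	}
	return kept
}

func main() {
	// Example data shaped like the `ceph-volume raw list` output in the log below.
	listing := []byte(`{
	  "0e0eaea9-7d58-4a7f-b1a8-712f0523e718": {"ceph_fsid":"e68db21e-a173-4637-9d5f-4bb9476cb93f","device":"/dev/loop1","osd_id":0,"osd_uuid":"0e0eaea9-7d58-4a7f-b1a8-712f0523e718","type":"bluestore"},
	  "2c5e7746-79f4-4498-90ee-654f56553e77": {"ceph_fsid":"e68db21e-a173-4637-9d5f-4bb9476cb93f","device":"/dev/loop0","osd_id":1,"osd_uuid":"2c5e7746-79f4-4498-90ee-654f56553e77","type":"bluestore"}
	}`)

	var raw map[string]rawOSD
	if err := json.Unmarshal(listing, &raw); err != nil {
		panic(err)
	}

	// Pretend OSD 0 (/dev/loop1) was removed via the documented purge steps.
	kept := filterPurged(raw, map[int]bool{0: true})
	fmt.Printf("%d OSD(s) left to configure\n", len(kept)) // prints: 1 OSD(s) left to configure
}

Run against the listing from the log below, only /dev/loop0 (OSD 1) would remain, which is what I would expect after removing OSD 0.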

OSD prepare pod log:

After deleting the OSD on /dev/loop1:

Defaulted container "provision" out of: provision, copy-bins (init)
2024/08/21 07:22:25 maxprocs: Leaving GOMAXPROCS=20: CPU quota undefined
2024-08-21 07:22:25.236739 I | cephcmd: desired devices to configure osds: [{Name:loop0 OSDsPerDevice:1 MetadataDevice: DatabaseSizeMB:0 DeviceClass: InitialWeight: IsFilter:false IsDevicePathFilter:false}]
2024-08-21 07:22:25.236953 I | rookcmd: starting Rook v1.15.0-alpha.0.30.g5f98d2ea3-dirty with arguments '/rook/rook ceph osd provision'
2024-08-21 07:22:25.236955 I | rookcmd: flag values: --cluster-id=ce91b018-c0e9-4508-871c-b408ef13015c, --cluster-name=rook-ceph, --data-device-filter=, --data-device-path-filter=, --data-devices=[{"id":"loop0","storeConfig":{"osdsPerDevice":1}}], --encrypted-device=false, --force-format=false, --help=false, --location=, --log-level=DEBUG, --metadata-device=, --node-name=minikube, --osd-crush-device-class=, --osd-crush-initial-weight=, --osd-database-size=0, --osd-store-type=bluestore, --osd-wal-size=576, --osds-per-device=1, --pvc-backed-osd=false, --replace-osd=-1
2024-08-21 07:22:25.236957 I | ceph-spec: parsing mon endpoints: a=10.98.108.144:6789
2024-08-21 07:22:25.239852 I | op-osd: CRUSH location=root=default host=minikube
2024-08-21 07:22:25.239855 I | cephcmd: crush location of osd: root=default host=minikube
2024-08-21 07:22:25.240559 D | cephclient: No ceph configuration override to merge as "rook-config-override" configmap is empty
2024-08-21 07:22:25.240565 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2024-08-21 07:22:25.240595 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2024-08-21 07:22:25.240637 D | cephclient: config file @ /etc/ceph/ceph.conf:
[global]
fsid                = e68db21e-a173-4637-9d5f-4bb9476cb93f
mon initial members = a
mon host            = [v2:10.98.108.144:3300,v1:10.98.108.144:6789]

[client.admin]
keyring = /var/lib/rook/rook-ceph/client.admin.keyring
2024-08-21 07:22:25.240642 D | exec: Running command: dmsetup version
2024-08-21 07:22:25.242454 I | cephosd: Library version:   1.02.197 (2023-11-21)
Driver version:    4.48.0
2024-08-21 07:22:25.246161 I | cephosd: discovering hardware
2024-08-21 07:22:25.246169 D | exec: Running command: lsblk --all --noheadings --list --output KNAME
2024-08-21 07:22:25.248818 D | exec: Running command: lsblk /dev/loop0 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME,MOUNTPOINT,FSTYPE
2024-08-21 07:22:25.251376 D | sys: lsblk output: "SIZE=\"10737418240\" ROTA=\"0\" RO=\"0\" TYPE=\"loop\" PKNAME=\"\" NAME=\"/dev/loop0\" KNAME=\"/dev/loop0\" MOUNTPOINT=\"\" FSTYPE=\"\""
2024-08-21 07:22:25.251389 D | exec: Running command: udevadm info --query=property /dev/loop0
2024-08-21 07:22:25.256665 D | sys: udevadm info output: "DEVPATH=/devices/virtual/block/loop0\nDEVNAME=/dev/loop0\nDEVTYPE=disk\nDISKSEQ=3\nMAJOR=7\nMINOR=0\nSUBSYSTEM=block"
2024-08-21 07:22:25.256678 D | exec: Running command: lsblk /dev/loop1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME,MOUNTPOINT,FSTYPE
2024-08-21 07:22:25.258417 D | sys: lsblk output: "SIZE=\"10737418240\" ROTA=\"0\" RO=\"0\" TYPE=\"loop\" PKNAME=\"\" NAME=\"/dev/loop1\" KNAME=\"/dev/loop1\" MOUNTPOINT=\"\" FSTYPE=\"\""
2024-08-21 07:22:25.258427 D | exec: Running command: udevadm info --query=property /dev/loop1
2024-08-21 07:22:25.260859 D | sys: udevadm info output: "DEVPATH=/devices/virtual/block/loop1\nDEVNAME=/dev/loop1\nDEVTYPE=disk\nDISKSEQ=5\nMAJOR=7\nMINOR=1\nSUBSYSTEM=block"
2024-08-21 07:22:25.260872 D | exec: Running command: lsblk /dev/nvme0n1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME,MOUNTPOINT,FSTYPE
2024-08-21 07:22:25.262892 D | sys: lsblk output: "SIZE=\"1024209543168\" ROTA=\"0\" RO=\"0\" TYPE=\"disk\" PKNAME=\"\" NAME=\"/dev/nvme0n1\" KNAME=\"/dev/nvme0n1\" MOUNTPOINT=\"\" FSTYPE=\"\""
2024-08-21 07:22:25.262920 D | exec: Running command: sgdisk --print /dev/nvme0n1
2024-08-21 07:22:25.265376 D | exec: Running command: udevadm info --query=property /dev/nvme0n1
2024-08-21 07:22:25.268356 D | sys: udevadm info output: "DEVPATH=/devices/pci0000:00/0000:00:06.0/0000:01:00.0/nvme/nvme0/nvme0n1\nDEVNAME=/dev/nvme0n1\nDEVTYPE=disk\nDISKSEQ=1\nMAJOR=259\nMINOR=0\nSUBSYSTEM=block"
2024-08-21 07:22:25.268367 D | exec: Running command: lsblk --noheadings --path --list --output NAME /dev/nvme0n1
2024-08-21 07:22:25.269449 I | inventory: skipping device "nvme0n1" because it has child, considering the child instead.
2024-08-21 07:22:25.269456 D | exec: Running command: lsblk /dev/nvme0n1p1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME,MOUNTPOINT,FSTYPE
2024-08-21 07:22:25.271454 D | sys: lsblk output: "SIZE=\"1071644672\" ROTA=\"0\" RO=\"0\" TYPE=\"part\" PKNAME=\"/dev/nvme0n1\" NAME=\"/dev/nvme0n1p1\" KNAME=\"/dev/nvme0n1p1\" MOUNTPOINT=\"\" FSTYPE=\"\""
2024-08-21 07:22:25.271463 D | exec: Running command: udevadm info --query=property /dev/nvme0n1p1
2024-08-21 07:22:25.274367 D | sys: udevadm info output: "DEVPATH=/devices/pci0000:00/0000:00:06.0/0000:01:00.0/nvme/nvme0/nvme0n1/nvme0n1p1\nDEVNAME=/dev/nvme0n1p1\nDEVTYPE=partition\nDISKSEQ=1\nPARTN=1\nMAJOR=259\nMINOR=1\nSUBSYSTEM=block"
2024-08-21 07:22:25.274375 D | exec: Running command: lsblk /dev/nvme0n1p2 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME,MOUNTPOINT,FSTYPE
2024-08-21 07:22:25.276373 D | sys: lsblk output: "SIZE=\"1023136497664\" ROTA=\"0\" RO=\"0\" TYPE=\"part\" PKNAME=\"/dev/nvme0n1\" NAME=\"/dev/nvme0n1p2\" KNAME=\"/dev/nvme0n1p2\" MOUNTPOINT=\"/var/lib/ceph/crash\" FSTYPE=\"\""
2024-08-21 07:22:25.276386 D | exec: Running command: udevadm info --query=property /dev/nvme0n1p2
2024-08-21 07:22:25.278996 D | sys: udevadm info output: "DEVPATH=/devices/pci0000:00/0000:00:06.0/0000:01:00.0/nvme/nvme0/nvme0n1/nvme0n1p2\nDEVNAME=/dev/nvme0n1p2\nDEVTYPE=partition\nDISKSEQ=1\nPARTN=2\nMAJOR=259\nMINOR=2\nSUBSYSTEM=block"
2024-08-21 07:22:25.279001 D | inventory: discovered disks are:
2024-08-21 07:22:25.279015 D | inventory: &{Name:loop0 Parent: HasChildren:false DevLinks: Size:10737418240 UUID: Serial: Type:loop Rotational:false Readonly:false Partitions:[] Filesystem: Mountpoint: Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/loop0 KernelName:loop0 Encrypted:false}
2024-08-21 07:22:25.279022 D | inventory: &{Name:loop1 Parent: HasChildren:false DevLinks: Size:10737418240 UUID: Serial: Type:loop Rotational:false Readonly:false Partitions:[] Filesystem: Mountpoint: Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/loop1 KernelName:loop1 Encrypted:false}
2024-08-21 07:22:25.279027 D | inventory: &{Name:nvme0n1p1 Parent:nvme0n1 HasChildren:false DevLinks: Size:1071644672 UUID: Serial: Type:part Rotational:false Readonly:false Partitions:[] Filesystem: Mountpoint: Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/nvme0n1p1 KernelName:nvme0n1p1 Encrypted:false}
2024-08-21 07:22:25.279032 D | inventory: &{Name:nvme0n1p2 Parent:nvme0n1 HasChildren:false DevLinks: Size:1023136497664 UUID: Serial: Type:part Rotational:false Readonly:false Partitions:[] Filesystem: Mountpoint:crash Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/nvme0n1p2 KernelName:nvme0n1p2 Encrypted:false}
2024-08-21 07:22:25.279035 I | cephosd: creating and starting the osds
2024-08-21 07:22:25.279040 D | cephosd: desiredDevices are [{Name:loop0 OSDsPerDevice:1 MetadataDevice: DatabaseSizeMB:0 DeviceClass: InitialWeight: IsFilter:false IsDevicePathFilter:false}]
2024-08-21 07:22:25.279042 D | cephosd: context.Devices are:
2024-08-21 07:22:25.279047 D | cephosd: &{Name:loop0 Parent: HasChildren:false DevLinks: Size:10737418240 UUID: Serial: Type:loop Rotational:false Readonly:false Partitions:[] Filesystem: Mountpoint: Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/loop0 KernelName:loop0 Encrypted:false}
2024-08-21 07:22:25.279052 D | cephosd: &{Name:loop1 Parent: HasChildren:false DevLinks: Size:10737418240 UUID: Serial: Type:loop Rotational:false Readonly:false Partitions:[] Filesystem: Mountpoint: Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/loop1 KernelName:loop1 Encrypted:false}
2024-08-21 07:22:25.279058 D | cephosd: &{Name:nvme0n1p1 Parent:nvme0n1 HasChildren:false DevLinks: Size:1071644672 UUID: Serial: Type:part Rotational:false Readonly:false Partitions:[] Filesystem: Mountpoint: Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/nvme0n1p1 KernelName:nvme0n1p1 Encrypted:false}
2024-08-21 07:22:25.279067 D | cephosd: &{Name:nvme0n1p2 Parent:nvme0n1 HasChildren:false DevLinks: Size:1023136497664 UUID: Serial: Type:part Rotational:false Readonly:false Partitions:[] Filesystem: Mountpoint:crash Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/nvme0n1p2 KernelName:nvme0n1p2 Encrypted:false}
2024-08-21 07:22:25.279069 I | cephosd: old lsblk can't detect bluestore signature, so try to detect here
2024-08-21 07:22:25.279139 I | cephosd: skipping device "loop0", detected an existing OSD. UUID=2c5e7746-79f4-4498-90ee-654f56553e77
2024-08-21 07:22:25.279142 I | cephosd: old lsblk can't detect bluestore signature, so try to detect here
2024-08-21 07:22:25.279154 I | cephosd: skipping device "loop1", detected an existing OSD. UUID=0e0eaea9-7d58-4a7f-b1a8-712f0523e718
2024-08-21 07:22:25.279157 I | cephosd: old lsblk can't detect bluestore signature, so try to detect here
2024-08-21 07:22:25.279253 D | exec: Running command: udevadm info --query=property /dev/nvme0n1p1
2024-08-21 07:22:25.281154 D | sys: udevadm info output: "DEVPATH=/devices/pci0000:00/0000:00:06.0/0000:01:00.0/nvme/nvme0/nvme0n1/nvme0n1p1\nDEVNAME=/dev/nvme0n1p1\nDEVTYPE=partition\nDISKSEQ=1\nPARTN=1\nMAJOR=259\nMINOR=1\nSUBSYSTEM=block"
2024-08-21 07:22:25.281162 D | exec: Running command: lsblk /dev/nvme0n1p1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME,MOUNTPOINT,FSTYPE
2024-08-21 07:22:25.283108 D | sys: lsblk output: "SIZE=\"1071644672\" ROTA=\"0\" RO=\"0\" TYPE=\"part\" PKNAME=\"/dev/nvme0n1\" NAME=\"/dev/nvme0n1p1\" KNAME=\"/dev/nvme0n1p1\" MOUNTPOINT=\"\" FSTYPE=\"\""
2024-08-21 07:22:25.283117 D | exec: Running command: ceph-volume inventory --format json /dev/nvme0n1p1
2024-08-21 07:22:25.438672 I | cephosd: skipping device "nvme0n1p1": ["Has a FileSystem", "Insufficient space (<5GB)"].
2024-08-21 07:22:25.438683 I | cephosd: skipping device "nvme0n1p2" with mountpoint "crash"
2024-08-21 07:22:25.440884 I | cephosd: configuring osd devices: {"Entries":{}}
2024-08-21 07:22:25.440890 I | cephosd: no new devices to configure. returning devices already configured with ceph-volume.
2024-08-21 07:22:25.441006 D | exec: Running command: stdbuf -oL ceph-volume --log-path /tmp/ceph-log lvm list  --format json
2024-08-21 07:22:25.546237 D | cephosd: {}
2024-08-21 07:22:25.546253 I | cephosd: 0 ceph-volume lvm osd devices configured on this node
2024-08-21 07:22:25.546264 D | exec: Running command: stdbuf -oL ceph-volume --log-path /tmp/ceph-log raw list --format json
2024-08-21 07:22:25.717331 D | cephosd: {
    "0e0eaea9-7d58-4a7f-b1a8-712f0523e718": {
        "ceph_fsid": "e68db21e-a173-4637-9d5f-4bb9476cb93f",
        "device": "/dev/loop1",
        "osd_id": 0,
        "osd_uuid": "0e0eaea9-7d58-4a7f-b1a8-712f0523e718",
        "type": "bluestore"
    },
    "2c5e7746-79f4-4498-90ee-654f56553e77": {
        "ceph_fsid": "e68db21e-a173-4637-9d5f-4bb9476cb93f",
        "device": "/dev/loop0",
        "osd_id": 1,
        "osd_uuid": "2c5e7746-79f4-4498-90ee-654f56553e77",
        "type": "bluestore"
    }
}
2024-08-21 07:22:25.717383 D | exec: Running command: lsblk /dev/loop1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME,MOUNTPOINT,FSTYPE
2024-08-21 07:22:25.719674 D | sys: lsblk output: "SIZE=\"10737418240\" ROTA=\"0\" RO=\"0\" TYPE=\"loop\" PKNAME=\"\" NAME=\"/dev/loop1\" KNAME=\"/dev/loop1\" MOUNTPOINT=\"\" FSTYPE=\"\""
2024-08-21 07:22:25.719686 I | cephosd: setting device class "ssd" for device "/dev/loop1"
2024-08-21 07:22:25.719694 D | exec: Running command: lsblk /dev/loop0 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME,MOUNTPOINT,FSTYPE
2024-08-21 07:22:25.721669 D | sys: lsblk output: "SIZE=\"10737418240\" ROTA=\"0\" RO=\"0\" TYPE=\"loop\" PKNAME=\"\" NAME=\"/dev/loop0\" KNAME=\"/dev/loop0\" MOUNTPOINT=\"\" FSTYPE=\"\""
2024-08-21 07:22:25.721679 I | cephosd: setting device class "ssd" for device "/dev/loop0"
2024-08-21 07:22:25.721687 I | cephosd: 2 ceph-volume raw osd devices configured on this node
2024-08-21 07:22:25.721699 I | cephosd: devices = [{ID:0 Cluster:ceph UUID:0e0eaea9-7d58-4a7f-b1a8-712f0523e718 DevicePartUUID: DeviceClass:ssd BlockPath:/dev/loop1 MetadataPath: WalPath: SkipLVRelease:true Location:root=default host=minikube LVBackedPV:false CVMode:raw Store:bluestore TopologyAffinity: Encrypted:false ExportService:false NodeName: PVCName:} {ID:1 Cluster:ceph UUID:2c5e7746-79f4-4498-90ee-654f56553e77 DevicePartUUID: DeviceClass:ssd BlockPath:/dev/loop0 MetadataPath: WalPath: SkipLVRelease:true Location:root=default host=minikube LVBackedPV:false CVMode:raw Store:bluestore TopologyAffinity: Encrypted:false ExportService:false NodeName: PVCName:}]
travisn commented 1 month ago

This is a tricky issue for a few reasons:

So the recommended approach currently is to change the device filter (or other storage settings) and purge the disk manually, so that Rook doesn't try to add the OSD back. Not ideal, but it's a trade-off between automation and manual intervention, and never wanting to lose data accidentally.
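
For completeness, here is a minimal sketch of what "purge the disk manually" can amount to, assuming /dev/loop1 is the device whose OSD was removed (an assumption for illustration): zeroing the start of the block device clears the bluestore label that ceph-volume raw list keys on, so the prepare job stops reporting the old OSD. This is destructive and only illustrative; following the zap-device steps in the Rook documentation is the safer route.

// Illustrative only and destructive: overwrite the start of a block device so
// that its bluestore signature is no longer detected. Assumes /dev/loop1 is the
// disk whose OSD was already removed from the cluster.
package main

import (
	"log"
	"os"
)

func main() {
	const device = "/dev/loop1" // example device; adjust before running

	f, err := os.OpenFile(device, os.O_WRONLY, 0)
	if err != nil {
		log.Fatalf("open %s: %v", device, err)
	}
	defer f.Close()

	// The bluestore label sits in the first blocks of the device; zeroing the
	// first 1 MiB is enough for `ceph-volume raw list` to stop reporting it.
	zero := make([]byte, 1<<20)
	if _, err := f.Write(zero); err != nil {
		log.Fatalf("wipe %s: %v", device, err)
	}
	log.Printf("cleared leading 1 MiB of %s", device)
}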