rook / rook

Storage Orchestration for Kubernetes
https://rook.io
Apache License 2.0

Rook still tries to re-add a removed OSD, making new OSD additions fail #7968

Closed: sfxworks closed this issue 2 years ago

sfxworks commented 3 years ago

Is this a bug report or feature request?

Deviation from expected behavior:

Rook continues to add a deployment for an OSD that was removed manually per the guide at https://github.com/rook/rook/blob/master/Documentation/ceph-osd-mgmt.md#purge-the-osd-manually

Expected behavior:

Rook does not remake an OSD deployment for a disk that was purged from the cluster while removeOSDsIfOutAndSafeToRemove is set to true.

How to reproduce it (minimal and precise):

1) Opt to remove an OSD manually and follow the instructions in the doc (a sketch of the purge flow follows this list).
2) Restart the node / mount the disk with a random filesystem.
3) Note that Rook remakes the OSD deployment.
4) Note that when adding a new node, the OSD count starts at the removed number, so the keys do not match. This currently blocks me from adding new OSDs to the cluster.
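For context, the kind of flow that guide describes comes down to a handful of toolbox and kubectl commands. A minimal sketch, assuming OSD ID 4 and the default rook-ceph namespace (both placeholders for illustration):

```sh
# Stop the OSD pod so Ceph marks the OSD down
kubectl -n rook-ceph scale deployment rook-ceph-osd-4 --replicas=0

# From the toolbox: take the OSD out and let data rebalance off it
ceph osd out osd.4

# Purge the OSD (removes it from the CRUSH map and deletes its auth entry)
ceph osd purge 4 --yes-i-really-mean-it

# Finally remove the now-orphaned deployment
kubectl -n rook-ceph delete deployment rook-ceph-osd-4
```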

File(s) to submit:

New prepare job for the new node: https://pastebin.com/yu3h4SZm (it prepares one disk and then errors because of the key issue). On the next run it just skips everything, leaving the node only partially prepared: https://pastebin.com/3U3xCKSK

[root@pfdc-store-2 ~]# lsblk
NAME                                                                                                                  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                                                                                                                     8:0    0   7.3T  0 disk
sdb                                                                                                                     8:16   0   7.3T  0 disk
sdc                                                                                                                     8:32   0   7.3T  0 disk
└─ceph--block--66b2e915--19a2--4474--9534--d87bb199eeb2-osd--block--8597ddbf--e2a6--4fd7--85f0--7227515d10ac          254:0    0   7.3T  0 lvm
sdd                                                                                                                     8:48   0 232.9G  0 disk
├─sdd1                                                                                                                  8:49   0     1G  0 part /boot
├─sdd2                                                                                                                  8:50   0  32.1G  0 part
└─sdd3                                                                                                                  8:51   0 199.8G  0 part /
sde                                                                                                                     8:64   0 953.9G  0 disk
sdf                                                                                                                     8:80   0 232.9G  0 disk
nvme1n1                                                                                                               259:0    0 465.8G  0 disk
└─ceph--block--dbs--b1a6209c--59f4--4d74--b1a6--fe87e43e0b8c-osd--block--db--cdb083ae--97a5--4fb8--b797--262520dc3c03 254:1    0  93.1G  0 lvm
nvme0n1                                                                                                               259:1    0 465.8G  0 disk

Environment:

travisn commented 3 years ago

@sfxworks Since you have useAllDevices: true and useAllNodes: true, Rook will always scan all the devices and attempt to start OSDs on them.

Did you delete the data from the disk after the OSD was removed? If not, Rook will see that the disk was previously configured and attempt to start the same OSD again. It will remain in a failed state, however, since the OSD auth was removed. So it is recommended to clean or remove the disk after purging the OSD, or else update the device filter to exclude that disk.
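For reference, wiping a retired device so Rook sees it as a fresh disk usually looks something like the rough sketch below, assuming the purged OSD's data device is /dev/sdX and that it was a ceph-volume/LVM OSD (double-check the device name before running anything destructive):

```sh
DISK="/dev/sdX"   # placeholder for the device that backed the purged OSD

# Wipe partition tables and the start of the disk
sgdisk --zap-all "$DISK"
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync

# Clean up leftover ceph-volume LVM state on the host
ls /dev/mapper/ceph-* 2>/dev/null | xargs -I% -- dmsetup remove %
rm -rf /dev/ceph-*
```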

sfxworks commented 3 years ago

The disk was removed and reformatted long before this new node was added. I was also referencing that doc in the reproduction steps. The exception is that the disk is still in the node, but since Rook skips devices that are already formatted, I would expect it to skip this one even though it was previously an OSD.
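One quick way to confirm what the prepare job will see on the reformatted device is to check it for existing filesystem signatures on the node itself; a small sketch, with /dev/sdX standing in for the disk that used to back the OSD:

```sh
# Show any filesystem the device now carries (a foreign FS is what makes Rook skip it)
lsblk -f /dev/sdX

# List filesystem/RAID/LVM signatures without erasing anything
wipefs --no-act /dev/sdX
```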

travisn commented 3 years ago

@sfxworks After you purge the bad OSD, you don't see that OSD's auth from the toolbox with ceph auth ls, right?

From this message, there is certainly something left over from the prior OSD:

entity osd.4 exists but key does not match
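For reference, the leftover entity can be inspected and removed from the toolbox. A minimal sketch, assuming the stale ID is osd.4 as in the error above:

```sh
# List the auth entries Ceph still knows about and query the suspect one
ceph auth ls | grep -A3 'osd.4'
ceph auth get osd.4

# If the entry is stale, delete it so a fresh OSD can register under that ID
ceph auth del osd.4

# Confirm the OSD itself is gone from the OSD map / CRUSH tree
ceph osd tree
```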
travisn commented 3 years ago

I was not able to repro this in a test cluster. After purging an OSD, a new OSD could be created with the same ID. It is expected that Ceph re-uses OSD IDs after they are purged. If you see that everything was purged as expected, could you please try the latest Rook v1.6 and Ceph v16 releases to see if that helps?

sfxworks commented 3 years ago

I should also mention there was a metadata device associated with the OSD. Unfortunately I cannot test an upgrade, as I have switched storage providers. The intent was to report everything I could here before the migration.

sethjones commented 3 years ago

I am seeing a similar issue. @travisn directed me to this issue.

I have previously purged/removed osd.5, and it continues to fail to be reused with new disks.

Always giving me:

debug 2021-06-01T18:00:37.831+0000 7f68beecdf40 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-5/keyring: (2) No such file or directory

debug 2021-06-01T18:00:37.831+0000 7f68beecdf40 -1 AuthRegistry(0x5612b4582940) no keyring found at /var/lib/ceph/osd/ceph-5/keyring, disabling cephx

Prep steps:

1) Verify OSD removal in 'ceph auth' and 'ceph osd' via ceph-tools.
2) Remove the OSD deployment within Kubernetes (see the sketch below).
3) Disk detection and creation appear to work normally; however, the OSD deployment fails due to the above error.

Whether using a new disk, or a manually cleaned disk, my results are always the same for the disk trying to use the osd.5 id.
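For anyone retracing step 2 above, a small sketch of the Kubernetes side, assuming the default rook-ceph namespace and the conventional rook-ceph-osd-5 deployment name (adjust for your cluster):

```sh
# Inspect the failing OSD pod's logs for the keyring error shown above
kubectl -n rook-ceph logs deploy/rook-ceph-osd-5

# Remove the stale OSD deployment so the operator can recreate it cleanly
kubectl -n rook-ceph delete deployment rook-ceph-osd-5

# Optionally restart the operator to trigger a fresh OSD reconcile
kubectl -n rook-ceph rollout restart deploy/rook-ceph-operator
```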

sethjones commented 3 years ago

My issue has been resolved. Removing the resource constraints I had placed allowed disk creation to function normally.
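If others hit the same symptom, it may be worth checking whether the prepare or OSD pods are being throttled, evicted, or left unschedulable by tight resource limits. A rough sketch, assuming the default rook-ceph namespace and the usual app=rook-ceph-osd-prepare label:

```sh
# Look for OOMKilled containers or pods stuck Pending on resources
kubectl -n rook-ceph get pods -l app=rook-ceph-osd-prepare
kubectl -n rook-ceph describe pods -l app=rook-ceph-osd-prepare | grep -iE 'oomkilled|insufficient|limit'

# Recent events often show evictions or scheduling failures
kubectl -n rook-ceph get events --sort-by=.lastTimestamp | tail -n 20
```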

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 2 years ago

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.