rook / rook

Storage Orchestration for Kubernetes
https://rook.io
Apache License 2.0

Multisite RGW pools are not reconciled #13880

Open briend opened 3 months ago

briend commented 3 months ago

Rook: 1.13 Ceph: 18.2.1

We noticed that Rook reconciles most of our pools, but once you switch to multisite (by adding another RGW zone), it no longer reconciles any pools associated with RGW. We found that the object store controller explicitly checks for multisite and skips pool reconciliation here:

https://github.com/rook/rook/blob/release-1.13/pkg/operator/ceph/object/controller.go#L443-L450

It seems that once the zone is created, Rook will neither recreate a missing pool nor update its CRUSH rules, etc., as part of normal reconciliation:

https://github.com/rook/rook/blob/release-1.13/pkg/operator/ceph/object/zone/controller.go#L251-L282

So if you delete an RGW pool for some reason, the RGWs (not Rook) recreate it with a default CRUSH rule, etc., and Rook never manages it again. Is this an oversight, or is it extra caution when managing multisite pools?
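
To make the behavior I'm describing concrete, here is a rough Go sketch of how the skip appears to work. This is not the actual Rook code: the `objectStore` type, its fields, and `poolsToReconcile` are hypothetical names made up for illustration, and treating a non-empty zone name as "multisite" is a simplifying assumption.

```go
package main

import "fmt"

// objectStore is a hypothetical, simplified stand-in for an object store spec.
// It is NOT the real Rook type.
type objectStore struct {
	Name  string
	Zone  string   // assumption: a non-empty zone name means the store is multisite
	Pools []string // pools the store expects to exist
}

// poolsToReconcile returns the pools that the reconcile loop would create or
// update. For a multisite store it returns nothing, mirroring the skip
// described above: the pools are left alone on every reconcile.
func poolsToReconcile(s objectStore) []string {
	if s.Zone != "" {
		return nil
	}
	return s.Pools
}

func main() {
	single := objectStore{
		Name:  "store-a",
		Pools: []string{"store-a.rgw.meta", "store-a.rgw.buckets.data"},
	}
	multi := objectStore{
		Name:  "store-b",
		Zone:  "zone-b",
		Pools: []string{"zone-b.rgw.meta", "zone-b.rgw.buckets.data"},
	}

	fmt.Println(len(poolsToReconcile(single))) // prints 2
	fmt.Println(len(poolsToReconcile(multi)))  // prints 0
}
```

In this sketch, a deleted pool on the multisite store is never noticed, because the pool list is not consulted at all once a zone is set.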

BlaineEXE commented 3 months ago

I'm not sure Rook officially supports migrating from single-site to multisite right now.

@alimaredia would know more.

But I suspect the underlying issue may be that the pool configuration that was specified in your original CephObjectStore might not have been copied to the CephObjectZone when converting to multisite. Or there could be a bug with this part of the current, unsupported migration process.

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.