rook / rook

Storage Orchestration for Kubernetes
https://rook.io
Apache License 2.0

Create a boolean configuration setting to disable CRUSH rule changes for existing pools #13854

Open mohitrajain opened 2 months ago

mohitrajain commented 2 months ago

Is this a bug report or feature request?

What should the feature do: In v1.12.8, a feature was released that updates a pool's deviceClass on existing pools: "pool: allow updating deviceClass on existing pool".

This feature request proposes an additional flag that allows administrators to opt out of the automatic CRUSH rule changes introduced by that feature. Optionally, the operator could log the recommended state changes for each pool on every evaluation, providing insight into the adjustments it would make. Our clusters are dynamic, with deviceClasses added at later stages. Since the feature above was not available previously, pools were adjusted manually. We want to avoid disrupting the production cluster with object re-mapping caused by new crush rules.
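As a purely illustrative sketch, such a toggle could live on the pool spec itself. The field name below is hypothetical and does not exist in the Rook API; the other fields (failureDomain, deviceClass, replicated) are existing CephBlockPool settings:

```bash
# Hypothetical sketch only: "preserveExistingCrushRules" is NOT a real Rook field,
# it merely illustrates where an opt-out of automatic CRUSH rule updates could sit.
kubectl apply -f - <<EOF
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: rbd-hdd
  namespace: rook-ceph
spec:
  failureDomain: host
  deviceClass: hdd
  replicated:
    size: 3
  preserveExistingCrushRules: true  # hypothetical toggle, not part of the CRD today
EOF
```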

What is the use case behind this feature: After updating our staging Rook cluster to v1.12.8, which had undergone manual pool modifications in the past, we encountered some unexpected changes:

2024-03-01 11:20:10.879837 I | cephclient: creating a new crush rule for changed deviceClass on crush rule "rbd-hdd-new"
2024-03-01 11:20:10.879851 I | cephclient: updating pool ".mgr" failure domain from "host" to "host" with new crush rule ".mgr_host_hdd"
2024-03-01 11:20:10.879855 I | cephclient: crush rule "rbd-hdd-new" will no longer be used by pool ".mgr"
- A new rule was created even though the hdd deviceClass was specified, due to the bug fix detailed [here](https://github.com/rook/rook/pull/13772)
---
2024-03-01 11:20:56.454723 I | cephclient: creating a new crush rule for changed deviceClass on crush rule "rbd-ssd-new"
2024-03-01 11:20:56.454740 I | cephclient: updating pool "rbd-ssd" failure domain from "host" to "host" with new crush rule "rbd-ssd_host_ssd"
2024-03-01 11:20:56.454744 I | cephclient: crush rule "rbd-ssd-new" will no longer be used by pool "rbd-ssd"
- A new rule was created even though the ssd deviceClass was specified, due to the bug fix detailed [here](https://github.com/rook/rook/pull/13772)
---
2024-03-01 11:20:28.973957 I | cephclient: creating a new crush rule for changed deviceClass on crush rule "rbd-hdd-new"
2024-03-01 11:20:28.973972 I | cephclient: updating pool "rbd-hdd" failure domain from "host" to "host" with new crush rule "rbd-hdd_host"
2024-03-01 11:20:28.973976 I | cephclient: crush rule "rbd-hdd-new" will no longer be used by pool "rbd-hdd"
- A new rule was created because no deviceClass was specified for this pool; it had been set to the hdd class manually, so the operator switched it back to the default deviceClass, i.e. a mix of both classes
--- other logs have been omitted 

Messages like those above were observed for all the pools in our cluster. We had initially omitted the device class for some pools, and those pools were affected as indicated by the logs.

The above crush rule changes led to data being re-mapped across the cluster. It is worth noting that this behavior has since been addressed in v1.13.5 by "pool: Skip crush rule update when not needed."
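For reference, a minimal sketch of auditing which CRUSH rule each pool uses before and after the operator reconciles (assuming the default rook-ceph-tools toolbox deployment in the rook-ceph namespace; pool and rule names are taken from the logs above):

```bash
# Run Ceph commands through the Rook toolbox pod.
TOOLS="kubectl -n rook-ceph exec deploy/rook-ceph-tools --"

# List all CRUSH rules known to the cluster.
$TOOLS ceph osd crush rule ls

# Show which rule a given pool currently uses.
$TOOLS ceph osd pool get rbd-hdd crush_rule
$TOOLS ceph osd pool get rbd-ssd crush_rule

# Dump a rule to see the root, device class, and failure domain it selects.
$TOOLS ceph osd crush rule dump rbd-hdd_host
```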

We want to avoid unintended effects on existing pools while we do not yet fully understand the implications for their crush rules. Logging the intended changes, together with a toggle flag, would let us make more informed decisions.

Environment: mixed device class environment.


subhamkrai commented 2 months ago

@mohitrajain Since the issue is fixed in the latest version as you mentioned, why not upgrade to the fixed version?

mohitrajain commented 2 months ago

@subhamkrai We need to evaluate the new version in our staging cluster. However, before applying changes to the production cluster, having this feature would be beneficial. It would enable us to preserve manual adjustments on specific pools.

anthonyeleven commented 2 months ago

@subhamkrai It is indeed nice that this has been fixed. One open question is how many Rook releases can be skipped at a time.

But as to why we ask for this setting -- this particular situation may be addressed, but what about the ones we haven't hit yet? Especially if future Rook releases grow additional actions on existing pools or other resources.

Why had we made manual changes in the first place?

OBTW, Mohit is my protege; we work together.

travisn commented 2 months ago

@anthonyeleven So you're hitting this issue in v1.12.8, and you would prefer to get the fix in a v1.12.x release before you upgrade to v1.13? We could likely backport #13772 to v1.12 and make that happen.

anthonyeleven commented 2 months ago

No, not asking for a backport. The 1.13.x fix should suffice if we can skip 1.12.8. The cluster where this matters for us currently runs Rook v1.10.5. Can we go straight to 1.13.x from this release?

The RFE to completely disable modifying existing pools is intended to forestall any future rude awakenings when new functionality is added.

travisn commented 2 months ago

In the upgrade guide it is recommended not to skip minor Rook releases. It might work, but I'd recommend testing the upgrade yourself, since our testing does not cover skipping minor releases. Instead, you might consider: v1.10.5 --> v1.11.x --> v1.12.7 --> v1.13.x
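A small sketch of verifying the currently deployed operator version and cluster health before each hop (assuming the default rook-ceph namespace and resource names):

```bash
# Current operator image, i.e. the Rook version you are upgrading from.
kubectl -n rook-ceph get deployment rook-ceph-operator \
  -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'

# Confirm the CephCluster is healthy before moving on to the next minor release.
kubectl -n rook-ceph get cephcluster rook-ceph \
  -o jsonpath='{.status.ceph.health}{"\n"}'
```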

anthonyeleven commented 2 months ago

Thanks!
We're testing upgrades in a small lab cluster, but we can only align it so closely with prod. v1.10.5 --> v1.11.x --> v1.12.7 --> v1.13.x should be feasible for us.

I didn't want to make assumptions about the .. releases

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.