rook / rook

Storage Orchestration for Kubernetes
https://rook.io
Apache License 2.0

Creating a cluster never succeeds if CSI driver is disabled #14123

Closed travisn closed 3 weeks ago

travisn commented 3 weeks ago

Is this a bug report or feature request?

Deviation from expected behavior: Creating an issue from the discussion on #14089...

The cephcluster reconcile continuously fails if the csi driver is disabled with this setting:

ROOK_CSI_DISABLE_DRIVER: "true"

The reconcile will continuously fail with this error in the operator log:

2024-04-23 21:12:09.518716 E | ceph-cluster-controller: failed to reconcile CephCluster "rook-ceph/my-cluster". 
failed to reconcile cluster "my-cluster": failed to configure local ceph cluster: failed to create cluster: 
failed to start ceph monitors: failed to initialize ceph cluster info: failed to save mons: failed to update csi cluster config: 
waiting for CSI config map to be created: configmaps "rook-ceph-csi-config" not found

Expected behavior: The reconcile should go ahead and create the configmap to allow the reconcile to continue even if the CSI driver is not yet created.

How to reproduce it (minimal and precise):

  1. In operator.yaml, set ROOK_CSI_DISABLE_DRIVER: "true"
  2. kubectl create -f crds.yaml -f common.yaml -f operator.yaml
  3. kubectl create -f cluster-test.yaml

travisn commented 3 weeks ago

Per this comment, seems right for the CSI controller to re-order the implementation so it always ensures the CSI configmap is created as long as it finds a cephcluster CR. @BlaineEXE Any concerns with that?
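
To make the proposed re-ordering concrete, here is a minimal sketch in Go. It is not Rook's actual code: `configMapStore`, `reconcileCSI`, and `saveMons` are hypothetical stand-ins, the Kubernetes API is modeled as an in-memory map, and the `mons` key is a placeholder. Only the config map name `rook-ceph-csi-config` comes from the error log above. The point it illustrates is that the config map is ensured as soon as a cephcluster CR is found, and the disable flag is checked only afterwards.

```go
package main

import "fmt"

// csiConfigMapName is the config map named in the operator error log.
const csiConfigMapName = "rook-ceph-csi-config"

// configMapStore is a hypothetical in-memory stand-in for the Kubernetes API:
// config map name -> data.
type configMapStore map[string]map[string]string

// ensure creates the named config map if it does not exist yet and reports
// whether it had to be created.
func (s configMapStore) ensure(name string) bool {
	if _, ok := s[name]; ok {
		return false
	}
	s[name] = map[string]string{}
	return true
}

// reconcileCSI sketches the re-ordered CSI reconcile (hypothetical function).
// It returns true only if the CSI drivers would be deployed.
func reconcileCSI(store configMapStore, clusterFound, driverDisabled bool) bool {
	if !clusterFound {
		// No cephcluster CR: nothing to do.
		return false
	}
	// Ensure the config map first, regardless of the disable flag.
	store.ensure(csiConfigMapName)
	if driverDisabled {
		// ROOK_CSI_DISABLE_DRIVER is "true": skip the drivers only.
		return false
	}
	return true
}

// saveMons mimics the cluster reconcile step that needs the config map
// (hypothetical function; "mons" is a placeholder key).
func saveMons(store configMapStore) error {
	cm, ok := store[csiConfigMapName]
	if !ok {
		return fmt.Errorf("configmaps %q not found", csiConfigMapName)
	}
	cm["mons"] = "placeholder"
	return nil
}

func main() {
	store := configMapStore{}
	deployed := reconcileCSI(store, true, true)
	fmt.Println("drivers deployed:", deployed)
	fmt.Println("save mons error:", saveMons(store))
}
```

With this ordering, the cluster reconcile in the sketch finds the config map even though the driver is disabled, which is the behavior the issue asks for.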

BlaineEXE commented 3 weeks ago

I think that sounds like the right approach. Even if the CSI drivers are disabled, I don't think that means the reconcile needs to be disabled. And in this case, it can make sure Rook has what it needs to continue work.


In the longer-term view of things, I think it might be nice to find a way for Rook to operate without CSI controller being present at all. This would give us flexibility to have other controllers 'own' the CSI driver reconciliation.

I think we could do that by separating the CSI config map into 2 categories: `csi-controller-writes:csi-reads` and `others-write:csi-reads`. During reconciliation, it makes sense for Rook to create/update the `others-write:csi-reads` map, and Rook can freely create it if it doesn't exist. Then, if the CSI controller isn't running (and the `csi-controller-writes:csi-reads` configmap isn't present), Rook isn't stuck waiting forever for it to exist.
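
A rough sketch of that split, again in Go with an in-memory map standing in for the Kubernetes API. Everything here is an assumption made for illustration: the two config map names, the `mon-endpoints` key, the sample endpoint value, and the function names do not come from Rook. The sketch only shows the ownership rule: Rook creates and updates the `others-write:csi-reads` map on its own and never reads or waits on the `csi-controller-writes:csi-reads` map.

```go
package main

import "fmt"

// Hypothetical config map names, one per category.
const (
	othersWriteMap     = "csi-config-others-write"     // others-write:csi-reads
	controllerWriteMap = "csi-config-controller-write" // csi-controller-writes:csi-reads
)

// configMapStore is a hypothetical in-memory stand-in for the Kubernetes API:
// config map name -> data.
type configMapStore map[string]map[string]string

// updateOthersWriteMap creates the others-write map if it is missing and then
// sets one key. It never blocks on another controller.
func updateOthersWriteMap(store configMapStore, key, value string) {
	if _, ok := store[othersWriteMap]; !ok {
		store[othersWriteMap] = map[string]string{}
	}
	store[othersWriteMap][key] = value
}

// reconcileCluster sketches the Rook side (hypothetical function). It touches
// only the map Rook is allowed to write, so a missing CSI controller cannot
// stall it.
func reconcileCluster(store configMapStore, monEndpoints string) {
	updateOthersWriteMap(store, "mon-endpoints", monEndpoints)
}

func main() {
	store := configMapStore{} // no CSI controller has run, so both maps are absent
	reconcileCluster(store, "10.0.0.1:6789")

	_, controllerMapExists := store[controllerWriteMap]
	fmt.Println("others-write data:", store[othersWriteMap])
	fmt.Println("controller-write map exists:", controllerMapExists)
}
```

In this sketch the cluster reconcile completes with the controller-owned map absent, which is the flexibility described above for letting another controller own the CSI driver reconciliation.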