ceph / ceph-csi-operator

operator that deploys and manages the CephCSI plugins
Apache License 2.0

"Config" in the CRD name is redundant IMO. How about a name without "Config"? #8

Open Madhu-1 opened 1 month ago

Madhu-1 commented 1 month ago
> I think I commented on this previously, but can't find it now or if there was a response... "Config" in the CRD name is redundant IMO. How about a name without "Config"? For example, CephCSIPeerMapping, CephCSIPoolMapping, CephCSIClusterMapping, or even just CephCSIMapping:
>
> ```yaml
> kind: CephCSIPeerMapping
> ```

_Originally posted by @travisn in https://github.com/ceph/ceph-csi-operator/pull/1#discussion_r1617624076_

travisn commented 1 month ago

@Madhu-1 @nb-ohad @BlaineEXE To follow up from the design doc that was merged, I'd like to continue discussing the naming and structure of the CRDs.

CephCSIOperatorConfig --> CephCSIDriverDefaults

99% of the settings in this appear to be for the driver defaults. Why not name it as such? The only setting besides the driver defaults appears to be the operator log level, which could be split off elsewhere. Then it is obvious to the user that they add defaults to this CR, while they can override those defaults in CephCSIDriver.
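As a purely illustrative sketch (the kind, fields, and values here are assumptions, not the actual API), the renamed CR might look something like:

```yaml
# Hypothetical sketch of a defaults-only CR: everything in it is a
# driver default that an individual CephCSIDriver CR could override.
# Field names and values are illustrative assumptions.
apiVersion: csi.ceph.io/v1alpha1
kind: CephCSIDriverDefaults
metadata:
  name: ceph-csi-driver-defaults
spec:
  logLevel: 3            # default log level for all drivers
  grpcTimeout: 150       # default gRPC timeout, in seconds
  enableMetadata: false  # default for per-driver metadata support
```

The operator's own log level would then live elsewhere, as suggested above.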

CephCSICephCluster and CephCSIConfig --> CephCSICluster and CephCSIConnection

This CRD needs to be split into more than one CRD. One CRD should hold the connection info, and another should hold the settings for the drivers that manage that CephCluster. This way, Rook can generate the connection info, and it won't interfere with the settings that the user wants to apply to the cluster. This is why I think we should include different consumer scenarios (e.g. Rook) in the CephCSI operator design doc, so we can recognize what may be generated vs. what should be set by users.

The settings in this CRD also seem to have a very similar purpose to the settings in CephCSIConfig, since in both CRDs they are for a single CephCluster. What is the difference between them from the user's perspective? Maybe there is some implementation difference under the covers about which secrets or configmaps are generated, but the user doesn't care about that.

Seems like we need connection info in a separate CRD from the user settings for a cluster, so we might end up with CRDs such as CephCSIConnection and CephCSICluster.

My initial suggestion was going to be CephCSICephConnection and CephCSICephCluster, but then "Ceph" is very redundant. Perhaps it is more precise if the latter "Ceph" is removed from the name, but at the same time perhaps those redundant names are better to make it more obvious that the settings correspond to a CephCluster. I could go either way.
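A rough sketch of that split (all names and fields are illustrative assumptions, not the actual API):

```yaml
# Hypothetical split: connection info that Rook could generate,
# kept separate from the user-owned settings for the same cluster.
apiVersion: csi.ceph.io/v1alpha1
kind: CephCSIConnection
metadata:
  name: my-cluster
spec:
  monitors:
    - "10.0.0.1:6789"
    - "10.0.0.2:6789"
---
apiVersion: csi.ceph.io/v1alpha1
kind: CephCSICluster
metadata:
  name: my-cluster
spec:
  connectionRef: my-cluster  # points at the generated connection CR
  cephFS:
    kernelMountOptions: ms_mode=secure  # user-owned driver settings
```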

CephCSIConfigMapping --> CephCSIClusterMapping

Based on the suggestion to refactor the previous CRDs, this CRD could be named CephCSIClusterMapping or CephCSICephClusterMapping.

BlaineEXE commented 1 month ago

This is again a place where I feel like well-described user stories will be critical for having a productive conversation about this work. I think it will help us all to have the different scenarios described, who the actors are, and what they can do. With that information, we can better envision and vet whether the designs meet the goals and workflows we have in mind, or whether there are gaps.

@Madhu-1 and @nb-ohad are the ones who have already worked on this design, and I get the sense that they know what the user stories are and who the actors are. But those of us on the outside need these things to be able to understand the design in the context of its intent. Otherwise, we are very much in the dark, and I don't think we can provide quality feedback.

Scenario 1: Upstream Rook

In this scenario, the administrator deploys ceph-csi-operator via default manifests that are present in rook/rook. The manifests set up ceph-csi-operator in single-namespace mode. The admin is actor 1.

Rook is actor 2. Rook configures X, Y, and Z resources automatically for the admin.

If the admin wants to override configs, they create X new CR to apply overrides.

Scenario 2: Consumer mode

In this scenario, an admin deploys ceph-csi-operator via OLM. ceph-csi-operator is set up in X mode. The admin is actor 1. ceph-csi-operator is a client for an external Ceph cluster. The admin has created a client operator as well to help automate things. The client operator is actor 2.

The client operator gets connection details for the Ceph cluster via means that are outside ceph-csi-operator's scope. The operator configures X, Y, and Z ceph-csi-operator CRs to connect to the external cluster.

If the admin wants to override configs, they take X and Y, or Z action.

Scenario 3: A user creates ceph-csi-operator and manages it themselves

This is probably the simplest case to describe. The user configures X, Y, and Z resources to connect to their Ceph cluster. The user does not have to consider overrides or defaults, because they are the only actor in the system.

Scenario 4: whatever other modes are necessary

etc. etc. etc.

I particularly want to see a scenario description in which multi-namespace is involved.

I think I can extrapolate multi-namespace to all-namespaces by assuming that the user has a fully separate k8s cluster just for storage, to keep things segregated for security.

travisn commented 3 weeks ago

Based on comments and discussion with @Madhu-1 in https://github.com/rook/rook/issues/14260, here are some follow-ups on CRDs and naming to consider...

CephCSICephCluster CRD

CephCSIConfig CRD

CephCSIOperatorConfig CRD

CephCSIConfigMapping

nb-ohad commented 2 weeks ago

CephCSIConfig CRD

The purpose of this CRD is not to define a name, but to decouple the configuration needed to consume Ceph-based storage from the storage class itself. This was done for a couple of reasons:

  1. the storage class params are immutable and cannot be updated
  2. Avoid the repetition of identical configuration across multiple storage classes

The linkage between a storage class and this configuration is indeed by name/id, but this link does not define the essence of the information residing in this CRD. This is why we chose CephCSIConfig as the name, and I don't think the proposed name is a better alternative.
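For context, here is a minimal sketch of that name-based linkage (the CRD fields are assumptions for illustration, not the actual API):

```yaml
# Hypothetical illustration: the storage class carries only an
# immutable reference, while the mutable configuration lives in the
# CephCSIConfig CR it resolves to.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: my-cluster   # resolves to the CephCSIConfig below
  pool: replicapool
---
apiVersion: csi.ceph.io/v1alpha1
kind: CephCSIConfig
metadata:
  name: my-cluster
spec:
  radosNamespace: ns-a    # can be updated here, unlike the SC params
```

Editing the CR would then change behavior for every storage class that references it, without recreating any of them.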

CephCSIOperatorConfig

Changing the name of this CRD to CephCSIDriverDefaults does not make sense, mainly because the driver defaults are just a single aspect of the information included in this CR. This CR is intended to offer global/singleton information for the use/configuration of the operator and its reconciliation process. It is true that, at this point, the driver defaults substruct is the majority of the content described by this CR. Still, it is most unlikely that this part will see rapid change compared to other future operator configuration fields. As an example, we are already discussing additions outside of the defaults section to provide image sets to the operator, overriding the default images for CSI components (to generalize and cover one of the use cases of CSI orchestration that is available today in ocs-client-operator).
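To illustrate the argument (a hypothetical shape only; all field names are assumptions), driver defaults would be one substruct among other operator-level settings:

```yaml
# Hypothetical shape of an operator-scoped CR: driver defaults sit
# alongside other operator-level concerns such as a future image set.
apiVersion: csi.ceph.io/v1alpha1
kind: CephCSIOperatorConfig
metadata:
  name: ceph-csi-operator-config
spec:
  logLevel: 1              # the operator's own log verbosity
  driverSpecDefaults:      # defaults merged into every CephCSIDriver
    logLevel: 5
  imageSet:                # the future addition mentioned above
    provisioner: registry.example.com/csi-provisioner:v4.0.0
    attacher: registry.example.com/csi-attacher:v4.5.0
```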

CephCSIConfigMapping

This CR enables you to create an "alias" for a config within the context of a single block pool. It is incorrect to assume a purpose based on multi-cluster environments for this CR; there are other use cases we considered that are internal to a single cluster. Take, for example, the need to rename a config for a specific storage class (we anticipate a need for this in the future, especially around cluster migrations). With that in mind, I am still open to renaming as long as the new name is not specifically tied to "mirroring".

travisn commented 2 weeks ago

CephCSIConfig CRD

> The purpose of this CRD is not to define a name, but to decouple the configuration needed to consume Ceph-based storage from the storage class itself. This was done for a couple of reasons:
>
>   1. the storage class params are immutable and cannot be updated
>   2. Avoid the repetition of identical configuration across multiple storage classes
>
> The linkage between a storage class and this configuration is indeed by name/id, but this link does not define the essence of the information residing in this CRD. This is why we chose CephCSIConfig as the name, and I don't think the proposed name is a better alternative.

@Madhu-1 had mentioned that the mount options would be moved to the CephCSICephCluster CRD, so it seems that all that is left in this CRD is to associate the rados namespace or the subvolume group with some name, which can then be added to the storage class.

Agreed the name still doesn't make a lot of sense. IMO the naming would be clearer if this were split into two CRDs; then they could be named CephCSIRadosNamespace and CephCSISubvolumeGroup.
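For illustration, the split could look roughly like this (hypothetical fields and names, not the actual API):

```yaml
# Hypothetical sketch of the proposed two-CRD split: each CR binds a
# storage-class-referencable name to one Ceph-side grouping.
apiVersion: csi.ceph.io/v1alpha1
kind: CephCSIRadosNamespace
metadata:
  name: rbd-ns-a
spec:
  radosNamespace: ns-a
---
apiVersion: csi.ceph.io/v1alpha1
kind: CephCSISubvolumeGroup
metadata:
  name: fs-group-a
spec:
  subVolumeGroup: group-a
```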

CephCSIOperatorConfig

> Changing the name of this CRD to CephCSIDriverDefaults does not make sense, mainly because the driver defaults are just a single aspect of the information included in this CR. This CR is intended to offer global/singleton information for the use/configuration of the operator and its reconciliation process. It is true that, at this point, the driver defaults substruct is the majority of the content described by this CR. Still, it is most unlikely that this part will see rapid change compared to other future operator configuration fields. As an example, we are already discussing additions outside of the defaults section to provide image sets to the operator, overriding the default images for CSI components (to generalize and cover one of the use cases of CSI orchestration that is available today in ocs-client-operator).

What about CephCSIDefaultConfig? The new settings you described also seem to fit in this name.

CephCSIConfigMapping

> This CR enables you to create an "alias" for a config within the context of a single block pool. It is incorrect to assume a purpose based on multi-cluster environments for this CR; there are other use cases we considered that are internal to a single cluster. Take, for example, the need to rename a config for a specific storage class (we anticipate a need for this in the future, especially around cluster migrations). With that in mind, I am still open to renaming as long as the new name is not specifically tied to "mirroring".

What about CephCSIPoolMapping? This is generic enough that the pools are being mapped for some purpose, whether for mirroring or within the same cluster.

nb-ohad commented 2 weeks ago

> @Madhu-1 had mentioned that the mount options would be moved to the CephCSICephCluster CRD, so it seems that all that is left in this CRD is to associate the rados namespace or the subvolume group with some name, which can then be added to the storage class.
>
> Agreed the name still doesn't make a lot of sense. IMO the naming would be clearer if this were split into two CRDs; then they could be named CephCSIRadosNamespace and CephCSISubvolumeGroup.

I am not sure we should do that. The fact that the only config we have today is these two pieces of information does not mean it will stay that way for long. We should design our API around the different aspects of CSI, not solely around the information currently available. In this specific case, we identified the need for a CR that represents a central configuration attached to a storage class, allowing CSI to understand how that storage class relates to the Ceph cluster. This CR includes any fields that Ceph CSI needs to utilize Ceph correctly. In an ideal world that information would have been embedded in the storage class params, but because of the limitations I described in a previous comment, it was decided to move it into the CSI config map entry, which we are formalizing with this CR.

> What about CephCSIDefaultConfig? The new settings you described also seem to fit in this name.

The concern of this CR is not default configuration; it is the operator configuration, similar to an operator config map (which is a given when you deal with OLM operators). The fact that we consider the default driver configuration part of the operator configuration is just one aspect of this CR. This CR mainly describes the way the operator should behave.

> What about CephCSIPoolMapping? This is generic enough that the pools are being mapped for some purpose, whether for mirroring or within the same cluster.

Right now the only concern inside the CR is pool mapping, but if you look carefully, the mapping is not the top-level key. This was done because we identified that in the future there will be other aspects of mapping between CSI configurations. So pool mapping is just one aspect of this API, and for that reason I will argue that CephCSIPoolMapping is not a good name.
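To make the structural point concrete, here is a hypothetical sketch (field names are assumptions, not the actual API) where pool mapping is one nested entry rather than the top-level key:

```yaml
# Hypothetical sketch: the CR maps one CSI config to another, and the
# pool mapping is just one nested aspect of that relationship.
apiVersion: csi.ceph.io/v1alpha1
kind: CephCSIConfigMapping
metadata:
  name: cluster-a-to-cluster-b
spec:
  mappings:
    - localConfigName: cluster-a
      remoteConfigName: cluster-b
      blockPoolIdMapping:        # nested under the mapping entry
        - local: "1"
          remote: "2"
```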

travisn commented 2 weeks ago

> @Madhu-1 had mentioned that the mount options would be moved to the CephCSICephCluster CRD, so it seems that all that is left in this CRD is to associate the rados namespace or the subvolume group with some name, which can then be added to the storage class. Agreed the name still doesn't make a lot of sense. IMO the naming would be clearer if this were split into two CRDs; then they could be named CephCSIRadosNamespace and CephCSISubvolumeGroup.
>
> I am not sure we should do that. The fact that the only config we have today is these two pieces of information does not mean it will stay that way for long. We should design our API around the different aspects of CSI, not solely around the information currently available. In this specific case, we identified the need for a CR that represents a central configuration attached to a storage class, allowing CSI to understand how that storage class relates to the Ceph cluster. This CR includes any fields that Ceph CSI needs to utilize Ceph correctly. In an ideal world that information would have been embedded in the storage class params, but because of the limitations I described in a previous comment, it was decided to move it into the CSI config map entry, which we are formalizing with this CR.

What about the CRD name CephResource? Its purpose is to associate some Ceph resource with CSI, so it seems flexible and descriptive enough.

> What about CephCSIDefaultConfig? The new settings you described also seem to fit in this name.

> The concern of this CR is not default configuration; it is the operator configuration, similar to an operator config map (which is a given when you deal with OLM operators). The fact that we consider the default driver configuration part of the operator configuration is just one aspect of this CR. This CR mainly describes the way the operator should behave.

Operator is such a generic name; I disagree with using that name in any case. What other alternatives do you suggest? Any idea I suggest is rejected; what I am against is a name so general it is meaningless.

> What about CephCSIPoolMapping? This is generic enough that the pools are being mapped for some purpose, whether for mirroring or within the same cluster.

> Right now the only concern inside the CR is pool mapping, but if you look carefully, the mapping is not the top-level key. This was done because we identified that in the future there will be other aspects of mapping between CSI configurations. So pool mapping is just one aspect of this API, and for that reason I will argue that CephCSIPoolMapping is not a good name.

If this is intended to be based on the naming of the CephCSIConfig CRD (which above was suggested to be named CephResource), what about CephResourceMapping?

nb-ohad commented 1 week ago

> What about the CRD name CephResource? Its purpose is to associate some Ceph resource with CSI, so it seems flexible and descriptive enough.

CephResource implies the content is specific to a single Ceph resource, which is not the case here. What about CephSettings or CephConfiguration, where Configuration is a noun, as in one of many configurations?

> Operator is such a generic name; I disagree with using that name in any case. What other alternatives do you suggest? Any idea I suggest is rejected; what I am against is a name so general it is meaningless.

I agree that operator is a very generic name. What I think we should do is follow the convention set by the Operator Framework, where an OperatorConfig config map is automatically generated (which is, of course, untyped data). This would bring us back to the OperatorConfig name for the CRD (with a fully qualified name of operatorconfigs.csi.ceph.io).