Closed burmanm closed 1 week ago
namely the MedusaBackup objects as they were created with the old values and are not actively maintained
I don't think that's really a problem, those labels aren't used as selectors I think.
@Miles-Garnsey @olim7t, please chime in here.
What is missing?
The original behavior was that
cassandra.datastax.com/datacenter
has a value that's identical to theCassandraDatacenter.ObjectMeta.Name
. In PR #475 the behavior was changed to includeCassandraDatacenter.DatacenterName()
which could be either the.Name
or.Spec.DatacenterName
.My proposal is to revert this part of the PR with basically the following fix in cass-operator:
Why is this needed?
This is because right now we're unable to use the labels for querying the parent objects. If we look at a pod/service/StatefulSet/any object when override name is in use it has a label like:
cassandra.datastax.com/datacenter=overrideName
yet using the Kubernetes API this label value is entirely useless. We can't fetch the actualCassandraDatacenter
with this information at all.It also does not necessarily match the value in Cassandra itself either as the value stored in the labels is cleaned up removing certain characters. Thus, both the user and the systems are unable to use these values for anything. And we need this feature, we need to be able to link resources from bottom to top so we can fetch the related resources. Right now only the top to bottom is doable since CassandraDatacenter knows what labels we created to sub resources, but sub resources do not know what labels the parts above (or next to) have.
Reverting this change requires some work also when updating since we do not allow StatefulSet updates automatically without user consent. We can however make an exception and update labels, since this doesn't require a rolling restart of the pods as all labels can be changed in Pods/StatefulSets without rolling update.
For k8ssandra-operator, we are in a more complex scenario. There are already some resources that use the
.Name
(like Prometheus ServiceMonitors) and some that use.DatacenterName()
. This will require some manual upgrade process to enforce the correct format we want everywhere. While most of them should be automatically updated in the upgrade process, there are some that require manual work, namely the MedusaBackup objects as they were created with the old values and are not actively maintained.The k8ssandra-operator also has the
k8ssandra.io/cluster-name
which is different fromcassandra.datastax.com/cluster
. The latter is always using the.Spec.ClusterName
and there's no need to change this behavior. However, for k8ssandra-operator we probably want to enforce the.Name
instead. That gives us both values to use for cluster seeking operations.One raised question was that the update has issues knowing which value we have in the datacenter label when updating from previous versions. This is always problematic no matter if we rollback or not, since there are already versions of cass-operator & k8ssandra-operator adding either value, depending on where you're updating from.
It might feel painful, but I think we should do the right thing here and revert the change to simplify our labels usage as we ourselves can't even keep track of what values are stored where. If we always keep the Kubernetes names, it's very clear. It's also what we internally always process, whatever is in Cassandra is not really relevant to our management processes.
┆Issue is synchronized with this Jira Story by Unito ┆Reviewer: Miles Garnsey ┆Fix Versions: 2024-10,2024-11 ┆Issue Number: CASS-4