k8ssandra / cass-operator

The DataStax Kubernetes Operator for Apache Cassandra
https://docs.datastax.com/en/cass-operator/doc/cass-operator/cassOperatorGettingStarted.html
Apache License 2.0
181 stars 63 forks source link

cass-operator's `spec.podTemplateSpec.spec.tolerations` is overwritten by `spec.tolerations` without error message #471

Open hoyhbx opened 1 year ago

hoyhbx commented 1 year ago

What happened?

We were specify the tolerations for the podTemplateSpec through the field spec.podTemplateSpec.spec.tolerations but our configuration was overwritten by another field in the CR spec.tolerations without notice or warning.

Did you expect to see something different?

We expect our input to be rejected with clear error message up front, if the desired behavior of the handling of spec.podTemplateSpec.spec.tolerations is to be overwritten by spec.tolerations.

How to reproduce it (as minimally and precisely as possible):

  1. Deploy the cass-operator
  2. Deploy CassandraDB with the following CR yaml
    apiVersion: cassandra.datastax.com/v1beta1
    kind: CassandraDatacenter
    metadata:
    name: test-cluster
    spec:
    clusterName: cluster1
    podTemplateSpec:
    spec:
      containers: []
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane
        operator: Exists

Environment

Shown above in the reproduce section

insert Cass Operator logs relevant to the issue here

Anything else we need to know?: The root cause is at https://github.com/k8ssandra/cass-operator/blob/a6054d271b7034ab06fd97be9472b693ed523e49/pkg/reconciliation/construct_podtemplatespec.go#L635

One possible fix is to add a check in the admission webhook which rejects the CR if the field spec.podTemplateSpec.spec.tolerations is not empty. An alternative is to add a error level log message near https://github.com/k8ssandra/cass-operator/blob/a6054d271b7034ab06fd97be9472b693ed523e49/pkg/reconciliation/construct_podtemplatespec.go#L635 to print error in log to indicate that the toleration is overwritten by another field.

┆Issue is synchronized with this Jira Story by Unito

burmanm commented 1 year ago

The usage of PodTemplateSpec in cass-operator is generally undocumented and not recommended, there are multiple fields which are overwritten or modified without passing them directly to the pods and that's by design. For normal use cases, there should be no need to use PodTemplateSpec.

The controller-runtime does not allow warning validations in webhooks as of yet without a significant rewrite, until https://github.com/kubernetes-sigs/controller-runtime/issues/1896 is implemented.

We would probably add a warning for all PodTemplateSpec usages. As of now, we do not intend to support it or ensure backwards compatibility when using PodTemplateSpec. If you have a real reason to use it which isn't supported otherwise in the CRD, please create a feature request.