k8ssandra / cass-operator

The DataStax Kubernetes Operator for Apache Cassandra
https://docs.datastax.com/en/cass-operator/doc/cass-operator/cassOperatorGettingStarted.html
Apache License 2.0

Null pointer exception when trying to create a Cassandra Cluster locally #610

Open vitorguidi opened 9 months ago

vitorguidi commented 9 months ago

What happened?

The operator config file was not loaded when I started the operator with make run:

2024-01-21T13:59:22.773-0300 INFO setup Oper config file = {"configfile": ""}
2024-01-21T13:59:22.773-0300 INFO setup Oper config = {"operConfig": {"metrics":{},"health":{},"webhook":{}}}

As a result, the imageConfigFile parameter was read as an empty string, which in turn propagated and triggered a nil pointer dereference during Cassandra pod creation (as shown in the logs below).
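The failure mode described above can be sketched in isolation. This is only an illustration, not the operator's actual code: `ImageConfig`, `LoadImageConfig`, `DefaultRegistry`, and the registry value are all hypothetical names made up for this example. It shows how an empty config path can leave a package-level pointer nil, and how a guard can turn the resulting panic into an ordinary error.

```go
package main

import (
	"errors"
	"fmt"
)

// ImageConfig is a hypothetical stand-in for an operator's image configuration.
type ImageConfig struct {
	DefaultRegistry string
}

// imageConfig stays nil until a config file has been loaded.
var imageConfig *ImageConfig

// LoadImageConfig is a hypothetical loader. It does not read any file;
// it only models that an empty path leaves imageConfig nil.
func LoadImageConfig(path string) error {
	if path == "" {
		return errors.New("image config file path is empty")
	}
	// Placeholder value standing in for parsed file contents.
	imageConfig = &ImageConfig{DefaultRegistry: "registry.example.com"}
	return nil
}

// DefaultRegistry returns an error instead of dereferencing a nil pointer.
func DefaultRegistry() (string, error) {
	if imageConfig == nil {
		return "", errors.New("image config not loaded")
	}
	return imageConfig.DefaultRegistry, nil
}

func main() {
	if err := LoadImageConfig(""); err != nil {
		fmt.Println("load failed:", err)
	}
	if _, err := DefaultRegistry(); err != nil {
		fmt.Println("lookup failed:", err)
	}
}
```

Without the nil check in `DefaultRegistry`, the field access would panic with the same kind of "invalid memory address or nil pointer dereference" error seen in the logs.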

Having said that, is there a recommended way to run the operator locally and properly load the config file?

What did you expect to happen?

I expected the operator to create the Cassandra cluster successfully

How can we reproduce it (as minimally and precisely as possible)?

Clone the repository
make manifests
make install
make run
k apply -f config/samples/example-cassdc-three-nodes-single-rack.yaml

cass-operator version

commit 609325af0985fc2463b85bf6eb2c9c4d6c63f4f5 on master branch

Kubernetes version

Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.3

Method of installation

make run

Anything else we need to know?

2024-01-21T13:34:13.769-0300 INFO controllers.CassandraDatacenter Reconcile loop completed {"cassandradatacenter": "default/dc1", "requestNamespace": "default", "requestName": "dc1", "loopID": "b863ad02-c6f1-4e4e-a786-5dda8e710144", "duration": 29.483785625}
2024-01-21T13:34:13.770-0300 INFO Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"dc1","namespace":"default"}, "namespace": "default", "name": "dc1", "reconcileID": "cc59a724-9886-4887-b96f-d59355ea82da"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x40 pc=0x101595ec0]

goroutine 242 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
    /Users/guidi/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:119 +0x1a4
panic({0x1019722a0?, 0x1026ca660?})
    /opt/homebrew/Cellar/go/1.21.6/libexec/src/runtime/panic.go:914 +0x218
github.com/k8ssandra/cass-operator/pkg/images.AddDefaultRegistryImagePullSecrets(...)
    /Users/guidi/projects/cass-operator/pkg/images/images.go:184
github.com/k8ssandra/cass-operator/pkg/reconciliation.buildPodTemplateSpec(0x14000230500, {{0x14000163c89, 0x5}, {0x0, 0x0}, 0x0, 0x0}, 0x2?)
    /Users/guidi/projects/cass-operator/pkg/reconciliation/construct_podtemplatespec.go:797 +0x220
github.com/k8ssandra/cass-operator/pkg/reconciliation.newStatefulSetForCassandraDatacenter(0x140006ba000, {0x14000163c89, 0x5}, 0x14000230500, 0x1)
    /Users/guidi/projects/cass-operator/pkg/reconciliation/construct_statefulset.go:115 +0x480
github.com/k8ssandra/cass-operator/pkg/reconciliation.(*ReconciliationContext).GetStatefulSetForRack(0x1400024e500, 0x140006b25a0)
    /Users/guidi/projects/cass-operator/pkg/reconciliation/reconcile_racks.go:1386 +0x108
github.com/k8ssandra/cass-operator/pkg/reconciliation.(*ReconciliationContext).CheckRackCreation(0x1400024e500)
    /Users/guidi/projects/cass-operator/pkg/reconciliation/reconcile_racks.go:141 +0x9c
github.com/k8ssandra/cass-operator/pkg/reconciliation.(*ReconciliationContext).ReconcileAllRacks(0x1400024e500)
    /Users/guidi/projects/cass-operator/pkg/reconciliation/reconcile_racks.go:2324 +0x460
github.com/k8ssandra/cass-operator/pkg/reconciliation.(*ReconciliationContext).CalculateReconciliationActions(0x1400024e500)
    /Users/guidi/projects/cass-operator/pkg/reconciliation/handler.go:68 +0xe8
github.com/k8ssandra/cass-operator/internal/controllers/cassandra.(*CassandraDatacenterReconciler).Reconcile(0x1400010e870, {0x101b767f0, 0x140006b0330}, {{{0x14000163c60, 0x7}, {0x14000163c5c, 0x3}}})
    /Users/guidi/projects/cass-operator/internal/controllers/cassandra/cassandradatacenter_controller.go:147 +0x67c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x101b767f0?, {0x101b767f0?, 0x140006b0330?}, {{{0x14000163c60?, 0x1018ce360?}, {0x14000163c5c?, 0x1400040ae08?}}})
    /Users/guidi/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:122 +0x8c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0x14000438b40, {0x101b76828, 0x1400010e7d0}, {0x1019c8ae0?, 0x140003ba760?})
    /Users/guidi/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:323 +0x2a0
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0x14000438b40, {0x101b76828, 0x1400010e7d0})
    /Users/guidi/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274 +0x198
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
    /Users/guidi/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235 +0x74
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 85
    /Users/guidi/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:231 +0x43c
Exiting.


burmanm commented 8 months ago

If you run the operator locally, you need to define where the configFile is (there's a cmdline parameter --config for that).
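As a rough illustration of the advice above, a startup check can reject an empty config path before any reconciliation runs. This is a sketch under assumptions, not cass-operator's actual startup code: only the `--config` flag name comes from this thread, while `parseConfigPath`, the flag set name, the error text, and the `operator.yaml` file name are hypothetical.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// parseConfigPath is a hypothetical helper. It parses args and returns the
// value of the --config flag, or an error if the flag is missing or empty.
func parseConfigPath(args []string) (string, error) {
	fs := flag.NewFlagSet("operator", flag.ContinueOnError)
	configFile := fs.String("config", "", "path to the operator config file")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if *configFile == "" {
		return "", errors.New("no config file given: pass --config <path>")
	}
	return *configFile, nil
}

func main() {
	// Two illustrative argument lists: one without the flag, one with it.
	for _, args := range [][]string{{}, {"--config", "operator.yaml"}} {
		path, err := parseConfigPath(args)
		if err != nil {
			fmt.Println("startup error:", err)
			continue
		}
		fmt.Println("using config file:", path)
	}
}
```

Failing at startup this way would surface the missing file immediately, instead of later as a nil pointer dereference inside the reconcile loop.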

QuinnBast commented 2 months ago

I am running into the same issue on an 11-node production cluster. However, I don't think this has to do with the --config parameter.

For me, the Cassandra cluster had been running just fine for the past 4-5 weeks. I then did a cluster upgrade, which required cordoning and uncordoning my cluster nodes one at a time. Once the nodes were all back online, the cass-operator pod started having problems, throwing the same stack trace as above. The Cassandra nodes and the cluster itself seem to be functional; only the cass-operator pod is having issues.

Manifest:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: cassandra-cluster
spec:
  cassandra:
    serverVersion: "4.0.1"
    datacenters:
      - metadata:
          name: cass1
        size: 3
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: topolvm
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
        config:
          jvmOptions:
            heapSize: 512M

Stack Trace:

2024-08-26T16:22:31.644Z    INFO    setup   watch namespace configured  {"namespace": "k8ssandra"}
2024-08-26T16:22:31.645Z    INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "cassandra.datastax.com/v1beta1, Kind=CassandraDatacenter", "path": "/mutate-cassandra-datastax-com-v1beta1-cassandradatacenter"}
2024-08-26T16:22:31.645Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-cassandra-datastax-com-v1beta1-cassandradatacenter"}
2024-08-26T16:22:31.645Z    INFO    controller-runtime.builder  Registering a validating webhook    {"GVK": "cassandra.datastax.com/v1beta1, Kind=CassandraDatacenter", "path": "/validate-cassandra-datastax-com-v1beta1-cassandradatacenter"}
2024-08-26T16:22:31.645Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/validate-cassandra-datastax-com-v1beta1-cassandradatacenter"}
2024-08-26T16:22:31.645Z    INFO    setup   starting manager
2024-08-26T16:22:31.645Z    INFO    controller-runtime.metrics  Starting metrics server
2024-08-26T16:22:31.646Z    INFO    controller-runtime.webhook  Starting webhook server
2024-08-26T16:22:31.646Z    INFO    starting server {"kind": "health probe", "addr": "[::]:8081"}
2024-08-26T16:22:31.646Z    INFO    controller-runtime.metrics  Serving metrics server  {"bindAddress": ":8080", "secure": false}
I0826 16:22:31.646913       1 leaderelection.go:250] attempting to acquire leader lease k8ssandra/b569adb7.cassandra.datastax.com...
2024-08-26T16:22:31.647Z    INFO    controller-runtime.certwatcher  Updated current TLS certificate
2024-08-26T16:22:31.647Z    INFO    controller-runtime.webhook  Serving webhook server  {"host": "", "port": 9443}
2024-08-26T16:22:31.647Z    INFO    controller-runtime.certwatcher  Starting certificate watcher
I0826 16:22:48.277086       1 leaderelection.go:260] successfully acquired lease k8ssandra/b569adb7.cassandra.datastax.com
2024-08-26T16:22:48.277Z    INFO    Starting EventSource    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "source": "kind source: *v1beta1.CassandraDatacenter"}
2024-08-26T16:22:48.277Z    INFO    Starting EventSource    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "source": "kind source: *v1.StatefulSet"}
2024-08-26T16:22:48.277Z    DEBUG   events  k8ssandra-operator-cass-operator-54cb548c48-s994g_411986a8-ce81-4753-8e71-1ad04b511c33 became leader    {"type": "Normal", "object": {"kind":"Lease","namespace":"k8ssandra","name":"b569adb7.cassandra.datastax.com","uid":"5cbe9a19-2a67-45ac-95a2-0591b58f525d","apiVersion":"coordination.k8s.io/v1","resourceVersion":"54012918"}, "reason": "LeaderElection"}
2024-08-26T16:22:48.277Z    INFO    Starting EventSource    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "source": "kind source: *v1.PodDisruptionBudget"}
2024-08-26T16:22:48.277Z    INFO    Starting EventSource    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "source": "kind source: *v1.Service"}
2024-08-26T16:22:48.277Z    INFO    Starting EventSource    {"controller": "cassandratask", "controllerGroup": "control.k8ssandra.io", "controllerKind": "CassandraTask", "source": "kind source: *v1alpha1.CassandraTask"}
2024-08-26T16:22:48.277Z    INFO    Starting Controller {"controller": "cassandratask", "controllerGroup": "control.k8ssandra.io", "controllerKind": "CassandraTask"}
2024-08-26T16:22:48.277Z    INFO    Starting EventSource    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "source": "kind source: *v1.Secret"}
2024-08-26T16:22:48.277Z    INFO    Starting EventSource    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "source": "kind source: *v1.Secret"}
2024-08-26T16:22:48.277Z    INFO    Starting Controller {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter"}
2024-08-26T16:22:48.398Z    INFO    Starting workers    {"controller": "cassandratask", "controllerGroup": "control.k8ssandra.io", "controllerKind": "CassandraTask", "worker count": 1}
2024-08-26T16:22:48.398Z    INFO    Starting workers    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "worker count": 1}
2024-08-26T16:22:48.398Z    INFO    controllers.CassandraDatacenter ======== handler::Reconcile has been called {"cassandradatacenter": {"name":"cass1","namespace":"k8ssandra"}, "requestNamespace": "k8ssandra", "requestName": "cass1", "loopID": "3041aab9-18fe-454f-b4e8-0b2639287109"}
2024-08-26T16:22:48.398Z    INFO    handler::CreateReconciliationContext    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra"}
2024-08-26T16:22:48.398Z    INFO    handler::calculateReconciliationActions {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.398Z    INFO    reconcile_services::ReconcileHeadlessServices   {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.399Z    INFO    reconcile_endpoints::CheckAdditionalSeedEndpoints   {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.399Z    INFO    reconcile_racks::calculateRackInformation   {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.399Z    INFO    reconciliationContext::reconcileAllRacks    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.399Z    INFO    reconcile_racks::listPods   {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.499Z    INFO    requesting Cassandra metadata endpoints from Node Management API    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "pod": "cassandra-cluster-cass1-default-sts-0"}
2024-08-26T16:22:48.499Z    INFO    client::callNodeMgmtEndpoint    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a"}
2024-08-26T16:22:48.503Z    INFO    Setting pod statuses    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::CheckConfigSecret  {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::CheckRackCreation  {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::getStatefulSetForRack  {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::CheckRackLabels    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::CheckDecommissioningNodes  {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::CheckSuperuserSecretCreation   {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::CheckInternodeCredentialCreation   {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    starting CheckRackForceUpgrade()    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::CheckRackScale {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::CheckPodsReady {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::findStartedNotReadyNodes   {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::deleteStuckNodes   {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::CheckSeedLabels    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    reconcile_racks::refreshSeeds   {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.504Z    INFO    calling Management API reload seeds - POST /api/v0/ops/seeds/reload {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "pod": "cassandra-cluster-cass1-default-sts-0"}
2024-08-26T16:22:48.504Z    INFO    client::callNodeMgmtEndpoint    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a"}
2024-08-26T16:22:48.515Z    INFO    calling Management API reload seeds - POST /api/v0/ops/seeds/reload {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "pod": "cassandra-cluster-cass1-default-sts-1"}
2024-08-26T16:22:48.515Z    INFO    client::callNodeMgmtEndpoint    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a"}
2024-08-26T16:22:48.526Z    INFO    calling Management API reload seeds - POST /api/v0/ops/seeds/reload {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "pod": "cassandra-cluster-cass1-default-sts-2"}
2024-08-26T16:22:48.526Z    INFO    client::callNodeMgmtEndpoint    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a"}
2024-08-26T16:22:48.541Z    INFO    reconcile_racks::findStartingNodes  {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.541Z    INFO    reconcile_racks::startOneNodePerRack    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.541Z    INFO    calling Management API cluster health - GET /api/v0/probes/cluster  {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "pod": "cassandra-cluster-cass1-default-sts-0"}
2024-08-26T16:22:48.541Z    INFO    client::callNodeMgmtEndpoint    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a"}
2024-08-26T16:22:48.545Z    INFO    calling Management API cluster health - GET /api/v0/probes/cluster  {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "pod": "cassandra-cluster-cass1-default-sts-1"}
2024-08-26T16:22:48.545Z    INFO    client::callNodeMgmtEndpoint    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a"}
2024-08-26T16:22:48.549Z    INFO    calling Management API cluster health - GET /api/v0/probes/cluster  {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "pod": "cassandra-cluster-cass1-default-sts-2"}
2024-08-26T16:22:48.549Z    INFO    client::callNodeMgmtEndpoint    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a"}
2024-08-26T16:22:48.553Z    INFO    reconcile_racks::startAllNodes  {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.553Z    INFO    reconcile_racks::DecommissionNodes  {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.553Z    INFO    starting CheckRackPodTemplate() {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a", "namespace": "k8ssandra", "datacenterName": "cass1", "clusterName": "cassandra-cluster"}
2024-08-26T16:22:48.553Z    INFO    controllers.CassandraDatacenter Reconcile loop completed    {"cassandradatacenter": {"name":"cass1","namespace":"k8ssandra"}, "requestNamespace": "k8ssandra", "requestName": "cass1", "loopID": "3041aab9-18fe-454f-b4e8-0b2639287109", "duration": 0.15499156}
2024-08-26T16:22:48.553Z    INFO    Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference    {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"cass1","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "cass1", "reconcileID": "3eeef45f-5e8f-4413-94d4-705a5790b60a"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x40 pc=0x1624869]

goroutine 382 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:116 +0x1fa
panic({0x17c65a0, 0x283b030})
    /usr/local/go/src/runtime/panic.go:884 +0x213
github.com/k8ssandra/cass-operator/pkg/images.GetConfigBuilderImage(...)
    /workspace/pkg/images/images.go:190
github.com/k8ssandra/cass-operator/pkg/reconciliation.buildInitContainers(0xc00047cf00, {0x19be82c, 0x7}, 0xc000ed2380)
    /workspace/pkg/reconciliation/construct_podtemplatespec.go:413 +0x409
github.com/k8ssandra/cass-operator/pkg/reconciliation.buildPodTemplateSpec(0xc00047cf00, {{0x19be82c, 0x7}, {0x0, 0x0}, 0x0, 0x0}, 0x25?)
    /workspace/pkg/reconciliation/construct_podtemplatespec.go:845 +0x754
github.com/k8ssandra/cass-operator/pkg/reconciliation.newStatefulSetForCassandraDatacenter(0xc0003fb900, {0x19be82c, 0x7}, 0xc00047cf00, 0x3)
    /workspace/pkg/reconciliation/construct_statefulset.go:120 +0x84d
github.com/k8ssandra/cass-operator/pkg/reconciliation.(*ReconciliationContext).CheckRackPodTemplate(0xc00058f400)
    /workspace/pkg/reconciliation/reconcile_racks.go:213 +0x2ac
github.com/k8ssandra/cass-operator/pkg/reconciliation.(*ReconciliationContext).ReconcileAllRacks(0xc00058f400)
    /workspace/pkg/reconciliation/reconcile_racks.go:2397 +0x7f9
github.com/k8ssandra/cass-operator/pkg/reconciliation.(*ReconciliationContext).CalculateReconciliationActions(0xc00058f400)
    /workspace/pkg/reconciliation/handler.go:70 +0x116
github.com/k8ssandra/cass-operator/internal/controllers/cassandra.(*CassandraDatacenterReconciler).Reconcile(0xc000470500, {0x1c46430, 0xc0007aef90}, {{{0xc000f36860, 0x9}, {0xc000f368b0, 0x5}}})
    /workspace/internal/controllers/cassandra/cassandradatacenter_controller.go:146 +0xa65
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x1c487e0?, {0x1c46430?, 0xc0007aef90?}, {{{0xc000f36860?, 0xb?}, {0xc000f368b0?, 0x0?}}})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:119 +0xc8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000362c80, {0x1c46388, 0xc000470460}, {0x1844040?, 0xc0006a6d60?})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:316 +0x3ca
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000362c80, {0x1c46388, 0xc000470460})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:266 +0x1c5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:227 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:223 +0x587
burmanm commented 2 months ago

Something in your deployment has caused cass-operator to lose access to its imageConfig. There should be a ConfigMap whose name ends in -manager-config (the prefix depends on the deployment model), and it should be mounted into the cass-operator pod.
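
To illustrate why a missing imageConfig surfaces as a panic rather than a readable error, here is a minimal Go sketch. The `imageConfig` type, the `loaded` variable, and `configBuilderImage` are hypothetical names for illustration, not cass-operator's actual code; the sketch only assumes that the parsed config is held behind a pointer that stays nil when nothing was loaded.

```go
package main

import (
	"errors"
	"fmt"
)

// imageConfig is a hypothetical stand-in for the operator's parsed image
// configuration; the real type in cass-operator differs.
type imageConfig struct {
	ConfigBuilder string
}

// loaded stays nil until a config file has been parsed successfully.
var loaded *imageConfig

// configBuilderImage returns an error instead of dereferencing a nil config.
// Without the nil check, reading loaded.ConfigBuilder would panic with
// "invalid memory address or nil pointer dereference".
func configBuilderImage() (string, error) {
	if loaded == nil {
		return "", errors.New("image config was never loaded")
	}
	return loaded.ConfigBuilder, nil
}

func main() {
	// Nothing loaded yet: the guard reports an error rather than panicking.
	if _, err := configBuilderImage(); err != nil {
		fmt.Println("error:", err)
	}

	// After a (hypothetical) successful load, the lookup succeeds.
	loaded = &imageConfig{ConfigBuilder: "example/config-builder:latest"}
	img, _ := configBuilderImage()
	fmt.Println(img)
}
```

The image name in `main` is a placeholder, not a real default.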

Can you check that ConfigMap and the Pod definition of cass-operator to see that they still match and are correctly deployed? The image config should be mounted as /configs/image_config.yaml in the cass-operator pod.
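
The mount check described above can also be done programmatically. The following is a rough sketch, assuming only the /configs/image_config.yaml path mentioned in this comment; `checkConfigFile` is a hypothetical helper, not something cass-operator provides.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// checkConfigFile is a hypothetical helper (not part of cass-operator).
// It reports an error if path is empty, missing, a directory, or zero-length.
func checkConfigFile(path string) error {
	if path == "" {
		return errors.New("config file path is empty")
	}
	// os.Stat follows symlinks, which matters because ConfigMap volumes
	// expose their files through symlinks.
	info, err := os.Stat(path)
	if err != nil {
		return fmt.Errorf("cannot stat %q: %w", path, err)
	}
	if info.IsDir() {
		return fmt.Errorf("%q is a directory, expected a file", path)
	}
	if info.Size() == 0 {
		return fmt.Errorf("%q is empty", path)
	}
	return nil
}

func main() {
	// Path taken from the comment above; adjust it for other deployments.
	if err := checkConfigFile("/configs/image_config.yaml"); err != nil {
		fmt.Println("image config problem:", err)
		return
	}
	fmt.Println("image config file is present and non-empty")
}
```

This only confirms that the file exists and has content; it says nothing about whether the YAML inside is complete or valid.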

QuinnBast commented 2 months ago

Here are the relevant parts of my pod:

containers:
  - args:
    - --config=/configs/controller_manager_config.yaml
    image: gxrms-testbed-registry:5000/k8ssandra/cass-operator:v1.21.0
    volumeMounts:
    - mountPath: /tmp/k8s-webhook-server/serving-certs
      name: cass-operator-certs-volume
      readOnly: true
    - mountPath: /configs
      name: manager-config
volumes:
- name: cass-operator-certs-volume
  secret:
    defaultMode: 420
    secretName: k8ssandra-operator-cass-operator-webhook-server-cert
- configMap:
    defaultMode: 420
    name: k8ssandra-operator-cass-operator-manager-config
  name: manager-config

And the configmap:

kind: ConfigMap
metadata:
  name: k8ssandra-operator-cass-operator-manager-config
data:
  controller_manager_config.yaml: |
    apiVersion: config.k8ssandra.io/v1beta1
    kind: OperatorConfig
    health:
      healthProbeBindAddress: :8081
    metrics:
      bindAddress: :8080
    webhook:
      port: 9443
    leaderElection:
      leaderElect: true
      resourceName: b569adb7.cassandra.datastax.com
    disableWebhooks: false
    imageConfigFile: /configs/image_config.yaml
  image_config.yaml: |-
    apiVersion: config.k8ssandra.io/v1beta1
    kind: ImageConfig
    metadata:
      name: image-config
    imageRegistry: "testbed-registry:5000"

I set the startup command to ["sh", "-c", "tail -f /dev/null"] so that I could browse the container's files. The file is there:

$ ls /configs/
controller_manager_config.yaml  image_config.yaml
$ cat /configs/image_config.yaml 
apiVersion: config.k8ssandra.io/v1beta1
kind: ImageConfig
metadata:
  name: image-config
imageRegistry: "gxrms-testbed-registry:5000"