Open networkingana opened 9 months ago
It looks like the seed service isn't pointing to any of the pre-existing nodes in the cluster, creating some sort of split-brain situation. As the nodes are unable to connect with each other, they're stuck trying to communicate with the old IPs that are stored in the system tables.
These events are the problematic ones:
46m Warning FailedToUpdateEndpoint endpoints/k8ssandra-dc2-service Failed to update endpoint k8ssandra-operator/k8ssandra-dc2-service: Operation cannot be fulfilled on endpoints "k8ssandra-dc2-service": the object has been modified; please apply your changes to the latest version and try again
46m Warning FailedToUpdateEndpoint endpoints/k8ssandra-seed-service
I'm not sure what generated that situation. cass-operator is responsible for placing the seed-node labels that the seed service uses to build the seed list. Did you try to move all 3 nodes at the same time, or one by one? Are there errors in the cass-operator container logs that could help us understand what's happening here? Could you also list the pods with their labels?
I was moving the nodes one by one, which is what seems strange to me as well.
1.7079020979412708e+09 INFO reconcile_racks::startOneNodePerRack {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079020979412932e+09 INFO calling Management API cluster health - GET /api/v0/probes/cluster {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6", "pod": "k8ssandra-dc2-rack-1-sts-0"}
1.7079020979413517e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6"}
1.7079020979510846e+09 INFO calling Management API cluster health - GET /api/v0/probes/cluster {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6", "pod": "k8ssandra-dc2-rack-3-sts-0"}
1.7079020979512746e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6"}
1.7079020979582424e+09 INFO calling Management API cluster health - GET /api/v0/probes/cluster {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6", "pod": "k8ssandra-dc2-rack-5-sts-0"}
1.7079020979583006e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6"}
1.7079020979648848e+09 INFO calling Management API cluster health - GET /api/v0/probes/cluster {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6", "pod": "k8ssandra-dc2-rack-2-sts-0"}
1.70790209796494e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6"}
1.7079020979712594e+09 INFO calling Management API cluster health - GET /api/v0/probes/cluster {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6", "pod": "k8ssandra-dc2-rack-4-sts-0"}
1.7079020979714522e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6"}
1.707902097979851e+09 INFO reconcile_racks::startAllNodes {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079020979800844e+09 INFO reconcile_racks::DecommissionNodes {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079020979804478e+09 INFO starting CheckRackPodTemplate() {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079020979915037e+09 INFO waiting for upgrade to finish on statefulset {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "8b9a9a45-aabb-436e-bc4a-713899ff8cb6", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra", "statefulset": "k8ssandra-dc2-rack-1-sts", "replicas": 1, "readyReplicas": 0, "currentReplicas": 1, "updatedReplicas": 1}
1.7079020979917388e+09 INFO controllers.CassandraDatacenter Reconcile loop completed {"cassandradatacenter": "k8ssandra-operator/dc2", "requestNamespace": "k8ssandra-operator", "requestName": "dc2", "loopID": "d06b9d75-fa5c-4762-a2ab-45937c69142a", "duration": 0.129688863}
1.7079021079927156e+09 INFO controllers.CassandraDatacenter ======== handler::Reconcile has been called {"cassandradatacenter": "k8ssandra-operator/dc2", "requestNamespace": "k8ssandra-operator", "requestName": "dc2", "loopID": "7bb37329-541c-4bb5-a71c-aa43e8452723"}
1.7079021079927874e+09 INFO handler::CreateReconciliationContext {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator"}
1.7079021079930046e+09 INFO handler::calculateReconciliationActions {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021079930294e+09 INFO reconcile_services::ReconcileHeadlessServices {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021079935105e+09 INFO reconcile_endpoints::CheckAdditionalSeedEndpoints {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021079935327e+09 INFO reconcile_racks::calculateRackInformation {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.707902107993544e+09 INFO reconciliationContext::reconcileAllRacks {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021079935536e+09 INFO reconcile_racks::listPods {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021079939945e+09 INFO requesting Cassandra metadata endpoints from Node Management API {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "pod": "k8ssandra-dc2-rack-2-sts-0"}
1.7079021079940143e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451"}
1.7079021079996274e+09 INFO reconcile_racks::CheckConfigSecret {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.707902107999678e+09 INFO reconcile_racks::CheckRackCreation {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.707902107999685e+09 INFO reconcile_racks::getStatefulSetForRack {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021079998097e+09 INFO reconcile_racks::getStatefulSetForRack {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021079999137e+09 INFO reconcile_racks::getStatefulSetForRack {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080000153e+09 INFO reconcile_racks::getStatefulSetForRack {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.707902108000122e+09 INFO reconcile_racks::getStatefulSetForRack {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080002506e+09 INFO reconcile_racks::CheckRackLabels {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080005913e+09 INFO reconcile_racks::CheckDecommissioningNodes {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080006142e+09 INFO reconcile_racks::CheckSuperuserSecretCreation {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080006468e+09 INFO reconcile_racks::CheckInternodeCredentialCreation {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080007215e+09 INFO starting CheckRackForceUpgrade() {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080007484e+09 INFO reconcile_racks::CheckRackScale {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080007598e+09 INFO reconcile_racks::CheckPodsReady {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080007653e+09 INFO reconcile_racks::findStartedNotReadyNodes {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080007854e+09 INFO reconcile_racks::deleteStuckNodes {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.707902108000797e+09 INFO reconcile_racks::CheckSeedLabels {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.707902108001134e+09 INFO reconcile_racks::refreshSeeds {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080011535e+09 INFO calling Management API reload seeds - POST /api/v0/ops/seeds/reload {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "pod": "k8ssandra-dc2-rack-2-sts-0"}
1.7079021080011656e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451"}
1.7079021080132968e+09 INFO calling Management API reload seeds - POST /api/v0/ops/seeds/reload {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "pod": "k8ssandra-dc2-rack-4-sts-0"}
1.7079021080133405e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451"}
1.7079021080249355e+09 INFO calling Management API reload seeds - POST /api/v0/ops/seeds/reload {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "pod": "k8ssandra-dc2-rack-5-sts-0"}
1.7079021080249877e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451"}
1.7079021080340521e+09 INFO calling Management API reload seeds - POST /api/v0/ops/seeds/reload {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "pod": "k8ssandra-dc2-rack-1-sts-0"}
1.7079021080340915e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451"}
1.707902108049493e+09 INFO calling Management API reload seeds - POST /api/v0/ops/seeds/reload {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "pod": "k8ssandra-dc2-rack-3-sts-0"}
1.7079021080495672e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451"}
1.7079021080617049e+09 INFO reconcile_racks::findStartingNodes {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080617561e+09 INFO reconcile_racks::startOneNodePerRack {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021080617948e+09 INFO calling Management API cluster health - GET /api/v0/probes/cluster {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "pod": "k8ssandra-dc2-rack-2-sts-0"}
1.7079021080618205e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451"}
1.7079021080693684e+09 INFO calling Management API cluster health - GET /api/v0/probes/cluster {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "pod": "k8ssandra-dc2-rack-4-sts-0"}
1.7079021080694141e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451"}
1.7079021080775194e+09 INFO calling Management API cluster health - GET /api/v0/probes/cluster {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "pod": "k8ssandra-dc2-rack-5-sts-0"}
1.7079021080776033e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451"}
1.707902108083933e+09 INFO calling Management API cluster health - GET /api/v0/probes/cluster {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "pod": "k8ssandra-dc2-rack-1-sts-0"}
1.7079021080840058e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451"}
1.707902108093495e+09 INFO calling Management API cluster health - GET /api/v0/probes/cluster {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "pod": "k8ssandra-dc2-rack-3-sts-0"}
1.7079021080935755e+09 INFO client::callNodeMgmtEndpoint {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451"}
1.7079021081019838e+09 INFO reconcile_racks::startAllNodes {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021081020377e+09 INFO reconcile_racks::DecommissionNodes {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021081022608e+09 INFO starting CheckRackPodTemplate() {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra"}
1.7079021081035924e+09 INFO waiting for upgrade to finish on statefulset {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "cassandraDatacenter": {"name":"dc2","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "dc2", "reconcileID": "865777b7-a29d-4c22-9a5c-7931fd27c451", "namespace": "k8ssandra-operator", "datacenterName": "dc2", "clusterName": "k8ssandra", "statefulset": "k8ssandra-dc2-rack-1-sts", "replicas": 1, "readyReplicas": 0, "currentReplicas": 1, "updatedReplicas": 1}
1.7079021081036375e+09 INFO controllers.CassandraDatacenter Reconcile loop completed {"cassandradatacenter": "k8ssandra-operator/dc2", "requestNamespace": "k8ssandra-operator", "requestName": "dc2", "loopID": "7bb37329-541c-4bb5-a71c-aa43e8452723", "duration": 0.110950298}
I can't see any errors in the logs. I tried to grep for them, but only INFO entries are present.
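When grepping for errors turns up nothing, the JSON fields on the INFO lines can still show where reconciliation is waiting. Below is a minimal Python sketch, assuming the line layout seen in the excerpt above (timestamp, level, message, then a JSON object). The function names `parse_operator_line` and `stalled_statefulsets` are hypothetical helpers, not part of cass-operator.

```python
import json


def parse_operator_line(line):
    """Split one log line into (timestamp, level, message, fields).

    Assumes the layout "<float timestamp> <LEVEL> <message> {json}".
    Returns None for lines that do not match that layout.
    """
    try:
        ts, level, rest = line.strip().split(None, 2)
        brace = rest.find("{")
        if brace == -1:
            return float(ts), level, rest.strip(), {}
        # Duplicate keys (e.g. "namespace") are tolerated: the last one wins.
        return float(ts), level, rest[:brace].strip(), json.loads(rest[brace:])
    except ValueError:
        return None


def stalled_statefulsets(lines):
    """Return {statefulset: (readyReplicas, replicas)} where ready < desired."""
    stalled = {}
    for line in lines:
        parsed = parse_operator_line(line)
        if parsed is None:
            continue
        fields = parsed[3]
        if "statefulset" not in fields:
            continue
        ready = fields.get("readyReplicas", 0)
        desired = fields.get("replicas", 0)
        if ready < desired:
            stalled[fields["statefulset"]] = (ready, desired)
    return stalled


if __name__ == "__main__":
    import sys

    # Usage (assumed): kubectl logs <cass-operator pod> | python this_script.py
    for name, (ready, desired) in stalled_statefulsets(sys.stdin).items():
        print(f"{name}: {ready}/{desired} replicas ready")
```

On the excerpt above, the only lines carrying a `statefulset` field are the "waiting for upgrade to finish on statefulset" entries, which report `k8ssandra-dc2-rack-1-sts` with `readyReplicas` 0 out of `replicas` 1.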
This are the pods with its labels
[root@master-node ~]# kubectl get pods --show-labels -n k8ssandra-operator
NAME READY STATUS RESTARTS AGE LABELS
k8ssandra-dc2-rack-1-stargate-deployment-858d87f56f-djsxc 1/1 Running 0 104m app.kubernetes.io/component=stargate,app.kubernetes.io/created-by=stargate-controller,app.kubernetes.io/name=k8ssandra-operator,app.kubernetes.io/part-of=k8ssandra,k8ssandra.io/cluster-name=k8ssandra,k8ssandra.io/cluster-namespace=k8ssandra-operator,k8ssandra.io/stargate-deployment=k8ssandra-dc2-rack-1-stargate-deployment,k8ssandra.io/stargate=k8ssandra-dc2-stargate,pod-template-hash=858d87f56f
k8ssandra-dc2-rack-1-sts-0 2/3 CrashLoopBackOff 18 (102s ago) 110m app.kubernetes.io/created-by=cass-operator,app.kubernetes.io/instance=cassandra-k8ssandra,app.kubernetes.io/managed-by=cass-operator,app.kubernetes.io/name=cassandra,app.kubernetes.io/version=4.0.1,cassandra.datastax.com/cluster=k8ssandra,cassandra.datastax.com/datacenter=dc2,cassandra.datastax.com/node-state=Started,cassandra.datastax.com/rack=rack-1,cassandra.datastax.com/seed-node=true,controller-revision-hash=k8ssandra-dc2-rack-1-sts-6485897b4c,statefulset.kubernetes.io/pod-name=k8ssandra-dc2-rack-1-sts-0
k8ssandra-dc2-rack-2-sts-0 2/3 CrashLoopBackOff 18 (64s ago) 110m app.kubernetes.io/created-by=cass-operator,app.kubernetes.io/instance=cassandra-k8ssandra,app.kubernetes.io/managed-by=cass-operator,app.kubernetes.io/name=cassandra,app.kubernetes.io/version=4.0.1,cassandra.datastax.com/cluster=k8ssandra,cassandra.datastax.com/datacenter=dc2,cassandra.datastax.com/node-state=Started,cassandra.datastax.com/rack=rack-2,cassandra.datastax.com/seed-node=true,controller-revision-hash=k8ssandra-dc2-rack-2-sts-8dc8895d9,statefulset.kubernetes.io/pod-name=k8ssandra-dc2-rack-2-sts-0
k8ssandra-dc2-rack-3-sts-0 3/3 Running 0 110m app.kubernetes.io/created-by=cass-operator,app.kubernetes.io/instance=cassandra-k8ssandra,app.kubernetes.io/managed-by=cass-operator,app.kubernetes.io/name=cassandra,app.kubernetes.io/version=4.0.1,cassandra.datastax.com/cluster=k8ssandra,cassandra.datastax.com/datacenter=dc2,cassandra.datastax.com/node-state=Started,cassandra.datastax.com/rack=rack-3,cassandra.datastax.com/seed-node=true,controller-revision-hash=k8ssandra-dc2-rack-3-sts-65dd59cc55,statefulset.kubernetes.io/pod-name=k8ssandra-dc2-rack-3-sts-0
k8ssandra-dc2-rack-4-sts-0 2/3 CrashLoopBackOff 18 (105s ago) 110m app.kubernetes.io/created-by=cass-operator,app.kubernetes.io/instance=cassandra-k8ssandra,app.kubernetes.io/managed-by=cass-operator,app.kubernetes.io/name=cassandra,app.kubernetes.io/version=4.0.1,cassandra.datastax.com/cluster=k8ssandra,cassandra.datastax.com/datacenter=dc2,cassandra.datastax.com/node-state=Started,cassandra.datastax.com/rack=rack-4,cassandra.datastax.com/seed-node=true,controller-revision-hash=k8ssandra-dc2-rack-4-sts-78879cd87c,statefulset.kubernetes.io/pod-name=k8ssandra-dc2-rack-4-sts-0
k8ssandra-dc2-rack-5-sts-0 3/3 Running 0 110m app.kubernetes.io/created-by=cass-operator,app.kubernetes.io/instance=cassandra-k8ssandra,app.kubernetes.io/managed-by=cass-operator,app.kubernetes.io/name=cassandra,app.kubernetes.io/version=4.0.1,cassandra.datastax.com/cluster=k8ssandra,cassandra.datastax.com/datacenter=dc2,cassandra.datastax.com/node-state=Started,cassandra.datastax.com/rack=rack-5,cassandra.datastax.com/seed-node=true,controller-revision-hash=k8ssandra-dc2-rack-5-sts-d97c86bcf,statefulset.kubernetes.io/pod-name=k8ssandra-dc2-rack-5-sts-0
k8ssandra-dc2-reaper-cd7787c7d-lstzf 1/1 Running 0 103m app.kubernetes.io/component=reaper,app.kubernetes.io/created-by=reaper-controller,app.kubernetes.io/managed-by=k8ssandra-operator,app.kubernetes.io/name=k8ssandra-operator,app.kubernetes.io/part-of=k8ssandra,k8ssandra.io/cluster-name=k8ssandra,k8ssandra.io/cluster-namespace=k8ssandra-operator,k8ssandra.io/reaper=k8ssandra-dc2-reaper,pod-template-hash=cd7787c7d
k8ssandra-operator-5f7b4dfd94-f4n5h 1/1 Running 0 6d19h app.kubernetes.io/instance=k8ssandra-operator,app.kubernetes.io/managed-by=Helm,app.kubernetes.io/name=k8ssandra-operator,app.kubernetes.io/part-of=k8ssandra-k8ssandra-operator-k8ssandra-operator,control-plane=k8ssandra-operator,helm.sh/chart=k8ssandra-operator-0.38.2,pod-template-hash=5f7b4dfd94
k8ssandra-operator-cass-operator-6f7cb8ff67-b5gt5 1/1 Running 2 (6d14h ago) 6d19h app.kubernetes.io/instance=k8ssandra-operator,app.kubernetes.io/managed-by=Helm,app.kubernetes.io/name=cass-operator,app.kubernetes.io/part-of=k8ssandra-k8ssandra-operator-k8ssandra-operator,control-plane=k8ssandra-operator-controller-manager,helm.sh/chart=cass-operator-0.37.2,pod-template-hash=6f7cb8ff67
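To make the seed-node labels easier to check against pod status, here is a minimal sketch that parses rows of `kubectl get pods --show-labels` and lists the pods labeled `cassandra.datastax.com/seed-node=true`. The helper names (`parse_pod_line`, `seed_pods`) are hypothetical, and the sample rows are abbreviated from the listing above with most labels dropped.

```python
def parse_pod_line(line):
    """Parse one data row of `kubectl get pods --show-labels` output."""
    parts = line.split()
    # NAME, READY and STATUS are the first three columns. LABELS is the last
    # whitespace-separated token, because label values cannot contain spaces
    # (the RESTARTS column can, e.g. "18 (102s ago)", so we do not index it).
    name, ready, status = parts[0], parts[1], parts[2]
    labels = {}
    for pair in parts[-1].split(","):
        key, _, value = pair.partition("=")
        labels[key] = value
    return {"name": name, "ready": ready, "status": status, "labels": labels}


def seed_pods(lines, seed_label="cassandra.datastax.com/seed-node"):
    """Return (name, ready, status) for every pod labeled as a seed node."""
    pods = [parse_pod_line(line) for line in lines if line.strip()]
    return [
        (p["name"], p["ready"], p["status"])
        for p in pods
        if p["labels"].get(seed_label) == "true"
    ]


# Rows abbreviated from the listing above (most labels dropped for brevity).
sample = [
    "k8ssandra-dc2-rack-1-sts-0 2/3 CrashLoopBackOff 18 (102s ago) 110m "
    "cassandra.datastax.com/rack=rack-1,cassandra.datastax.com/seed-node=true",
    "k8ssandra-dc2-rack-3-sts-0 3/3 Running 0 110m "
    "cassandra.datastax.com/rack=rack-3,cassandra.datastax.com/seed-node=true",
    "k8ssandra-dc2-reaper-cd7787c7d-lstzf 1/1 Running 0 103m "
    "app.kubernetes.io/component=reaper,pod-template-hash=cd7787c7d",
]

for name, ready, status in seed_pods(sample):
    print(name, ready, status)
```

In the full listing, all five Cassandra pods carry the seed-node label, including the three that are in CrashLoopBackOff.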
Is there any update on this?
Sorry for the delayed answer, but from looking at the logs, it seems it's actually a Kubernetes issue that's causing the problem. The Warning FailedToUpdateEndpoint events you posted are coming from Kubernetes' endpoint_controller.
Thus, it does not seem k8ssandra-operator can do much about it. That seed service's IP list is updated by Kubernetes (we only provide the label selector for it), but I'm not sure why it would fail to update those lists. k8ssandra-operator updates and maintains the endpoints for additional-seeds, but that does not seem to be the one that is failing.
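As a rough illustration of that division of labor, the sketch below mimics how an address list is derived from a label selector: pods whose labels match the selector are included, with ready pods under `addresses` and the rest under `notReadyAddresses`. This is a simplified model, not the endpoint controller's code. The selector contents, the pod IPs, and the function names are assumptions made for the example, and options such as `publishNotReadyAddresses` are ignored.

```python
# Assumed selector for the seed service; the real one is set by the operator.
SEED_SELECTOR = {
    "cassandra.datastax.com/cluster": "k8ssandra",
    "cassandra.datastax.com/seed-node": "true",
}


def matches(selector, labels):
    """A pod matches an equality-based selector if every pair is in its labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def build_endpoints(selector, pods):
    """Split the IPs of matching pods into ready and not-ready lists."""
    ready, not_ready = [], []
    for pod in pods:
        if not matches(selector, pod["labels"]):
            continue
        (ready if pod["ready"] else not_ready).append(pod["ip"])
    return {"addresses": ready, "notReadyAddresses": not_ready}


# Hypothetical pod IPs; the thread does not include the real ones.
pods = [
    {"name": "k8ssandra-dc2-rack-1-sts-0", "ip": "10.0.0.11", "ready": False,
     "labels": dict(SEED_SELECTOR)},
    {"name": "k8ssandra-dc2-rack-3-sts-0", "ip": "10.0.0.13", "ready": True,
     "labels": dict(SEED_SELECTOR)},
    {"name": "k8ssandra-dc2-reaper-cd7787c7d-lstzf", "ip": "10.0.0.20", "ready": True,
     "labels": {"app.kubernetes.io/component": "reaper"}},
]

endpoints = build_endpoints(SEED_SELECTOR, pods)
print(endpoints)
```

The point of the model is that the operator only influences the result through labels; the address list itself is recomputed by Kubernetes whenever pods change.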
What did you do?
I have a deployment of k8ssandra across 5 Kubernetes nodes; 3 of them were on CentOS 7 and 2 were on Ubuntu 20.04. The plan was to migrate all of the k8ssandra nodes from CentOS 7 to new Kubernetes nodes on Ubuntu 20.04. The persistent volumes for Cassandra are created using the local-path-provisioner from Rancher. Because my Kubernetes nodes are virtual machines, I moved the HDD device from one VM to another on the virtualization infrastructure, then rescheduled the K8ssandra pod onto that node, and repeated the same steps for all of the nodes.
I changed the node affinity in the PV to match the new node, and changed the PVC label to match the new node as well. The PVCs were successfully bound, but the pods are in the CrashLoopBackOff state. I can see that URGENT MESSAGES on port 7000 are being sent to the wrong IP address.
I tried restarting the k8ssandra cluster by setting stopped: true and then stopped: false in the k8ssandraclusters.k8ssandra.io CRD, but the issue is the same.
kubectl get events -n k8ssandra-operator
Here is the repeating log
Here are the pods with wide output; the IP address from the log above does not appear on any of the actual pods.
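The comparison described here can be done mechanically. The following is a minimal sketch that reports which peer addresses are not owned by any current pod. All IP addresses are hypothetical placeholders, and `stale_peer_ips` is an illustrative helper, not part of any k8ssandra tooling.

```python
import ipaddress


def stale_peer_ips(peer_ips, pod_ips):
    """Return peer addresses that no current pod owns, sorted numerically."""
    stale = set(peer_ips) - set(pod_ips)
    return sorted(stale, key=ipaddress.ip_address)


# Hypothetical values: addresses a node keeps trying to reach (taken from its
# log) versus the IP column of `kubectl get pods -o wide`.
peers_seen_in_log = ["10.0.1.7", "10.0.0.13", "10.0.1.21"]
current_pod_ips = ["10.0.0.11", "10.0.0.13", "10.0.0.14"]

print(stale_peer_ips(peers_seen_in_log, current_pod_ips))
```

Any address this returns is one that a node still remembers but that no longer belongs to a running pod.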
Environment
Kubernetes version: v1.22.3
OS: Ubuntu 20.04.2 LTS
Docker version: docker://20.10.24
K8ssandra Operator version:
docker.io/k8ssandra/k8ssandra-operator:v1.2.1
* Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:38:50Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T17:57:25Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
Kubernetes cluster kind:
Please let me know if I can provide additional information. Thank you
┆Issue is synchronized with this Jira Story by Unito ┆Issue Number: K8OP-45