Orange-OpenSource / casskop

This Kubernetes operator automates the Cassandra operations such as deploying a new rack aware cluster, adding/removing nodes, configuring the C* and JVM parameters, upgrading JVM and C* versions, and many more...
https://orange-opensource.github.io/casskop/
Apache License 2.0

Fix operator & pods restart when scaling up DC with autoUpdateSeedList set to true #340

Closed PERES-Richard closed 3 years ago

PERES-Richard commented 3 years ago

Signed-off-by: Richard Peres <richard.peres@orange.com>

| Q | A |
| --- | --- |
| Bug fix | [x] |
| License | Apache 2.0 |

What's in this PR?

A small fix for a bug that appears when scaling up a CassandraCluster by adding a new DC. When autoUpdateSeedList is set to true in the cluster config, the bug causes the pods of the other racks, as well as the operator's pod, to restart.


PERES-Richard commented 3 years ago

Maybe we should add an AssertCommand test step to ensure that neither the operator nor the other pods restarted, with something like [[ $(kubectl get pods -o json | jq '[.items[].status.containerStatuses[].restartCount] | add') -eq 0 ]]
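The same check can be sketched in Go, in case a test helper is preferred over a shell one-liner. This is only an illustration: `podList` and `totalRestarts` are hypothetical names, not part of CassKop, and the struct mirrors only the fields of `kubectl get pods -o json` that the check needs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podList is a hypothetical, minimal view of `kubectl get pods -o json`.
// Only the restart counters are decoded; every other field is ignored.
type podList struct {
	Items []struct {
		Status struct {
			ContainerStatuses []struct {
				RestartCount int `json:"restartCount"`
			} `json:"containerStatuses"`
		} `json:"status"`
	} `json:"items"`
}

// totalRestarts sums restartCount over every container of every pod.
func totalRestarts(raw []byte) (int, error) {
	var pods podList
	if err := json.Unmarshal(raw, &pods); err != nil {
		return 0, err
	}
	total := 0
	for _, p := range pods.Items {
		for _, c := range p.Status.ContainerStatuses {
			total += c.RestartCount
		}
	}
	return total, nil
}

func main() {
	// Sample input made up for the example: one healthy pod, one restarted twice.
	sample := []byte(`{"items":[
		{"status":{"containerStatuses":[{"restartCount":0}]}},
		{"status":{"containerStatuses":[{"restartCount":2}]}}
	]}`)
	n, err := totalRestarts(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("total restarts:", n)
}
```

A test step would assert that the total is 0 after the scale-up has completed.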

PERES-Richard commented 3 years ago

k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    casskop/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x82
panic(0x15d9800, 0x25a2050)
    /usr/local/go/src/runtime/panic.go:969 +0x166
github.com/Orange-OpenSource/casskop/pkg/controller/cassandracluster.FlipCassandraClusterUpdateSeedListStatus(0xc000100700, 0xc0002b8960)
    casskop/pkg/controller/cassandracluster/reconcile.go:642 +0x192
github.com/Orange-OpenSource/casskop/pkg/controller/cassandracluster.(*ReconcileCassandraCluster).Reconcile(0xc000328c90, 0xc00041e080, 0x7, 0xc00041e070, 0xd, 0x0, 0x0, 0x0, 0x0)
    casskop/pkg/controller/cassandracluster/cassandracluster_controller.go:150 +0x46c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0001b29c0, 0x1637880, 0xc0006ffda0, 0x1733c00)
    casskop/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256 +0x161
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0001b29c0, 0x203000)
    casskop/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232 +0xae
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0001b29c0)
    casskop/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211 +0x2b
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000974140)
    casskop/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000974140, 0x19df000, 0xc00096a240, 0xc00011a001, 0xc000934180)
    casskop/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xa3
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000974140, 0x3b9aca00, 0x0, 0x185a101, 0xc000934180)
    casskop/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0xe2
k8s.io/apimachinery/pkg/util/wait.Until(0xc000974140, 0x3b9aca00, 0xc000934180)
    casskop/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
    casskop/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:193 +0x305
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x14123d2]
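The trace only tells us that FlipCassandraClusterUpdateSeedListStatus dereferenced a nil pointer at reconcile.go:642; it does not show which value was nil. As a general illustration of this class of panic, and not a reproduction of the actual CassKop code, here is a minimal sketch in which a status entry for a newly added rack does not exist yet. All type and function names below (`clusterStatus`, `rackStatus`, `setPhaseUnsafe`, `setPhaseSafe`) are hypothetical.

```go
package main

import "fmt"

// rackStatus and clusterStatus are hypothetical stand-ins,
// not CassKop's real status types.
type rackStatus struct {
	Phase string
}

type clusterStatus struct {
	// Racks maps a "dc-rack" name to its status. In this sketch we assume
	// a DC that was just added by a scale-up has no entry yet.
	Racks map[string]*rackStatus
}

// setPhaseUnsafe dereferences the map lookup directly. For an unknown
// rack the lookup yields a nil pointer and the assignment panics.
func setPhaseUnsafe(s *clusterStatus, name, phase string) {
	s.Racks[name].Phase = phase
}

// setPhaseSafe skips racks that have no status entry and reports
// whether an update actually happened.
func setPhaseSafe(s *clusterStatus, name, phase string) bool {
	if s == nil {
		return false
	}
	rs, ok := s.Racks[name]
	if !ok || rs == nil {
		return false
	}
	rs.Phase = phase
	return true
}

// panics reports whether calling f triggers a panic.
func panics(f func()) (recovered bool) {
	defer func() {
		if r := recover(); r != nil {
			recovered = true
		}
	}()
	f()
	return false
}

func main() {
	status := &clusterStatus{Racks: map[string]*rackStatus{
		"dc1-rack1": {Phase: "Running"},
	}}
	fmt.Println(setPhaseSafe(status, "dc1-rack1", "Pending")) // known rack is updated
	fmt.Println(setPhaseSafe(status, "dc2-rack1", "Pending")) // unknown rack is skipped
	fmt.Println(panics(func() { setPhaseUnsafe(status, "dc2-rack1", "Pending") }))
}
```

Inside a controller, an unguarded dereference like the one in `setPhaseUnsafe` would take down the whole reconcile loop, which is consistent with the operator pod restarting as described in this PR.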

cscetbon commented 3 years ago

@PERES-Richard please test and review this PR