percona / percona-helm-charts

Collection of Helm charts for Percona Kubernetes Operators.
https://www.percona.com/software/percona-kubernetes-operators

psmdb-operator crashes when psmdb-db is deployed #338

Open Nickmman opened 4 weeks ago

Nickmman commented 4 weeks ago

I'm using both the psmdb-operator and psmdb-db Helm charts. I first deployed the operator on its own (without the db chart), and it ran fine with no errors or crashes.

However, now that I have deployed the db, the operator enters a crash loop. Whenever the operator starts crash-looping, all of the psmdb-db pods are restarted as well.

Logs from the operator:

2024-06-12T21:42:34.549Z        INFO    setup   Manager starting up     {"gitCommit": "54e1b18dd9dac8e0ed5929bb2c91318cd6829a48", "gitBranch": "release-1-16-0", "goVersion": "go1.22.3", "os": "linux", "arch": "amd64"}
2024-06-12T21:42:34.565Z        INFO    server version  {"platform": "kubernetes", "version": "v1.28.7+k3s1"}
2024-06-12T21:42:34.570Z        INFO    controller-runtime.metrics      Starting metrics server
2024-06-12T21:42:34.570Z        INFO    starting server {"name": "health probe", "addr": "[::]:8081"}
2024-06-12T21:42:34.570Z        INFO    controller-runtime.metrics      Serving metrics server  {"bindAddress": ":8080", "secure": false}
I0612 21:42:34.570960       1 leaderelection.go:250] attempting to acquire leader lease mongodb/08db0feb.percona.com...
I0612 21:42:53.320941       1 leaderelection.go:260] successfully acquired lease mongodb/08db0feb.percona.com
2024-06-12T21:42:53.321Z        INFO    Starting EventSource    {"controller": "psmdb-controller", "source": "kind source: *v1.PerconaServerMongoDB"}
2024-06-12T21:42:53.321Z        INFO    Starting Controller     {"controller": "psmdb-controller"}
2024-06-12T21:42:53.321Z        INFO    Starting EventSource    {"controller": "psmdbrestore-controller", "source": "kind source: *v1.PerconaServerMongoDBRestore"}
2024-06-12T21:42:53.321Z        INFO    Starting EventSource    {"controller": "psmdbbackup-controller", "source": "kind source: *v1.PerconaServerMongoDBBackup"}
2024-06-12T21:42:53.321Z        INFO    Starting EventSource    {"controller": "psmdbrestore-controller", "source": "kind source: *v1.Pod"}
2024-06-12T21:42:53.321Z        INFO    Starting Controller     {"controller": "psmdbrestore-controller"}
2024-06-12T21:42:53.321Z        INFO    Starting EventSource    {"controller": "psmdbbackup-controller", "source": "kind source: *v1.Pod"}
2024-06-12T21:42:53.321Z        INFO    Starting Controller     {"controller": "psmdbbackup-controller"}
2024-06-12T21:42:53.444Z        INFO    Starting workers        {"controller": "psmdbbackup-controller", "worker count": 1}
2024-06-12T21:42:53.445Z        INFO    Starting workers        {"controller": "psmdb-controller", "worker count": 1}
2024-06-12T21:42:53.445Z        INFO    Starting workers        {"controller": "psmdbrestore-controller", "worker count": 1}
E0612 21:42:53.685207       1 runtime.go:79] Observed a panic: "assignment to entry in nil map" (assignment to entry in nil map)
goroutine 313 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1f11320, 0x298b1f0})
        /go/pkg/mod/k8s.io/apimachinery@v0.30.0/pkg/util/runtime/runtime.go:75 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000802fc0?})
        /go/pkg/mod/k8s.io/apimachinery@v0.30.0/pkg/util/runtime/runtime.go:49 +0x6b
panic({0x1f11320?, 0x298b1f0?})
        /usr/local/go/src/runtime/panic.go:770 +0x132
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).setUpdateMongosFirst.func1()
        /go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/smart.go:226 +0xd0
k8s.io/client-go/util/retry.OnError.func1()
        /go/pkg/mod/k8s.io/client-go@v0.30.0/util/retry/util.go:51 +0x30
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection(0x411b9b?)
        /go/pkg/mod/k8s.io/apimachinery@v0.30.0/pkg/util/wait/wait.go:145 +0x3e
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff({0x989680, 0x4014000000000000, 0x3fb999999999999a, 0x4, 0x0}, 0xc000baaa18)
        /go/pkg/mod/k8s.io/apimachinery@v0.30.0/pkg/util/wait/backoff.go:461 +0x5a
k8s.io/client-go/util/retry.OnError({0x989680, 0x4014000000000000, 0x3fb999999999999a, 0x4, 0x0}, 0x4171ba?, 0x0?)
        /go/pkg/mod/k8s.io/client-go@v0.30.0/util/retry/util.go:50 +0xa5
k8s.io/client-go/util/retry.RetryOnConflict(...)
        /go/pkg/mod/k8s.io/client-go@v0.30.0/util/retry/util.go:104
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).setUpdateMongosFirst(0x1ef45e0?, {0x29affe0?, 0xc0011ad140?}, 0x6?)
        /go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/smart.go:220 +0xbc
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).createSSLByCertManager(0xc000b882d0, {0x29affe0, 0xc0011ad140}, 0xc000dcaf08)
        /go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/ssl.go:187 +0x794
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileSSL(0xc000b882d0, {0x29affe0, 0xc0011ad140}, 0xc000dcaf08)
        /go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/ssl.go:66 +0x30d
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile(0xc000b882d0, {0x29affe0, 0xc0011ad140}, {{{0xc0006dade8?, 0x5?}, {0xc0006dade0?, 0xc000d25d10?}}})
        /go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:368 +0x16d0
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x29b4dc8?, {0x29affe0?, 0xc0011ad140?}, {{{0xc0006dade8?, 0xb?}, {0xc0006dade0?, 0x0?}}})
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:114 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000b820b0, {0x29b0018, 0xc0009c03c0}, {0x1fdf1a0, 0xc000dd27a0})
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:311 +0x3bc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000b820b0, {0x29b0018, 0xc0009c03c0})
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:261 +0x1be
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:222 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 141
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:218 +0x486
2024-06-12T21:42:53.730Z        INFO    Observed a panic in reconciler: assignment to entry in nil map  {"controller": "psmdb-controller", "object": {"name":"psmdb-db","namespace":"mongodb"}, "namespace": "mongodb", "name": "psmdb-db", "reconcileID": "7676acba-b62f-4d00-a4dc-51c0e17bc27c"}
panic: assignment to entry in nil map [recovered]
        panic: assignment to entry in nil map [recovered]
        panic: assignment to entry in nil map

goroutine 313 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:111 +0x1e5
panic({0x1f11320?, 0x298b1f0?})
        /usr/local/go/src/runtime/panic.go:770 +0x132
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000802fc0?})
        /go/pkg/mod/k8s.io/apimachinery@v0.30.0/pkg/util/runtime/runtime.go:56 +0xcd
panic({0x1f11320?, 0x298b1f0?})
        /usr/local/go/src/runtime/panic.go:770 +0x132
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).setUpdateMongosFirst.func1()
        /go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/smart.go:226 +0xd0
k8s.io/client-go/util/retry.OnError.func1()
        /go/pkg/mod/k8s.io/client-go@v0.30.0/util/retry/util.go:51 +0x30
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection(0x411b9b?)
        /go/pkg/mod/k8s.io/apimachinery@v0.30.0/pkg/util/wait/wait.go:145 +0x3e
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff({0x989680, 0x4014000000000000, 0x3fb999999999999a, 0x4, 0x0}, 0xc000efea18)
        /go/pkg/mod/k8s.io/apimachinery@v0.30.0/pkg/util/wait/backoff.go:461 +0x5a
k8s.io/client-go/util/retry.OnError({0x989680, 0x4014000000000000, 0x3fb999999999999a, 0x4, 0x0}, 0x4171ba?, 0x0?)
        /go/pkg/mod/k8s.io/client-go@v0.30.0/util/retry/util.go:50 +0xa5
k8s.io/client-go/util/retry.RetryOnConflict(...)
        /go/pkg/mod/k8s.io/client-go@v0.30.0/util/retry/util.go:104
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).setUpdateMongosFirst(0x1ef45e0?, {0x29affe0?, 0xc0011ad140?}, 0x6?)
        /go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/smart.go:220 +0xbc
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).createSSLByCertManager(0xc000b882d0, {0x29affe0, 0xc0011ad140}, 0xc000dcaf08)
        /go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/ssl.go:187 +0x794
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileSSL(0xc000b882d0, {0x29affe0, 0xc0011ad140}, 0xc000dcaf08)
        /go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/ssl.go:66 +0x30d
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile(0xc000b882d0, {0x29affe0, 0xc0011ad140}, {{{0xc0006dade8?, 0x5?}, {0xc0006dade0?, 0xc000d25d10?}}})
        /go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:368 +0x16d0
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x29b4dc8?, {0x29affe0?, 0xc0011ad140?}, {{{0xc0006dade8?, 0xb?}, {0xc0006dade0?, 0x0?}}})
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:114 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000b820b0, {0x29b0018, 0xc0009c03c0}, {0x1fdf1a0, 0xc000dd27a0})
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:311 +0x3bc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000b820b0, {0x29b0018, 0xc0009c03c0})
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:261 +0x1be
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:222 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 141
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:218 +0x486
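
For readers unfamiliar with the message: "assignment to entry in nil map" is the Go runtime's standard panic for writing a key into a map that was declared but never initialized. The trace only shows that this happens inside setUpdateMongosFirst (smart.go:226), not which map is nil. The sketch below is not the operator's code. The `object` type, the `setAnnotation` and `unsafeSet` helpers, and the `example/key` key are hypothetical and only illustrate the failure mode and the usual guard.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a resource whose map field
// is left nil when nothing has been set on it.
type object struct {
	Annotations map[string]string
}

// setAnnotation initializes the map on first use, so the write cannot panic.
func setAnnotation(o *object, key, value string) {
	if o.Annotations == nil {
		o.Annotations = make(map[string]string)
	}
	o.Annotations[key] = value
}

// unsafeSet writes without a nil check and converts the resulting
// runtime panic into an error so that it can be inspected.
func unsafeSet(o *object, key, value string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	o.Annotations[key] = value
	return nil
}

func main() {
	o := &object{} // Annotations is nil here

	// Reading from a nil map is fine; writing to it panics.
	fmt.Println(unsafeSet(o, "example/key", "true"))

	// With the guard, the same write succeeds.
	setAnnotation(o, "example/key", "true")
	fmt.Println(o.Annotations["example/key"])
}
```

Running this prints the same message as in the log, followed by `true`. If the nil map in the operator is one populated from the custom resource, that would be consistent with the crash appearing only once psmdb-db is deployed, but that is an assumption rather than something the trace confirms.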

I'm using version 1.16.1 in both charts. My values files are as follows:

perconaMongodb:
  enabled: true
  version: 1.16.1
  values:
    backup:
      enabled: false
      pitr:
        enabled: true
      storages:
        gcp:
          type: s3
          s3:
            credentialsSecret: gcp-backup-credentials
            bucket: redacted
            region: us
            prefix: dev-onprem/mongodb
            endpointUrl: https://storage.googleapis.com
      tasks:
      - name: daily-gcp-us
        enabled: true
        schedule: "0 0 * * *"
        keep: 3
        storageName: gcp
        compressionType: gzip
    pmm:
      enabled: true
    replsets:
      rs0:
        volumeSpec:
          pvc:
            storageClassName: ceph-block
            resources:
              requests:
                storage: 10Gi
      rs1:
        resources:
          limits:
            cpu: "300m"
            memory: "0.5G"
          requests:
            cpu: "300m"
            memory: "0.5G"
        size: 3
        volumeSpec:
          pvc:
            storageClassName: ceph-block
            resources:
              requests:
                storage: 10Gi
    secrets:
      users: percona-mongodb-credentials
    sharding:
      configrs:
        volumeSpec:
          pvc:
            storageClassName: ceph-block
            resources:
              requests:
                storage: 10Gi
    tls:
      issuerConf:
        name: redacted
        kind: ClusterIssuer
perconaMongodbOperator:
  enabled: true
  version: 1.16.1
  values:
    watchNamespace: "mongodb"

Everything else uses the default values.

Nickmman commented 4 weeks ago

The same issue has also been opened at https://github.com/percona/percona-server-mongodb-operator/issues/1571