mindwm / mindwm-gitops


healthcheck timeout too low for redpanda #62

Closed omgbebebe closed 2 months ago

omgbebebe commented 2 months ago
redpanda@neo4j-cdc-0:/$ time rpk cluster health
CLUSTER HEALTH OVERVIEW
=======================
Healthy:                          true
Unhealthy reasons:                []
Controller ID:                    0
All nodes:                        [0]
Nodes down:                       []
Leaderless partitions (0):        []
Under-replicated partitions (0):  []

real    0m5.929s
user    0m0.160s
sys     0m0.102s

The current probe settings are delay=1s, timeout=1s, period=10s. Since rpk cluster health takes almost 6 seconds here (see the timing above), the timeout needs to be increased to about 10 seconds.
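To compare a check's real latency against a probe timeout, the command can be run under a deadline and timed. This is a minimal Python sketch, not part of rpk or Kubernetes: the helper name run_check is hypothetical, and it assumes the command to test is available on the PATH of the container where it runs.

```python
import subprocess
import time


def run_check(command, timeout_seconds):
    """Run a health-check command under a deadline.

    Returns (ok, elapsed_seconds). ok is False when the command exits
    non-zero or does not finish within timeout_seconds.
    run_check is a hypothetical helper used only for illustration.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            timeout=timeout_seconds,
        )
        ok = result.returncode == 0
    except subprocess.TimeoutExpired:
        # The child process is killed once the deadline passes.
        ok = False
    elapsed = time.monotonic() - start
    return ok, elapsed
```

Calling run_check(["rpk", "cluster", "health"], 1.0) on the node above would be expected to report a failure, because the command took about 5.9 seconds there. Raising the deadline to 10.0 leaves headroom for it to finish.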

metacoma commented 2 months ago

Here is the livenessProbe definition from the Redpanda cluster StatefulSet:

      livenessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - curl --silent --fail -k -m 5  "http://${SERVICE_NAME}.neo4j-cdc.redpanda.svc.cluster.local.:9644/v1/status/ready"
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
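The effect of these settings can be modelled in a few lines. This is a simplified Python sketch, assuming the kubelet enforces timeoutSeconds for exec probes and restarts the container after failureThreshold consecutive failures. The Probe class and both functions are hypothetical names for illustration. The 5.929 s figure is the rpk cluster health timing reported earlier in this thread, used here only as an example check duration; the probe itself runs curl.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    # Field names mirror the Kubernetes probe fields shown above.
    initial_delay_seconds: int
    period_seconds: int
    timeout_seconds: int
    failure_threshold: int


def probe_succeeds(probe, check_duration_seconds):
    """A check that runs longer than timeoutSeconds counts as a failure."""
    return check_duration_seconds <= probe.timeout_seconds


def would_restart(probe, check_durations):
    """True if the sequence contains failureThreshold consecutive failures.

    Simplified model: a single success resets the failure counter
    (successThreshold is 1 for liveness probes).
    """
    consecutive_failures = 0
    for duration in check_durations:
        if probe_succeeds(probe, duration):
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= probe.failure_threshold:
                return True
    return False


# Values from the StatefulSet above.
current = Probe(initial_delay_seconds=10, period_seconds=10,
                timeout_seconds=1, failure_threshold=3)
# Hypothetical variant with the timeout suggested in this issue.
proposed = Probe(initial_delay_seconds=10, period_seconds=10,
                 timeout_seconds=10, failure_threshold=3)
```

Under this model, three checks in a row that each take 5.929 s would trigger a restart with the current settings, but not with a 10 second timeout.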

The Redpanda cluster is managed by the Redpanda operator. Currently, the Redpanda cluster CRD does not support configuring the livenessProbe timeoutSeconds parameter.

For more details, see:
https://github.com/redpanda-data/redpanda-operator/blob/3ec4cf0eefb36694da06d1a70a23ee60c0495424/src/go/k8s/api/redpanda/v1alpha2/redpanda_clusterspec_types.go#L671
https://docs.redpanda.com/current/reference/k-crd/

metacoma commented 2 months ago

for tracking: https://github.com/redpanda-data/redpanda-operator/issues/185

metacoma commented 2 months ago

closed by #85