bitnami / containers

Bitnami container images
https://bitnami.com

[bitnami/pgpool] Pgpool pod in crashloopbackoff after running for a day #61568

Closed ranjanprasad1996 closed 7 months ago

ranjanprasad1996 commented 8 months ago

Name and Version

bitnami/pgpool:4.5.0-debian-11-r8

What architecture are you using?

None

What steps will reproduce the bug?

Setup:

  1. 1 replica pgpool with 1 replica postgresql
  2. Runs fine for about a day, then fails with the error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: container init was OOM-killed (memory limit too low?): unknown

Memory usage is around 150 MB, while the container memory limit is 500Mi and node memory utilization is around 20%.

What is the expected behavior?

No response

What do you see instead?

kubectl describe pod pgha-pgpool-cf54985bb-lbxns
Name:             pgha-pgpool-cf54985bb-lbxns
Namespace:        akridata
Priority:         0
Service Account:  default
Node:             ip-172-31-22-150.us-west-1.compute.internal/172.31.22.150
Start Time:       Thu, 15 Feb 2024 14:49:50 +0530
Labels:           app.kubernetes.io/component=pgpool
                  app.kubernetes.io/instance=postgresql-ha
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=postgresql-ha
                  helm.sh/chart=postgresql-ha-9.4.11
                  pod-template-hash=cf54985bb
                  service=pgpool
Annotations:      kubectl.kubernetes.io/restartedAt: 2024-02-08T22:40:52+05:30
Status:           Running
IP:               172.31.29.198
IPs:
  IP:           172.31.29.198
Controlled By:  ReplicaSet/pgha-pgpool-cf54985bb
Containers:
  pgpool:
    Container ID:   containerd://c60a94b7c6941da5c7386a8ba7394996c32ecda0cce34ff71fe87a0ebb4e4b74
    Image:          docker.io/bitnami/pgpool:4.5.0-debian-11-r8
    Image ID:       docker.io/bitnami/pgpool@sha256:23c5a3267561ec57af19759f5eb2d47affbd77f25d931bc4e777dcb73cd145ce
    Port:           5432/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       StartError
      Message:      failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: container init was OOM-killed (memory limit too low?): unknown
      Exit Code:    128
      Started:      Thu, 01 Jan 1970 05:30:00 +0530
      Finished:     Fri, 16 Feb 2024 10:09:32 +0530
    Ready:          False
    Restart Count:  190
    Limits:
      cpu:     1
      memory:  500Mi
    Requests:
      cpu:      1
      memory:   300Mi
    Liveness:   exec [/opt/bitnami/scripts/pgpool/healthcheck.sh] delay=30s timeout=10s period=20s #success=1 #failure=5
    Readiness:  exec [bash -ec PGPASSWORD=${PGPOOL_POSTGRES_PASSWORD} psql -U "admin" -d "custom" -h /opt/bitnami/pgpool/tmp -tA -c "SELECT 1" >/dev/null] delay=5s timeout=10s period=20s #success=1 #failure=5
    Environment:
      BITNAMI_DEBUG:                                       false
      PGPOOL_BACKEND_NODES:                                0:pgha-postgresql-0.pgha-postgresql-headless:5432,
      PGPOOL_SR_CHECK_USER:                                repmgr
      PGPOOL_SR_CHECK_PASSWORD:                            <set to the key 'repmgr-password' in secret 'pgha-postgresql'>  Optional: false
      PGPOOL_SR_CHECK_DATABASE:                            postgres
      PGPOOL_ENABLE_LDAP:                                  no
      PGPOOL_POSTGRES_USERNAME:                            admin
      PGPOOL_POSTGRES_PASSWORD:                            <set to the key 'postgresql-password' in secret 'pgha-postgresql'>  Optional: false
      PGPOOL_ADMIN_USERNAME:                               pgpool
      PGPOOL_ADMIN_PASSWORD:                               <set to the key 'admin-password' in secret 'pgha-pgpool'>  Optional: false
      PGPOOL_AUTHENTICATION_METHOD:                        scram-sha-256
      PGPOOL_ENABLE_LOAD_BALANCING:                        yes
      PGPOOL_DISABLE_LOAD_BALANCE_ON_WRITE:                transaction
      PGPOOL_ENABLE_LOG_CONNECTIONS:                       no
      PGPOOL_ENABLE_LOG_HOSTNAME:                          yes
      PGPOOL_ENABLE_LOG_PER_NODE_STATEMENT:                no
      PGPOOL_NUM_INIT_CHILDREN:                            3
      PGPOOL_MAX_POOL:                                     20
      PGPOOL_CHILD_MAX_CONNECTIONS:                        100
      PGPOOL_CHILD_LIFE_TIME:
      PGPOOL_ENABLE_TLS:                                   no
      NEW_RELIC_METADATA_KUBERNETES_CLUSTER_NAME:          k8s-cluster
      NEW_RELIC_METADATA_KUBERNETES_NODE_NAME:              (v1:spec.nodeName)
      NEW_RELIC_METADATA_KUBERNETES_NAMESPACE_NAME:        akridata (v1:metadata.namespace)
      NEW_RELIC_METADATA_KUBERNETES_POD_NAME:              pgha-pgpool-cf54985bb-lbxns (v1:metadata.name)
      NEW_RELIC_METADATA_KUBERNETES_CONTAINER_NAME:        pgpool
      NEW_RELIC_METADATA_KUBERNETES_CONTAINER_IMAGE_NAME:  docker.io/bitnami/pgpool:4.5.0-debian-11-r8
      NEW_RELIC_METADATA_KUBERNETES_DEPLOYMENT_NAME:       pgha-pgpool
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cxcdq (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-cxcdq:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Warning  BackOff  4m31s (x4441 over 15h)  kubelet  Back-off restarting failed container pgpool in pod pgha-pgpool-cf54985bb-lbxns_akridata(f62400e2-2152-43eb-a78a-941896cff390)
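
One detail in the output above is easy to miss: the "Started" timestamp of the last terminated state is 01 Jan 1970 05:30:00 +0530, which is the Unix epoch rendered in the +0530 offset. The sketch below only verifies that arithmetic. Reading it as "the container process never started" is an interpretation that fits the StartError reason, not something stated in the thread. The function name and format string are illustrative.

```python
from datetime import datetime, timezone

# Timestamps copied verbatim from the "Last State" block of the describe output.
STARTED = "Thu, 01 Jan 1970 05:30:00 +0530"
FINISHED = "Fri, 16 Feb 2024 10:09:32 +0530"

# Format used by `kubectl describe` in this output (RFC 1123 style with numeric offset).
DESCRIBE_TIME_FORMAT = "%a, %d %b %Y %H:%M:%S %z"


def is_unix_epoch(timestamp: str) -> bool:
    """Return True if the timestamp is exactly 1970-01-01T00:00:00Z."""
    parsed = datetime.strptime(timestamp, DESCRIBE_TIME_FORMAT)
    # Aware datetimes compare by absolute instant, so the +0530 offset is handled.
    return parsed == datetime.fromtimestamp(0, tz=timezone.utc)


if __name__ == "__main__":
    print("Started is epoch zero:", is_unix_epoch(STARTED))
    print("Finished is epoch zero:", is_unix_epoch(FINISHED))
```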

Additional information

No response

javsalgar commented 8 months ago

Hi.

What load is the server under? Maybe there was a spike that caused the restart.

ranjanprasad1996 commented 8 months ago

@javsalgar There was no load on the machine, and no memory peak was observed. Even if a peak had caused a restart, the pod should have recovered afterwards, but it never recovers until I manually delete it. It stays stuck indefinitely with this error:

      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       StartError
      Message:      failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: container init was OOM-killed (memory limit too low?): unknown
      Exit Code:    128

andresbono commented 8 months ago

Can you try to double the assigned resources to see if it makes any difference? Just for debugging purposes.
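
As a rough sketch of what doubling the resources would mean here, the snippet below takes the limits and requests shown in the `kubectl describe pod` output and doubles each value. The `pgpool.resources` prefix is an assumption about where the postgresql-ha chart exposes these settings; check it against the values of the chart version in use (9.4.11 in the pod labels) before applying anything. The helper names are hypothetical, and only plain integer quantities with an optional unit suffix are handled.

```python
import re

# Limits and requests copied from the `kubectl describe pod` output in this issue.
CURRENT = {
    "limits": {"cpu": "1", "memory": "500Mi"},
    "requests": {"cpu": "1", "memory": "300Mi"},
}

# Matches simple quantities such as "1", "500Mi" or "250m" (integer plus suffix).
_QUANTITY = re.compile(r"^(\d+)([A-Za-z]*)$")


def double_quantity(quantity: str) -> str:
    """Double an integer resource quantity, keeping its unit suffix."""
    match = _QUANTITY.match(quantity)
    if match is None:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    number, suffix = match.groups()
    return f"{int(number) * 2}{suffix}"


def doubled(resources: dict) -> dict:
    """Return a copy of the resources mapping with every value doubled."""
    return {
        kind: {name: double_quantity(value) for name, value in values.items()}
        for kind, values in resources.items()
    }


def helm_set_flags(resources: dict, prefix: str = "pgpool.resources") -> list:
    """Render the mapping as `--set` flags; the prefix is an assumed values path."""
    return [
        f"--set {prefix}.{kind}.{name}={value}"
        for kind, values in resources.items()
        for name, value in values.items()
    ]


if __name__ == "__main__":
    for flag in helm_set_flags(doubled(CURRENT)):
        print(flag)
```

The printed flags could be appended to a `helm upgrade` of the release for a temporary debugging run, then reverted once the behavior has been compared.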

github-actions[bot] commented 7 months ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] commented 7 months ago

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.