
[bitnami/wordpress] WordPress pod restarting more or less every 5-20 minutes #30302

Open EugenMayer opened 1 day ago

EugenMayer commented 1 day ago

Name and Version

bitnami/wordpress 23.1.28

What architecture are you using?

amd64

What steps will reproduce the bug?

  1. Deploy a WordPress site (we have 3 different sites)
  2. Let it run for some time, maybe trigger some minor traffic
  3. The pod will restart (even when liveness and readiness probes are disabled)

Are you using any custom parameters or values?

Nothing fancy:

global:
  storageClass: longhorn

service:
  type: ClusterIP

replicaCount: 1
updateStrategy:
  type: Recreate

persistence:
  storageClass: longhorn
  size: 15Gi
  accessModes:
    - ReadWriteOnce

mariadb:
  enabled: false

externalDatabase:
  existingSecret: ${external_db_secret}
  database: ${db_name}
  port: ${db_port}
  host: ${db_host}
  user: ${db_user}

memcached:
  enabled: true

externalCache:
  host: ${memcacheServiceName}

wordpressSkipInstall: ${skipInstall}
wordpressScheme: https
wordpressAutoUpdateLevel: ${autoUpdate}
wordpressUsername: Superadmin
wordpressEmail: account@kontextwork.de
wordpressPassword: "${wpAdminPassword}"
allowEmptyPassword: false
wordpressConfigureCache: false
overrideDatabaseSettings: true

readinessProbe:
  enabled: false

livenessProbe:
  enabled: false
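
For reference, these values are applied with a plain Helm install; the release name, namespace, and values file name below are only examples:

helm upgrade --install wordpress bitnami/wordpress \
  --version 23.1.28 \
  --namespace wordpress --create-namespace \
  -f values.yaml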

What is the expected behavior?

The pod should not restart

What do you see instead?

The pod does restart

Additional information

I cannot see anything helpful when the restart happens. All I see is a couple of requests, and then suddenly the pod restarts:

10.12.26.32 - - [07/Nov/2024:13:22:38 +0000] "GET /wp-json/wp-site-health/v1/tests/loopback-requests?_locale=user HTTP/1.1" 200 321
10.12.26.32 - - [07/Nov/2024:13:22:38 +0000] "GET / HTTP/1.1" 200 32114
10.12.26.32 - - [07/Nov/2024:13:22:38 +0000] "GET /wp-json/wp-site-health/v1/tests/https-status?_locale=user HTTP/1.1" 200 791
10.12.26.32 - user [07/Nov/2024:13:22:41 +0000] "GET /wp-json/wp-site-health/v1/tests/authorization-header?_locale=user HTTP/1.1" 200 341
10.12.26.32 - - [07/Nov/2024:13:22:41 +0000] "GET / HTTP/1.1" 200 32109
10.12.76.117 - - [07/Nov/2024:13:22:44 +0000] "GET / HTTP/1.1" 200 32109
10.12.76.117 - - [07/Nov/2024:13:22:46 +0000] "GET / HTTP/1.1" 200 32109
10.12.26.32 - - [07/Nov/2024:13:22:41 +0000] "GET /wp-json/wp-site-health/v1/tests/page-cache?_locale=user HTTP/1.1" 200 1547
10.12.26.32 - - [07/Nov/2024:13:22:48 +0000] "POST /wp-admin/admin-ajax.php HTTP/1.1" 200 374
10.12.26.32 - - [07/Nov/2024:13:22:49 +0000] "POST /wp-admin/admin-ajax.php HTTP/1.1" 200 351
10.12.26.32 - - [07/Nov/2024:13:22:49 +0000] "POST /wp-admin/admin-ajax.php HTTP/1.1" 200 513
10.12.26.32 - - [07/Nov/2024:13:22:50 +0000] "POST /wp-admin/admin-ajax.php HTTP/1.1" 200 16
10.12.214.173 - - [07/Nov/2024:13:23:07 +0000] "GET /vergleich-confluence-alternativen/ HTTP/1.1" 200 36836
wordpress 13:23:28.77 INFO  ==> 
wordpress 13:23:28.78 INFO  ==> Welcome to the Bitnami wordpress container
wordpress 13:23:28.81 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
wordpress 13:23:28.81 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
wordpress 13:23:28.82 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
wordpress 13:23:28.82 INFO  ==> 
wordpress 13:23:28.82 INFO  ==> ** Starting WordPress setup **
realpath: /bitnami/apache/conf: No such file or directory
wordpress 13:23:28.93 INFO  ==> Configuring the HTTP port
wordpress 13:23:28.94 INFO  ==> Configuring the HTTPS port
wordpress 13:23:28.95 INFO  ==> Configuring Apache ServerTokens directive
wordpress 13:23:29.03 INFO  ==> Configuring PHP options
wordpress 13:23:29.04 INFO  ==> Setting PHP expose_php option
wordpress 13:23:29.05 INFO  ==> Setting PHP output_buffering option
wordpress 13:23:29.13 INFO  ==> Validating settings in MYSQL_CLIENT_* env vars
wordpress 13:23:29.64 INFO  ==> Restoring persisted WordPress installation
wordpress 13:23:29.73 INFO  ==> Trying to connect to the database server
wordpress 13:23:29.74 INFO  ==> Overriding the database configuration in wp-config.php with the provided environment variables

wordpress 13:23:33.15 INFO  ==> ** WordPress setup finished! **
wordpress 13:23:33.22 INFO  ==> ** Starting Apache **
[Thu Nov 07 13:23:33.536251 2024] [core:warn] [pid 1:tid 1] AH00098: pid file /opt/bitnami/apache/var/run/httpd.pid overwritten -- Unclean shutdown of previous Apache run?
[Thu Nov 07 13:23:33.542823 2024] [mpm_prefork:notice] [pid 1:tid 1] AH00163: Apache/2.4.62 (Unix) OpenSSL/3.0.14 configured -- resuming normal operations
[Thu Nov 07 13:23:33.542862 2024] [core:notice] [pid 1:tid 1] AH00094: Command line: '/opt/bitnami/apache/bin/httpd -f /opt/bitnami/apache/conf/httpd.conf -D

Is there any way to find out why the pod is being restarted in the first place?

The events do not show anything specific either:

(screenshot of the pod events)
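
For completeness, the termination state Kubernetes recorded for the previous container run can be queried directly; the namespace and pod name are placeholders:

kubectl -n <namespace> describe pod <wordpress_pod> | grep -A 5 "Last State"
kubectl -n <namespace> get pod <wordpress_pod> \
  -o jsonpath='{range .status.containerStatuses[*]}{.name}{": "}{.lastState.terminated.reason}{"\n"}{end}'

If the container itself was OOM-killed, the reason usually shows up there as OOMKilled; if only a child process inside the container was killed, it may not.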

EugenMayer commented 21 hours ago

AFAICS it is being killed by the OOM killer:

[1034895.974251] Memory cgroup out of memory: OOM victim 3085690 (httpd) is already exiting. Skip killing the task
[1034895.974253] Memory cgroup out of memory: Killed process 3085741 (httpd) total-vm:312132kB, anon-rss:9032kB, file-rss:4116kB, shmem-rss:76kB, UID:1001 pgtables:168kB oom_score_adj:993

I found an entry for each time the pod has been restarted.

It seems like the resource limits are not properly applied? I used

resources:
  limits:
    cpu: 500m
    ephemeral-storage: 2Gi
    memory: 512Mi
  requests:
    cpu: 400m
    ephemeral-storage: 100Mi
    memory: 512Mi

to raise it to 512Mi, but it is still killed at 310Mi (which is the default)?

We have more than enough RAM on the nodes, so it must be a virtual limit that applies.
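
To double-check which limit is actually in effect, the per-container limits and the live usage can be compared; kubectl top requires metrics-server, and the namespace and pod names are placeholders:

kubectl -n <namespace> get pod <wordpress_pod> \
  -o jsonpath='{range .spec.containers[*]}{.name}{" memory limit: "}{.resources.limits.memory}{"\n"}{end}'
kubectl -n <namespace> top pod <wordpress_pod> --containers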

EugenMayer commented 20 hours ago

I checked the deployment (I upped the limit to 1024Mi) and found, for the containers:

      resources:
        limits:
          cpu: 500m
          ephemeral-storage: 2Gi
          memory: 1Gi
        requests:
          cpu: 500m
          ephemeral-storage: 100Mi
          memory: 1Gi

So the limits have been applied to the deployment and also to the pod:

      resources:
        limits:
          cpu: 500m
          ephemeral-storage: 2Gi
          memory: 1Gi
        requests:
          cpu: 500m
          ephemeral-storage: 100Mi
          memory: 1Gi

But the OOM killer still kills at 310MB:

[Thu Nov  7 18:52:33 2024] Memory cgroup out of memory: OOM victim 3251759 (httpd) is already exiting. Skip killing the task
[Thu Nov  7 18:52:33 2024] Memory cgroup out of memory: Killed process 3251760 (httpd) total-vm:350456kB, anon-rss:47916kB, file-rss:12668kB, shmem-rss:74452kB, UID:1001 pgtables:448kB oom_score_adj:-997
[Thu Nov  7 18:52:33 2024] Memory cgroup out of memory: Killed process 3251761 (httpd) total-vm:350200kB, anon-rss:47884kB, file-rss:12668kB, shmem-rss:74452kB, UID:1001 pgtables:444kB oom_score_adj:-997
[Thu Nov  7 18:52:33 2024] Memory cgroup out of memory: Killed process 3251763 (httpd) total-vm:346424kB, anon-rss:43728kB, file-rss:12732kB, shmem-rss:72980kB, UID:1001 pgtables:436kB oom_score_adj:-997
[Thu Nov  7 18:52:33 2024] Memory cgroup out of memory: Killed process 3251764 (httpd) total-vm:402344kB, anon-rss:25412kB, file-rss:15584kB, shmem-rss:67992kB, UID:1001 pgtables:404kB oom_score_adj:-997
[Thu Nov  7 18:52:33 2024] Memory cgroup out of memory: Killed process 3251786 (httpd) total-vm:350200kB, anon-rss:46616kB, file-rss:12732kB, shmem-rss:73108kB, UID:1001 pgtables:440kB oom_score_adj:-997
[Thu Nov  7 18:52:33 2024] Memory cgroup out of memory: Killed process 3251789 (httpd) total-vm:329976kB, anon-rss:27056kB, file-rss:12576kB, shmem-rss:69588kB, UID:1001 pgtables:400kB oom_score_adj:-997
[Thu Nov  7 18:52:33 2024] Memory cgroup out of memory: Killed process 3251792 (httpd) total-vm:324724kB, anon-rss:22144kB, file-rss:12384kB, shmem-rss:59540kB, UID:1001 pgtables:364kB oom_score_adj:-997
[Thu Nov  7 18:52:33 2024] Memory cgroup out of memory: Killed process 3251795 (httpd) total-vm:312132kB, anon-rss:9164kB, file-rss:4176kB, shmem-rss:76kB, UID:1001 pgtables:176kB oom_score_adj:-997
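
As an additional sanity check, the limit the cgroup actually enforces can be read from inside the container; the path differs between cgroup v1 and v2, the value is reported in bytes (1Gi = 1073741824), and the container name wordpress is an assumption:

kubectl -n <namespace> exec <wordpress_pod> -c wordpress -- cat /sys/fs/cgroup/memory.max                       # cgroup v2
kubectl -n <namespace> exec <wordpress_pod> -c wordpress -- cat /sys/fs/cgroup/memory/memory.limit_in_bytes     # cgroup v1
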
javsalgar commented 6 hours ago

Hi,

Could you confirm that when you run kubectl describe pod <wordpress_pod> you can see the established limits? If so, it seems to me there might be an issue with the Kubernetes cluster itself, maybe a general virtual limit (I am not sure how that would be configured, perhaps in the kubelet?). In that case, I would check with the cluster administrator.
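
For example, something along these lines should show the limits exactly as they were applied to the pod (namespace and pod name are placeholders):

kubectl -n <namespace> describe pod <wordpress_pod> | grep -B 2 -A 8 "Limits:"
kubectl -n <namespace> get pod <wordpress_pod> -o jsonpath='{.spec.containers[*].resources}'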