kubebb / core

A declarative component lifecycle management platform
https://kubebb.github.io/website
Apache License 2.0

fix OOMKilled #386

Closed: Abirdcfly closed this issue 2 months ago

Abirdcfly commented 1 year ago

The kubebb-core container occasionally uses more memory than we expect and gets OOMKilled; this should be fixed. The latest occurrence is https://github.com/kubebb/core/actions/runs/6664923119/job/18113595285?pr=385

Name:             kubebb-controller-manager-5679668d9f-pfcnj
Namespace:        kubebb-system
Priority:         0
Service Account:  kubebb-controller-manager
Node:             kubebb-core-control-plane/172.18.0.2
Start Time:       Fri, 27 Oct 2023 08:47:10 +0000
Labels:           control-plane=controller-manager
                  pod-template-hash=5679668d9f
Annotations:      kubectl.kubernetes.io/default-container: manager
Status:           Running
IP:               10.244.0.9
IPs:
  IP:           10.244.0.9
Controlled By:  ReplicaSet/kubebb-controller-manager-5679668d9f
Containers:
  manager:
    Container ID:  containerd://4a9083afa58edb65d1595570224b05325781c5b9ce1c8f2fd013ea3c8da0da08
    Image:         kubebb/core:example-e2e
    Image ID:      docker.io/library/import-2023-10-27@sha256:38f899cf1378ab99fbab07522ad327f638438b8be725a9b39c1fe47e1f8f4d6a
    Port:          9443/TCP
    Host Port:     0/TCP
    Command:
      /manager
    Args:
      --config=controller_manager_config.yaml
    State:          Running
      Started:      Fri, 27 Oct 2023 08:49:49 +0000
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Fri, 27 Oct 2023 08:47:11 +0000
      Finished:     Fri, 27 Oct 2023 08:49:48 +0000
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     5
      memory:  1536Mi
    Requests:
      cpu:      10m
      memory:   64Mi
    Liveness:   http-get http://:8081/healthz delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:  http-get http://:8081/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       kubebb-controller-manager-5679668d9f-pfcnj (v1:metadata.name)
      POD_NAMESPACE:  kubebb-system (v1:metadata.namespace)
    Mounts:
      /controller_manager_config.yaml from manager-config (rw,path="controller_manager_config.yaml")
      /tmp/k8s-webhook-server/serving-certs from cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8phnp (ro)
  kube-rbac-proxy:
    Container ID:  containerd://69bfc837635456cad420b279e1425aff6b4f225fb55466163c272634d0ceae9d
    Image:         gcr.io/kubebuilder/kube-rbac-proxy:v0.13.0
    Image ID:      gcr.io/kubebuilder/kube-rbac-proxy@sha256:d99a8d144816b951a67648c12c0b988936ccd25cf3754f3cd85ab8c01592248f
    Port:          8443/TCP
    Host Port:     0/TCP
    Args:
      --secure-listen-address=0.0.0.0:8443
      --upstream=http://127.0.0.1:8080/
      --logtostderr=true
      --v=0
    State:          Running
      Started:      Fri, 27 Oct 2023 08:47:13 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     500m
      memory:  128Mi
    Requests:
      cpu:        5m
      memory:     64Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8phnp (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  webhook-server-cert
    Optional:    false
  manager-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kubebb-manager-config
    Optional:  false
  kube-api-access-8phnp:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                From               Message
  ----     ------       ----               ----               -------
  Normal   Scheduled    13m                default-scheduler  Successfully assigned kubebb-system/kubebb-controller-manager-5679668d9f-pfcnj to kubebb-core-control-plane
  Warning  FailedMount  13m                kubelet            MountVolume.SetUp failed for volume "cert" : secret "webhook-server-cert" not found
  Normal   Pulling      13m                kubelet            Pulling image "gcr.io/kubebuilder/kube-rbac-proxy:v0.13.0"
  Normal   Pulled       13m                kubelet            Successfully pulled image "gcr.io/kubebuilder/kube-rbac-proxy:v0.13.0" in 1.768700013s
  Normal   Created      13m                kubelet            Created container kube-rbac-proxy
  Normal   Started      13m                kubelet            Started container kube-rbac-proxy
  Normal   Pulled       10m (x2 over 13m)  kubelet            Container image "kubebb/core:example-e2e" already present on machine
  Normal   Created      10m (x2 over 13m)  kubelet            Created container manager
  Normal   Started      10m (x2 over 13m)  kubelet            Started container manager

Another occurrence: https://github.com/kubebb/core/actions/runs/6668891253/job/18125638885
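To find out what is holding the memory before the kernel kills the process, one option is to capture a heap profile from the manager. A minimal sketch of exposing Go's pprof endpoints over HTTP (stdlib only; the `127.0.0.1` listener and the self-check fetch are illustrative, not part of kubebb-core):

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
)

func main() {
	// Listen on a loopback port chosen by the OS; in a real deployment this
	// would typically be a fixed debug port reachable via `kubectl port-forward`.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	go http.Serve(ln, nil) // nil mux = http.DefaultServeMux, where pprof registered

	// Self-check: fetch a human-readable heap profile from the live endpoint.
	resp, err := http.Get(fmt.Sprintf("http://%s/debug/pprof/heap?debug=1", ln.Addr()))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.StatusCode)
}
```

With the endpoint live, `go tool pprof http://<addr>/debug/pprof/heap` taken while the RSS is climbing would show which allocations dominate, e.g. an unbounded informer cache.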
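One common mitigation for Go processes being OOMKilled at a container limit is to give the runtime a soft memory limit below that container limit, so the GC works harder before the kernel steps in. Go 1.19+ supports this via `GOMEMLIMIT` or `runtime/debug.SetMemoryLimit`. A hedged sketch, assuming the 1536Mi limit from the pod spec above and a hypothetical 90% headroom factor (not something kubebb-core currently does):

```go
package main

import (
	"fmt"
	"runtime/debug"
)

func main() {
	// The manager container's memory limit from the pod spec above.
	const containerLimit = 1536 << 20 // 1536Mi in bytes

	// Set the Go runtime's soft memory limit to ~90% of the container limit,
	// leaving headroom for non-heap memory (stacks, cgo, kernel buffers).
	softLimit := int64(containerLimit * 9 / 10)
	debug.SetMemoryLimit(softLimit)

	// A negative argument queries the current limit without changing it.
	fmt.Println(debug.SetMemoryLimit(-1) == softLimit)
}
```

The same effect can be had without a code change by setting `GOMEMLIMIT=1380MiB` in the container environment; this trades CPU (more GC cycles under pressure) for a lower chance of hitting the hard limit, and does not help if the memory is genuinely live (e.g. a cache that keeps growing).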

Abirdcfly commented 2 months ago

This issue has been inactive for a long time, so I'm closing it.