IBM / cloud-operators

Provision and bind IBM Cloud services to your Kubernetes cluster in a Kubernetes-native way
Apache License 2.0

Memory Limit too low results in OOMKilled #268

Closed haf-tech closed 2 years ago

haf-tech commented 2 years ago

Installation of the operator on OpenShift fails because the configured memory limit is lower than the operator's actual memory consumption.

The current config results in restarts with OOMKilled:

resources:
  limits:
    cpu: 100m
    memory: 175Mi
  requests:
    cpu: 100m
    memory: 20Mi

Raising the memory limit to 255Mi results in a stable deployment:

resources:
  limits:
    cpu: 100m
    memory: 255Mi
  requests:
    cpu: 100m
    memory: 20Mi

Affected version: IBM Cloud Operator, 1.1.0
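
To confirm that the restarts come from the memory limit rather than another failure, one option is to inspect each container's last termination reason. The following is a minimal sketch, not part of the original report: the helper name `oomkilled_pods` is hypothetical, and the namespace `openshift-operators` in the usage comment is an assumption that should be adjusted to wherever the operator is installed.

```shell
# Hypothetical helper: reads "<pod-name><TAB><last-termination-reason>" lines
# on stdin and prints the names of pods whose last termination was OOMKilled.
oomkilled_pods() {
  awk -F '\t' '$2 ~ /OOMKilled/ { print $1 }'
}

# Usage against a live cluster (the namespace is an assumption):
#   kubectl get pods -n openshift-operators \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}' \
#     | oomkilled_pods
```

Pods that have never been terminated produce an empty reason field and are skipped by the filter.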

greglanthier commented 2 years ago

When installing ibmcloud-operator, I've been using a Subscription that looks like this:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ibmcloud-operator
spec:
  channel: stable
  name: ibmcloud-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
  config:
    resources:
      limits:
        cpu: 400m
        memory: 700Mi
      requests:
        cpu: 400m
        memory: 40Mi

This is probably oodles more than the operator itself needs 🤷.

The increased memory consumption on OpenShift compared with other cluster types might be related to the number of namespaces present in the cluster, especially if you're installing the operator with cluster-wide scope.

JohnStarich commented 2 years ago

Thanks for opening @haf-tech. I think this is likely a duplicate of https://github.com/IBM/cloud-operators/issues/199

How many secrets are in your cluster? kubectl get secrets -A | wc -l
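
Note that `kubectl get secrets -A | wc -l` also counts the header row, so the result is one higher than the actual number of secrets. As a rough sketch, the count can also be broken down per namespace to see where most secrets live. The helper name `secrets_per_namespace` is hypothetical; `--no-headers` is the standard kubectl flag that suppresses the header row.

```shell
# Hypothetical helper: reads `kubectl get secrets -A --no-headers` output on
# stdin (namespace in the first column) and prints "<count> <namespace>",
# sorted with the largest count first.
secrets_per_namespace() {
  awk '{ count[$1]++ } END { for (ns in count) print count[ns], ns }' | sort -rn
}

# Usage against a live cluster:
#   kubectl get secrets -A --no-headers | secrets_per_namespace | head
```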

JohnStarich commented 2 years ago

If either of you are interested, we’re open to PRs for fine-tuned watches. It looks like they would significantly reduce memory usage.

haf-tech commented 2 years ago

@JohnStarich, I had deleted the cluster in the meantime. I tested again today on a completely fresh cluster, and now the default resource limit is sufficient. And yes, #199 could be the same (root) cause.

Regarding the PR, I currently have no clue about the topic :-D

haf-tech commented 2 years ago

After applying the first Service and Binding, I'm seeing OOMKilled:

oc get secrets -A | wc -l
 1258

The above edit (increasing the memory limit) helps here.

JohnStarich commented 2 years ago

Yeah, looks like the same issue. Let's focus our conversation in #199. @greglanthier you're welcome to take this on, if you're interested.