Closed ron1 closed 4 years ago
The process manager was only used inside Elasticsearch Pods and it has been removed in https://github.com/elastic/cloud-on-k8s/commit/f2b5288f9fbfa2f219f7478259775c47dd221c3a
Does the Namespace Operator require the Global Operator if only the Basic License is being used?
No, it does not, but that might change in the future. The idea behind the global operator was to run some cross-cutting concerns only there, which would also allow us to restrict the privileges of the namespace operators much further. If you prefer, you can also deploy the operator as a single process that has all roles (that is also the variant we use in the 'quick start' documentation).
Any ideas what would cause the Global Operator on 0.9.0-RC3 to misbehave on CentOS7/RHEL7 in the same way the old 0.8.0 Namespace Operator did? As I mentioned, the 0.9.0-RC3 Namespace Operator no longer seems to misbehave. Is it a coincidence that the Namespace Operator fix seemed to occur right around the time the Process Manager was removed?
Could you provide more details about the Pod that is OOMKilled? kubectl get ... -o yaml
Could you also provide the OOM killer logs, which are available in the kernel log?
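For anyone gathering the information requested above, here is a minimal shell sketch. The helper names `pod_details` and `filter_oom` are hypothetical, the namespace and pod name in the usage comments are only examples to adjust for your cluster, and the grep pattern is an assumption about how OOM killer messages usually appear in the kernel log.

```shell
# Hypothetical helper: print the full spec and status of a pod as YAML.
# $1 = namespace, $2 = pod name
pod_details() {
  kubectl get pod "$2" -n "$1" -o yaml
}

# Hypothetical helper: keep only the lines on stdin that look like
# OOM killer messages (the pattern is an assumption, not an exhaustive list).
filter_oom() {
  grep -i -E 'out of memory|oom-killer|killed process'
}

# Usage (run the second command on the node that hosted the pod):
#   pod_details elastic-system elastic-global-operator-0
#   dmesg | filter_oom
```

In the YAML output, the `lastState.terminated.reason` field under `status.containerStatuses` is where an `OOMKilled` reason would show up.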
I have been running the latest release candidate of ECK (0.9.0-RC7) for a few hours on OpenShift 3.11 and I can't reproduce your issue.
$ uname -a
Linux k8s-michael-master-01 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ oc get pods --all-namespaces |grep elastic
elastic-namespace-operators elastic-namespace-operator-0 1/1 Running 0 5h
elastic-system elastic-global-operator-0 1/1 Running 0 5h
elastic elasticsearch-sample-es-5tsqghmm79 1/1 Running 0 5h
elastic elasticsearch-sample-es-6qk52mz5jk 1/1 Running 0 5h
elastic elasticsearch-sample-es-dg4vvpm2mr 1/1 Running 0 5h
elastic kibana-sample-kb-97c6b6b8d-lqfd2 1/1 Running 0 5h
The global operator pod failed to deploy due to a template bug here: https://github.com/elastic/cloud-on-k8s/blob/40e85d85e6403847bcaf6f910843040934646fea/operators/config/operator/global/operator.template.yaml#L39
I fixed the problem by making the following change to file operator.template.yaml:
Original:
resources:
  limits:
    cpu: 1
    memory: 100Mi
Revised:
resources:
  limits:
    cpu: 1
    memory: 2Gi
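If the operator is already deployed, the same limit change could probably be applied without editing the template. This is only a sketch: the helper name `bump_memory_limit` is hypothetical, and it assumes the global operator runs as a StatefulSet named `elastic-global-operator` in the `elastic-system` namespace (inferred from the pod name, so check with `kubectl get statefulsets -n elastic-system` first). The 2Gi value is simply the one used above, not a tuned recommendation.

```shell
# Hypothetical helper: raise the memory limit on a deployed StatefulSet.
# $1 = namespace, $2 = StatefulSet name (assumed), $3 = new memory limit
bump_memory_limit() {
  kubectl set resources statefulset "$2" -n "$1" --limits=memory="$3"
}

# Usage:
#   bump_memory_limit elastic-system elastic-global-operator 2Gi
```

Changing the pod template this way should trigger a restart of the operator pod with the new limit; redeploying from the unmodified template would revert it.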
The out-of-the-box operator doesn't work on OpenShift. I am getting the same OOMKilled error and have to change the resources in order to run it.
Bug Report
What did you do? Deployed ECK Operator 0.9.0-RC3 Global Operator on OCP 3.11
What did you expect to see? ECK Global Operator Pod with Status "Running"
What did you see instead? Under which circumstances? ECK Global Operator Pod with Status "OOMKilled/CrashLoopBackOff"
Environment OCP 3.11.98
Version information: ECK 0.9.0-RC3
Kubernetes information: OCP 3.11.98
Is it possible that the Global Operator is still using the Process Manager, which is problematic on CentOS7/RHEL7 kernels?
BTW, the Namespace Operator OOMKilled issue seems to be fixed in this release. Does the Namespace Operator require the Global Operator if only the Basic License is being used?