jorpilo opened this issue 1 year ago
Hello,
This is surprising. `50Mi` should normally be enough for that init container.
A bit of history: we started with `10Mi` and increased it to `20Mi` to fix an issue with CRI-O where `10Mi` was too low. Then we increased it to `50Mi` after users reported OOMKilled containers with `20Mi` on Jan 21st 2020. Since then, in 3 years, we have never heard of this issue again.
Does the issue constantly appear? What distro of k8s are you using?
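If it keeps happening and you need to unblock yourselves in the meantime, here is a minimal sketch of raising that limit through the `podTemplate`. This assumes the operator merges `initContainers` entries from the `podTemplate` by name; `100Mi` is an arbitrary illustrative value, not a recommendation:

```yaml
# Sketch only: override the memory of the built-in init container by name.
# Values are illustrative.
spec:
  nodeSets:
  - name: default
    podTemplate:
      spec:
        initContainers:
        - name: elastic-internal-init-filesystem
          resources:
            requests:
              memory: 100Mi
            limits:
              memory: 100Mi
```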
The issue is appearing multiple times across different environments.
We are using vanilla k8s.
Similar here on ECK version 2.6.1; it happened after upgrading from k8s 1.24.6 to 1.25.3. It seems the main container gets OOMKilled, while the init containers complete.
The same cluster config works fine with ECK 2.6.1 on k8s 1.24.6.
I increased the memory requests to 20Gi, but that didn't help either.
```yaml
kind: Elasticsearch
metadata:
  name: tracing
spec:
  version: 7.17.9
  http:
    service:
      spec:
        clusterIP: None
  nodeSets:
  - name: default
    count: 3
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          resources:
            requests:
              memory: 20Gi
              cpu: 4
            limits:
              memory: 20Gi
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data # Do not change this name unless you set up a volume mount for the data path.
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 16Gi
        storageClassName: default
    config:
      node.store.allow_mmap: false
      logging.level: debug
```
This issue and my issue https://github.com/elastic/cloud-on-k8s/issues/6542 appear to have the same root cause. The init container `elastic-internal-init-filesystem` uses too much ephemeral storage because it copies the plugins within the pod, even if the plugins have already been installed in a custom Elasticsearch image. You'll have higher chances of reproducing this issue if you use large plugins. Feel free to close #6542 as it seems to be a duplicate of this issue.
Additionally, for anyone interested in a workaround: you'll need to introduce a persistent volume to house the plugins instead of using ephemeral storage. You'll need to customize the init container `elastic-internal-init-filesystem`:
```yaml
- name: elastic-internal-init-filesystem
  volumeMounts:
  - name: elastic-internal-elasticsearch-plugins-local
    mountPath: /mnt/elastic-internal/elasticsearch-plugins-local
```
You then have to add a volumeClaimTemplate named `elastic-internal-elasticsearch-plugins-local`:
```yaml
- metadata:
    name: elastic-internal-elasticsearch-plugins-local
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi # or some reasonable value
    storageClassName: <your-storage-class-name>
```
my YAML formatting is not correct; but it's YAML
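For clarity, here is a rough consolidated sketch of where those two pieces sit in the Elasticsearch manifest, reusing the nodeSet and claim names from earlier in the thread. The `apiVersion` line and exact indentation are my best guess; storage class and sizes are placeholders:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: tracing
spec:
  version: 7.17.9
  nodeSets:
  - name: default
    count: 3
    podTemplate:
      spec:
        initContainers:
        # Mount the plugins volume into the built-in init container so the
        # plugin copy lands on a persistent volume instead of ephemeral storage.
        - name: elastic-internal-init-filesystem
          volumeMounts:
          - name: elastic-internal-elasticsearch-plugins-local
            mountPath: /mnt/elastic-internal/elasticsearch-plugins-local
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 16Gi
        storageClassName: default
    # Extra claim backing the plugins directory mounted above.
    - metadata:
        name: elastic-internal-elasticsearch-plugins-local
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi # or some reasonable value
        storageClassName: <your-storage-class-name>
```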
@jorpilo Could you please share more information with us such as what version of Elasticsearch, and as much of a full yaml manifest as possible so we can try and replicate? Is there something special about the ES container you're using?
Could you also share either the k8s Events, or the Status sub-object of Elasticsearch showing exactly which pods are getting OOM killed?
Thank you.
@zeezoich your issue appears to be different, as this is an OOM kill, and yours seems to be some ephemeral storage limit (disk related, not memory).
@naemono In our case the ephemeral storage is backed by RAM. That's why I suspected it's the same issue.
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.901"
Looks like I missed a couple of k8s releases 😄 Did you build K8S yourself?
This is due to the `cp` command in `prepare-fs.sh` utilizing more than the 50Mi of memory the initContainer is limited to.
@jorpilo As mentioned by @naemono we need more information. Would it be possible to also share:
- the kernel messages (`invoked oom-killer ...`)?
- the CRI implementation (`containerd`, `cri-o`, ...)? What version?

@barkbay Just curious: why is the copying of plugins being done from one dir inside the container to another? In my case, because the plugins are huge (relatively, > 3GB), it causes the eviction of the pod as it exceeds the ephemeral storage limit. So I am just trying to understand this bit and whether there is a flag to do away with the copying of plugins. Thanks.
Hello. Sorry for the delay. Kernel messages:
```
mamba invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=979
oom_kill_process.cold+0xb/0x10
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=f4b8c4eca72c984ac986f74421ca27a0b85a22c6cb3e826d79a6760bd9770311,mems_allowed=0,oom_memcg=/kubepods/burstable/poda858f6f1-ac60-4a58-9ed2-bcccdd287994,task_memcg=/kubepods/burstable/poda858f6f1-ac60-4a58-9ed2-bcccdd287994/f4b8c4eca72c984ac986f74421ca27a0b85a22c6cb3e826d79a6760bd9770311,task=mamba,pid=1641300,uid=185
Memory cgroup out of memory: Killed process 1641300 (mamba) total-vm:4797068kB, anon-rss:4100028kB, file-rss:20916kB, shmem-rss:0kB, UID:185 pgtables:8228kB oom_score_adj:979
[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
hubble invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-997
oom_kill_process.cold+0xb/0x10
Memory cgroup out of memory: Killed process 8248 (hubble) total-vm:832316kB, anon-rss:121520kB, file-rss:25932kB, shmem-rss:0kB, UID:0 pgtables:444kB oom_score_adj:-997
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=86624e560d6b48f099bf2782bc7298cb8dee5f7a68a88bef83775a9d5a09bd74,mems_allowed=0,oom_memcg=/kubepods/burstable/pode6b72518-41c3-4812-a8aa-1cfcb8108fa9/86624e560d6b48f099bf2782bc7298cb8dee5f7a68a88bef83775a9d5a09bd74,task_memcg=/kubepods/burstable/pode6b72518-41c3-4812-a8aa-1cfcb8108fa9/86624e560d6b48f099bf2782bc7298cb8dee5f7a68a88bef83775a9d5a09bd74,task=hubble,pid=8248,uid=0
[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
oom_kill_process.cold+0xb/0x10
```
CRI implementation: `containerd://1.6.12`
Bug Report
**What did you do?** Multiple elasticsearch pods are repeatedly being OOMKilled during initialization. This is a constant, repetitive issue that we see a large number of times a day in multiple environments, and it is causing slow start-up of pods.
**What did you expect to see?** Elasticsearch pods starting as expected and not being killed.
**What did you see instead? Under which circumstances?** Elasticsearch pods are being OOMKilled during initialization because the initContainer `elastic-internal-init-filesystem` runs out of memory. This is due to the `cp` command in `prepare-fs.sh` utilizing more than the 50Mi of memory the initContainer is limited to. Should memory for the initContainer be increased to 100Mi?
**Environment**
ECK version: 2.5.1
Kubernetes information: