ocp-power-automation / ocp4-playbooks

Ansible Playbooks for OCP4 on Power
Apache License 2.0

Increase limits for rmc #125

Open lukebrowning opened 3 years ago

lukebrowning commented 3 years ago

A system OOM occurred on the worker-1 node, and the rmcd process was the victim.

    Events:
      Type     Reason     Age  From     Message
      Warning  SystemOOM  8h   kubelet  System OOM encountered, victim process: rmcd, pid: 6150
      Warning  SystemOOM  8h   kubelet  System OOM encountered, victim process: rmcd, pid: 1483512
      Warning  SystemOOM  8h   kubelet  System OOM encountered, victim process: rmcd, pid: 1486672
      Warning  SystemOOM  8h   kubelet  System OOM encountered, victim process: rmcd, pid: 1486690

I couldn't access the worker nodes, but on a master node, ps -elf showed rmc at about 10000 pages. That's ~640 MB (assuming 64 KiB pages) on a mostly idle server. The failure above occurred on the worker nodes, where OCS is deployed and where I am running the ocs-ci performance suite, which deploys a bunch of fio pods...
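As a rough check of that estimate, here is a minimal sketch of the page-to-memory arithmetic. The 64 KiB page size is an assumption (it is the usual base page size on ppc64le Linux, but it was not stated here), and the helper name `pages_to_mib` is only illustrative. The page count and the candidate limits are the ones mentioned in this issue.

```python
# Assumption: 64 KiB base pages, the usual default on ppc64le Linux.
PAGE_SIZE_KIB = 64


def pages_to_mib(pages, page_size_kib=PAGE_SIZE_KIB):
    """Convert a page count (as shown in the ps -elf SZ column) to MiB."""
    return pages * page_size_kib / 1024


rmcd_pages = 10000  # value observed on a mostly idle master node
usage_mib = pages_to_mib(rmcd_pages)
print(f"rmcd size: {usage_mib:.0f} MiB")

# Compare against the current 1 Gi limit and the proposed 1.5 Gi and 2 Gi limits.
for limit_gib in (1.0, 1.5, 2.0):
    limit_mib = limit_gib * 1024
    print(f"limit {limit_gib} Gi: idle usage is {usage_mib / limit_mib:.0%} of the limit")
```

If the assumption holds, an idle node already uses well over half of the 1 Gi limit, which leaves little headroom on loaded worker nodes.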

The limit should be increased from 1 Gi to 1.5 or 2 Gi.

lukebrowning commented 3 years ago

It could be that there is a memory leak in rmcd. @manojnkumar reported that after talking to Myung Bae.