Open happysalada opened 2 days ago
Could you provide some more details? Which pods are OOM killed? Could you provide an example of the used memory for the processes? Are the processes OOM killed by Linux, or is the OOM triggered by FDB itself? Can you share your FoundationDBCluster spec?
The OOM-killed pods are the storage ones as far as I can tell, named NAME-storage-NNNN. The OOM killer seems to be triggered by Kubernetes after the pods exceed their memory allocation; the way I track that is with the container_oom_events_total metric from Kubernetes. Here is what I customized from the operator defaults:
```yaml
processCounts:
  stateless: 10
```
For FoundationDB I'm requesting 1 CPU and 4 GB of memory.
I'm using 7.3.43, and I use useDNSInClusterFile, but I don't think that should matter.
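Put together, here is a minimal sketch of how these overrides would sit in a FoundationDBCluster spec (the processCounts, the 1 CPU / 4Gi sizing, and the version are what I described above; the cluster name, the requests/limits split, and the rest of the layout are placeholders):

```yaml
apiVersion: apps.foundationdb.org/v1beta2
kind: FoundationDBCluster
metadata:
  name: example-cluster            # placeholder name
spec:
  version: 7.3.43
  processCounts:
    stateless: 10
  processes:
    general:
      podTemplate:
        spec:
          containers:
            - name: foundationdb   # the operator's main container
              resources:
                requests:
                  cpu: "1"
                  memory: 4Gi
                limits:
                  cpu: "1"
                  memory: 4Gi
```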
I remember reading that 4 GB was enough, but I guess this is the problem then? Maybe I should set the max memory in the config? Or just increase every pod to 8 GB?
I run a cluster on bare metal with default settings, and the processes are known to go up to 12 GB sometimes. Since the bare-metal machine has way more memory, it doesn't cause any problems there.
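If capping the fdbserver memory is the right direction, my rough understanding is that fdbserver has its own memory limit (around 8GiB by default, if I remember right), so with a 4Gi container limit Kubernetes would kill the pod before FDB throttles itself. A sketch of what I would try, assuming the operator lets the memory parameter through customParameters (I haven't verified that), with placeholder values:

```yaml
spec:
  processes:
    general:
      customParameters:
        # assumption: pass fdbserver's own memory limit so it matches the container limit
        - "memory=4GiB"
      podTemplate:
        spec:
          containers:
            - name: foundationdb
              resources:
                limits:
                  memory: 4Gi   # keep the kube limit and the fdbserver limit aligned
```

Alternatively, just raising the container limit to 8Gi to match the fdbserver default might be the simpler fix.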
What happened?
Pods keep getting OOM killed.
What did you expect to happen?
Pods don't get OOM killed.
How can we reproduce it (as minimally and precisely as possible)?
I've deployed the minimal example, giving each pod 8 GB of memory. When you throw any load at the cluster, pods start getting OOM killed.
Anything else we need to know?
Is there a setting that I'm missing, perhaps? The memory setting doesn't seem to be respected, and the pods keep getting killed. I'm not sure what my setup is missing.
FDB Kubernetes operator
Kubernetes version
Cloud provider