jetstack / navigator

Managed Database-as-a-Service (DBaaS) on Kubernetes
Apache License 2.0

Pilot container OOM due to #280 #308

Closed: cehoffman closed this issue 6 years ago

cehoffman commented 6 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened: The pilot init container crash-loops due to OOM while copying files to the main container's volume. This is caused by my request in #280 combined with the current memory limit of 8Mi.

What you expected to happen: The init container does not crash-loop from OOM.

How to reproduce it (as minimally and precisely as possible):

Create a Cassandra cluster and watch the init container status.
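
For example, something along these lines should show the crash loops (the pod name is illustrative; any Cassandra pod in the cluster will do):

kubectl get pods -w
kubectl describe pod <cassandra-pod-name>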

Anything else we need to know?:

The init container does sometimes succeed, but it fails often, and once it has failed it tends to keep failing.

Environment:

Ideally the tool used to copy files between containers would not require updating the init container's resources as the size and number of files change. cp doesn't try to be memory efficient; a streaming tool that capped its maximum memory use would be best.
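
As a rough sketch of what I mean (assuming dd is available in the busybox init image; the paths are illustrative), a fixed block size at least bounds the copy's user-space buffer, although dirty pages in the writeback cache may still count against the cgroup limit:

dd if=/pilot of=/shared/pilot bs=1M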

I haven't verified this, but it seems logical that init container resources are not rolled into the running pod's resource utilization. If that is true, those resources could be set to anything up to the main container's resources without changing how the pod is scheduled.
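
If I understand the scheduler correctly, the effective pod request is the larger of the biggest init container request and the sum of the app container requests, so something like the following (the container name and values are illustrative) shouldn't affect scheduling as long as it stays below the Cassandra container's own requests:

initContainers:
- name: install-pilot
  resources:
    requests:
      memory: 50Mi
    limits:
      memory: 50Mi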

wallrj commented 6 years ago

@cehoffman I also encountered this in #297 and increased the memory limit to 50MiB in https://github.com/jetstack/navigator/pull/298, which was merged on 21 March at 18:32 GMT.

I wonder if you were testing with an older version?

We think the problem might arise when cp writes faster than the target container's filesystem can handle and the FS buffers the writes. Similar problems are reported, for example with LVM, here: https://www.redhat.com/archives/linux-lvm/2014-June/msg00023.html

Some interesting details about how cp works are here: https://eklitzke.org/efficient-file-copying-on-linux

But we're using busybox cp in this case, so we need to look at that source code too.

How about we ensure that the memory limits are set to at least the combined size of the files being copied?
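
For example, something like this run in the init image would give us that number (assuming busybox du supports -c and -k, which recent versions do; the paths are the files the init container copies):

du -ck /pilot /jmx_prometheus_javaagent.jar /jolokia.jar /kubernetes-cassandra.jar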

wallrj commented 6 years ago

It looks like 50MiB should be enough.

/ # ls -lh /pilot /jmx_prometheus_javaagent.jar /jolokia.jar /kubernetes-cassandra.jar 
-rw-r--r--    1 root     root      358.8K Apr  3 11:11 /jmx_prometheus_javaagent.jar
-rw-r--r--    1 root     root      435.3K Apr  3 11:11 /jolokia.jar
-rw-r--r--    1 root     root        9.6K Apr  3 11:11 /kubernetes-cassandra.jar
-rwxrwxr-x    1 root     root       25.2M Apr  3 11:10 /pilot
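
Summing those: 358.8K + 435.3K + 9.6K + 25.2M comes to roughly 26MiB, so 50MiB leaves close to 2x headroom for cp's own overhead.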
cehoffman commented 6 years ago

We are pulling the latest controller image from jetstackexperimental/navigator-controller:latest. Is there somewhere else we should be getting it? This is an existing Cassandra cluster config from before the memory limit was increased. I expected that the statefulsets would get updated with these changes; is that not the case?

wallrj commented 6 years ago

@cehoffman Right now, the only change Navigator ever makes to an existing statefulset is to the replicas field. We're discussing how we can safely update statefulsets and other Kubernetes resources between Navigator versions. For now, the workaround is to delete the statefulsets (not the pods), allow Navigator to recreate them, and then wait for the statefulset controller to perform a rolling upgrade so the pods pick up the desired resource limits.

kubectl delete statefulset xyz --cascade=false

(--cascade=false prevents the pods from being deleted)
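
After the delete, something like the following should show the statefulset being recreated and the pods rolling (the statefulset name is illustrative, and rollout status assumes the RollingUpdate update strategy):

kubectl get statefulset xyz
kubectl rollout status statefulset xyz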

Also note that we now publish Navigator Docker images to Quay.io rather than Docker Hub.

Let me know if that fixes things for you.

cehoffman commented 6 years ago

@wallrj it seems the statefulset delete resulted in the controller getting stuck.

E0405 16:36:54.667316 1 nodepool.go:55] statefulset name "cass-demo-zone-0" did not contain cluster name "demo"

The first nodepool gets recreated, but with 0 replicas, and the above error appears in the controller logs.

wallrj commented 6 years ago

@cehoffman Sorry, that's a separate bug, which we've addressed in https://github.com/jetstack/navigator/pull/309. Just waiting for that to be merged, hopefully this morning.

munnerz commented 6 years ago

This is fixed as part of #337