openshift-metal3 / dev-scripts

Scripts to automate development/test setup for openshift integration with https://github.com/metal3-io/

increase default VM RAM requirement to match AWS infrastructure requirements #331

Closed: jhutar closed this issue 5 years ago

jhutar commented 5 years ago

Proposing this after discussion with @mcornea, and after debugging issues caused by OOM on a host with 32GB RAM. Would 16GB be a more reasonable default, allowing a higher installation success rate?

russellb commented 5 years ago

I think it makes sense to default to the same limits used on AWS, for our own sanity.

Based on my last AWS deployment, it looks like 8 GB (m4.large) for bootstrap, and 16 GB (m4.xlarge) for masters.

This will make it very difficult to use a 32 GB host, except for a single master deployment.

64 GB would work, but only for a 3 master deployment without workers. I suppose we could automate creating a worker VM after the bootstrap VM is destroyed.

128 GB is ideal, but I know most folks don't have machines that big available for this work.
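
To spell out the arithmetic, here is a rough sketch using the AWS-equivalent sizes above (8 GB bootstrap, 16 GB per master/worker; host OS overhead not counted):

```sh
# Back-of-the-envelope host memory budget (assumptions: 8 GB bootstrap, 16 GB per master/worker).
echo "bootstrap + 3 masters: $(( 8 + 3 * 16 )) GB"       # 56 GB -> fits a 64 GB host, but leaves no room for a worker
echo "bootstrap + 3 masters + 1 worker: $(( 8 + 3 * 16 + 16 )) GB"  # 72 GB -> needs a larger (e.g. 128 GB) host
```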

russellb commented 5 years ago

https://github.com/openshift/installer/blob/master/docs/user/aws/limits.md#instance-limits

https://aws.amazon.com/ec2/instance-types/

stbenjam commented 5 years ago

I agree. I've been seeing OOM-killed pods with 8GB masters. My deployment with 16GB masters from yesterday is still online - the longest successfully running deployment I've had lately.

We've made some assumptions about there being 3 masters for metalkube, so unfortunately that means you need really powerful hardware. But it's rather pointless to have these low limits if they result in instability and a good percentage of deployments failing.

I'm working on changes to accommodate a single master, which should make a 32GB box acceptable for development (KNIDEPLOY-429).

hardys commented 5 years ago

Sounds reasonable. I've been holding off, though, because I know this will impact existing users on 32G boxes until we get the single-master option worked out.

Yesterday I also pushed https://github.com/openshift-metalkube/dev-scripts/pull/339, which should enable easier customization, e.g. for a single master/worker or custom memory/CPU sizes. We should still optimize the default path to "just work" as much as possible, even if that means a larger test host is required by default.
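
As a rough sketch of the kind of override that should enable (the variable names here are hypothetical and depend on what the PR actually exposes; dev-scripts reads per-user settings from a config_$USER.sh file):

```sh
# config_$USER.sh -- hypothetical overrides; check PR #339 for the real variable names.
export NUM_MASTERS=1        # single-master dev setup for a 32G box
export NUM_WORKERS=1        # or 0 if memory is tight
export MASTER_MEMORY=16384  # MiB per master VM
export MASTER_VCPU=4
```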

hardys commented 5 years ago

OK, that patch increases the master nodes to 16G and reduces the worker count to 1.

Re the bootstrap VM: in my environment it only gets 4G, so I guess we need to confirm whether that needs increasing as well. AFAICT it's working OK, though.
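
One way to double-check what each VM actually got is to ask libvirt directly (a sketch; the ostest_* domain names are an assumption based on a typical dev-scripts environment):

```sh
# List the dev-scripts VMs and report their memory allocation (virsh reports KiB).
sudo virsh list --all
for dom in $(sudo virsh list --all --name | grep ostest); do
  echo "== $dom =="
  sudo virsh dominfo "$dom" | grep -i memory
done
```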

dantrainor commented 5 years ago

I'm looking at upstream memory requirements[0] and they state 16G for masters. Maybe we've just been doing it wrong the whole time, attempting to use 8G.

[0] https://docs.okd.io/latest/install/prerequisites.html

dantrainor commented 5 years ago

Just to confirm, I ran vmstat in the background, and we do indeed effectively run out of memory with a default_memory of 8192 specified in tripleo-quickstart-config/metalkube-nodes.yml:

```
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd    free buff     cache   si   so   bi     bo    in    cs us sy id wa st
10  0  27648 8799540    0  11148604    0    0    0     47 15536 11527 73 15 12  0  0
 7  0  27648 8724768    0  11149784    0    0    0     25 15718 10032 69 12 19  0  0
 2  0  27648 8674748    0  11150948    0    0    0     27  8680 11505 32 10 58  0  0
 5  0  27648 8648832    0  11150948    0    0    0      0  6760 10718 27 11 63  0  0
 5  0  27648 8608160    0  11150952    0    0    0      0  9740 11945 47 17 36  0  0
10  0  27648 8465268    0  11151036    0    0   76    435 12989 14346 53 24 22  0  0
 7  0  27648 8246360    0  11151276    0    0    0    564 13998 17058 56 31 13  0  0
 7  0  27648 7896492    0  11151176    0    0    0   2542 15612 18893 58 32 10  0  0
 8  0  27648 7575840    0  11151584    0    0    0    430 15281 19264 48 34 18  0  0
 7  0  27648 7226444    0  11151648    0    0    0     10 15435 16718 56 33 11  0  0
 5  0  27648 6792260    0  11360376    0    0    0     95 15148 31816 40 31 29  0  0
 7  1  27648 4486760    0  13602468    0    0    0 167596 13103 31558 32 39 27  1  0
 1  5  27648 2989744    0  15083124    0    0    0 506504 11338 31921 26 29 32 13  0
 3  2  27648 2451676    0  15617612    0    0    0 519680  4972 16460 10 15 39 37  0
 1  4  27648 1927412    0  16141104    0    0    0 508956  4963 12990  9 15 45 31  0
 1  4  27648 1571700    0  16496072    0    0    0 522276  3985  4689  5 11 44 39  0
 1  5  27648 1012480    0  17054256    0    0    0 506664  4239  6937  6 13 37 44  0
 1  5  27648  664920    0  17402064    0    0    0 520360 14422 11115  6 15 38 42  0
 1  5  27648  570924    0  17496904    0    0    0 504674 11241  6857  5 13 35 47  0
 3  5  27904  369128    0  17697036    0  340    0 506400 13533  8592  5 16 42 37  0
 1  5  28160  356136    0  17709488    0  156    0 507366  4996  6369  5 12 37 47  0
 1  6  30464  236668    0  17827852    0 2224    0 525008  5298  7279  5 14 32 48  0
 2  5  31232  240608    0  17825784    0  900    0 506971  5072  5265  5 12 39 44  0
 3  4  33280  239268    0  17829956    0 1912    0 521112  9801 46663 13 22 28 38  0
 2  4  35328  199016    0  17868648   32 2180   32 426195  9324 61442 16 26 28 30  0
 4  3  35840  198704    0  17871324    0  592    0 473754  6685 59156 17 21 30 32  0
 4  4  37376  199356    0  17874700    0 1516    0 411058  7274 63749 17 25 32 26  0
 9  3  37632  231044    0  17846520  312  408  312 409954  7237 64255 17 25 31 27  0
 4  5  38656  199052    0  17882168  728  972  728 361001  8018 65690 17 28 29 26  0
 2  4  39168  221864    0  17856652  908  796  908 458940  7835 59056 15 28 33 23  0
```
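
For anyone who wants to reproduce this, a minimal sketch of capturing the counters for the length of a run (the interval and log path are arbitrary choices):

```sh
# Sample memory/swap counters every 5 seconds, timestamped, while the deployment runs.
nohup vmstat -t 5 > /tmp/vmstat-deploy.log 2>&1 &
# ... run the deployment, then inspect the log:
less /tmp/vmstat-deploy.log
```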

This confirms that we are running out of memory - by how much, I'm not sure, but I think we can get this done with less than 16G per master.

dantrainor commented 5 years ago

Note that the memory counters above were captured while the OpenShift installer was running:

```
level=debug msg="State path: /tmp/kni-install-158504556/terraform.tfstate"
level=debug msg="OpenShift Installer 0447ab1fa360139aef298a1a8bc17fdf6302caec"
level=debug msg="Built from commit 0447ab1fa360139aef298a1a8bc17fdf6302caec"
level=info msg="Waiting up to 30m0s for the Kubernetes API at https://api.ostest.test.metalkube.org:6443..."
level=debug msg="Still waiting for the Kubernetes API: Get https://api.ostest.test.metalkube.org:6443/version?timeout=32s: dial tcp 192.168.111.5:6443: connect: connection refused"
level=debug msg="Still waiting for the Kubernetes API: Get https://api.ostest.test.metalkube.org:6443/version?timeout=32s: dial tcp 192.168.111.5:6443: connect: connection refused"
level=debug msg="Still waiting for the Kubernetes API: Get https://api.ostest.test.metalkube.org:6443/version?timeout=32s: dial tcp 192.168.111.5:6443: connect: connection refused"
level=debug msg="Still waiting for the Kubernetes API: Get https://api.ostest.test.metalkube.org:6443/version?timeout=32s: dial tcp 192.168.111.5:6443: connect: connection refused"
level=debug msg="Still waiting for the Kubernetes API: Get https://api.ostest.test.metalkube.org:6443/version?timeout=32s: dial tcp 192.168.111.5:6443: connect: connection refused"
level=debug msg="Still waiting for the Kubernetes API: Get https://api.ostest.test.metalkube.org:6443/version?timeout=32s: dial tcp 192.168.111.5:6443: connect: connection refused"
level=debug msg="Still waiting for the Kubernetes API: Get https://api.ostest.test.metalkube.org:6443/version?timeout=32s: dial tcp 192.168.111.5:6443: connect: connection refused"
level=debug msg="Still waiting for the Kubernetes API: Get https://api.ostest.test.metalkube.org:6443/version?timeout=32s: dial tcp 192.168.111.5:6443: connect: connection refused"
level=debug msg="Still waiting for the Kubernetes API: Get https://api.ostest.test.metalkube.org:6443/version?timeout=32s: dial tcp 192.168.111.5:6443: connect: connection refused"
```

The system starts swapping and things slow down, causing timeouts if not outright OOM kills. This is likely the cause of the transient errors - the "sometimes it works, sometimes it doesn't" failures.
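
A quick way to confirm whether the OOM killer actually fired on the virt host after a failed run (a sketch; exact kernel log wording varies by kernel version):

```sh
# Check for OOM killer activity and current memory/swap pressure on the host.
dmesg -T | grep -iE 'out of memory|oom-killer' || echo "no OOM kills logged"
free -h
```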