elotl / kip

Virtual-kubelet provider running pods in cloud instances
Apache License 2.0

GCE Custom Instance Selector: Invalid amount of memory chosen #171

Closed justnoise closed 4 years ago

justnoise commented 4 years ago

Looks like the instanceSelector creates invalid custom instances in GCE. When running a pod in GCE with the following resource requests/limits:

resources:
  limits:
    cpu: "1"
    memory: 500Mi
  requests:
    cpu: 100m
    memory: 200Mi

We get the following error:

E0827 23:32:46.151305       1 node_controller.go:287] Error in node start: startup error: googleapi: Error 400: Invalid value for field 'resource.machineType': 'https://www.googleapis.com/compute/v1/projects/elotl-dev/zones/us-west1-b/machineTypes/n1-custom-1-921'. Memory should be a multiple of 256MiB, while 921MiB is requested, invalid
I0827 23:32:46.151387       1 node_registry.go:218] Purging node &{{Node v1} {69ed9da5-87e0-4ac2-a5fc-b14bc7208def map[] 2020-08-27 23:32:45.33021287 +0000 UTC <nil> map[] 3bee1f5f-0c4e-49cd-9410-ee3ea9102288 default} {n1-custom-1-921 elotl-kip-latest false false {1.00 0.49Gi  10G false 0xc00080b2cc false <nil>}} {Creating  [] kube-system_fluentd-gke-2x2bt}}

I've confirmed this with the following test case in TestGCEResourcesToInstanceType:

{
    Resources:    api.ResourceSpec{Memory: "0.5Gi", CPU: "1.0"},
    instanceType: "n1-custom-1-921",
},

This matters because one of the GKE DaemonSets (fluentd-gke, per the pod name in the log above) creates a pod with these resource limits.

ldx commented 4 years ago

That 921 is 0.9 GB of memory expressed in MiB (0.9 × 1024 ≈ 921), and it comes from the script that generates instance data: https://github.com/elotl/kip/blob/master/scripts/create_instance_data/create_gce_instance_data.py#L33

https://cloud.google.com/compute/docs/instances/creating-instance-with-custom-machine-type#n1_custom_machine_types says "For N1 machine types, select between 0.9 GB and 6.5 GB per vCPU, inclusive", but apparently there is an additional requirement: the memory must still be a multiple of 0.25 GB (256 MiB), as the error above shows.
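
To make the two constraints concrete, here is a minimal Go sketch of how a selector could pick a valid N1 custom memory size: clamp the request to the 0.9 GB per-vCPU minimum, round up to the next multiple of 256 MiB, and reject anything above 6.5 GB per vCPU. The names `n1CustomMemoryMiB` and `n1CustomInstanceType` are hypothetical and are not taken from the kip code base. Rounding the 0.9 GB minimum up to a whole MiB is an assumption, and other N1 rules (such as the allowed vCPU counts) are not checked here.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	memoryStepMiB = 256  // GCE error above: memory must be a multiple of 256MiB
	maxMiBPerVCPU = 6656 // 6.5 GB per vCPU, from the GCE docs quoted above
)

// n1CustomMemoryMiB is a hypothetical helper: it returns the smallest memory
// size in MiB that covers requestedMiB and satisfies the N1 custom constraints.
func n1CustomMemoryMiB(vcpus, requestedMiB int) (int, error) {
	if vcpus <= 0 {
		return 0, errors.New("vcpus must be positive")
	}
	// 0.9 GB per vCPU, rounded up to a whole MiB (assumption: be conservative).
	minMiB := (9*1024*vcpus + 9) / 10
	mem := requestedMiB
	if mem < minMiB {
		mem = minMiB
	}
	// Round up to the next multiple of 256 MiB.
	mem = ((mem + memoryStepMiB - 1) / memoryStepMiB) * memoryStepMiB
	if mem > maxMiBPerVCPU*vcpus {
		return 0, fmt.Errorf("%d MiB exceeds the N1 maximum for %d vCPU(s)", mem, vcpus)
	}
	return mem, nil
}

// n1CustomInstanceType builds a name in the n1-custom-<vcpus>-<MiB> form
// that appears in the error message above.
func n1CustomInstanceType(vcpus, requestedMiB int) (string, error) {
	mem, err := n1CustomMemoryMiB(vcpus, requestedMiB)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("n1-custom-%d-%d", vcpus, mem), nil
}

func main() {
	// The failing case from this issue: 1 vCPU with 0.5Gi (512 MiB) requested.
	name, err := n1CustomInstanceType(1, 512)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```

Under these assumptions, the 1 vCPU / 0.5Gi request from the test case maps to `n1-custom-1-1024` instead of `n1-custom-1-921`. Rounding up rather than down keeps the result at or above both the pod's request and the per-vCPU minimum.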