fission / fission

Fast and Simple Serverless Functions for Kubernetes
https://fission.io
Apache License 2.0

Fission functions not autoscaling on Minikube #1182

Closed LennertMertens closed 5 years ago

LennertMertens commented 5 years ago

When deploying functions on Minikube, the functions don't scale when they are called with increased HTTP load.

Metrics-server is enabled

Versions: Minikube v1.0.0, Metrics Server v0.2.1

➜  ~ kubectl -n fission-function get hpa -w
NAME                                         REFERENCE                                               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
newdeploy-hello-default-juv8fbdt             Deployment/newdeploy-hello-default-juv8fbdt             <unknown>/80%   1         1         1          25m
newdeploy-serverless-demo-default-sqgblnbc   Deployment/newdeploy-serverless-demo-default-sqgblnbc   <unknown>/80%   1         1         1          18h
➜  ~ kubectl describe hpa newdeploy-hello-default-juv8fbdt -n fission-function
Name:                                                  newdeploy-hello-default-juv8fbdt
Namespace:                                             fission-function
Labels:                                                environmentName=node-env
                                                       environmentNamespace=default
                                                       environmentUid=a335dae9-73d7-11e9-a20a-080027b29483
                                                       executorInstanceId=blxnpe36
                                                       executorType=newdeploy
                                                       functionName=hello
                                                       functionNamespace=default
                                                       functionUid=c938271c-73d7-11e9-a20a-080027b29483
Annotations:                                           <none>
CreationTimestamp:                                     Sat, 11 May 2019 12:30:28 +0200
Reference:                                             Deployment/newdeploy-hello-default-juv8fbdt
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 80%
Min replicas:                                          1
Max replicas:                                          1
Deployment pods:                                       1 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: missing request for cpu
Events:
  Type     Reason                        Age                 From                       Message
  ----     ------                        ----                ----                       -------
  Warning  FailedGetResourceMetric       24m (x6 over 26m)   horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedComputeMetricsReplicas  24m (x6 over 26m)   horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedGetResourceMetric       6m (x74 over 24m)   horizontal-pod-autoscaler  missing request for cpu
  Warning  FailedComputeMetricsReplicas  55s (x94 over 24m)  horizontal-pod-autoscaler  failed to get cpu utilization: missing request for cpu

Has anyone encountered the same issue?
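
The Conditions and Events above boil down to missing request for cpu: the HPA can only compute a CPU percentage for pods whose containers declare a CPU request. One way to check what the function deployment actually requests (a sketch against the deployment named above):

$ kubectl -n fission-function get deploy newdeploy-hello-default-juv8fbdt \
    -o jsonpath='{.spec.template.spec.containers[*].resources}'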

life1347 commented 5 years ago

Hi @LennertMertens did you enable minikube addon metrics-server? https://kubernetes.io/docs/tutorials/hello-minikube/#enable-addons
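
For reference, a sketch of enabling the addon on Minikube:

$ minikube addons enable metrics-server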

LennertMertens commented 5 years ago

@life1347 I did enable it, as you can see in the output:

➜  custom-docker-kubeless minikube addons list
- addon-manager: enabled
- dashboard: disabled
- default-storageclass: enabled
- efk: disabled
- freshpod: disabled
- gvisor: disabled
- heapster: enabled
- ingress: disabled
- logviewer: disabled
- metrics-server: enabled
- nvidia-driver-installer: disabled
- nvidia-gpu-device-plugin: disabled
- registry: disabled
- registry-creds: disabled
- storage-provisioner: enabled
- storage-provisioner-gluster: disabled
➜  custom-docker-kubeless kubectl get svc -n kube-system
NAME                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
heapster              ClusterIP   10.111.56.122   <none>        80/TCP                   3h4m
kube-dns              ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP,9153/TCP   24h
metrics-server        ClusterIP   10.108.8.81     <none>        443/TCP                  24h
monitoring-grafana    NodePort    10.98.164.195   <none>        80:30002/TCP             3h4m
monitoring-influxdb   ClusterIP   10.103.106.64   <none>        8083/TCP,8086/TCP        3h4m
tiller-deploy         ClusterIP   10.110.115.55   <none>        44134/TCP                23h

life1347 commented 5 years ago

kubectl -n fission-function get hpa still shows <unknown>/80% after some time? Normally, <unknown> means either 1) metrics-server is not enabled, or 2) metrics-server is still collecting metrics. Since you already enabled metrics-server, it looks like the problem is the latter.
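
A quick way to check whether metrics are actually being served for the function pods (a sketch; the second command queries the resource-metrics API directly):

$ kubectl top pod -n fission-function
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/fission-function/pods"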

LennertMertens commented 5 years ago

Yes, it's been enabled for a while now. I've tried several things, but I'm still getting this:

➜  ~ kubectl -n fission-function get hpa
NAME                                         REFERENCE                                               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
newdeploy-hello-default-juv8fbdt             Deployment/newdeploy-hello-default-juv8fbdt             <unknown>/80%   1         1         1          4h14m
newdeploy-serverless-demo-default-sqgblnbc   Deployment/newdeploy-serverless-demo-default-sqgblnbc   <unknown>/80%   1         1         1          22h
life1347 commented 5 years ago

Could you share the output of kubectl get pod -n kube-system?

life1347 commented 5 years ago

I think I've found the problem:

No metrics for pod fission-function/newdeploy-n1-default-n3n2vtex-579f6dcd88-7bpvs

I'll take a look and reply to this thread later.
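
That No metrics for pod message comes from the metrics-server logs; a sketch of pulling them, assuming the addon's pods carry the usual k8s-app=metrics-server label:

$ kubectl logs -n kube-system -l k8s-app=metrics-server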

LennertMertens commented 5 years ago

Here's the output you requested, and yes, I'm getting the same message:

➜  hello-openfaas kubectl get pod -n kube-system
NAME                               READY   STATUS    RESTARTS   AGE
coredns-fb8b8dccf-8789v            1/1     Running   0          29h
coredns-fb8b8dccf-m5v9z            1/1     Running   0          29h
etcd-minikube                      1/1     Running   0          29h
kube-addon-manager-minikube        1/1     Running   0          29h
kube-apiserver-minikube            1/1     Running   0          29h
kube-controller-manager-minikube   1/1     Running   0          29h
kube-proxy-txk8b                   1/1     Running   0          29h
kube-scheduler-minikube            1/1     Running   0          29h
metrics-server-77fddcc57b-xbnvf    1/1     Running   4          29h
storage-provisioner                1/1     Running   0          29h
tiller-deploy-8458f6c667-697h4     1/1     Running   0          28h

Thanks!

life1347 commented 5 years ago

I tested with the HPA example from the official docs. The HPA shows <unknown> at first:

$ kubectl get hpa
NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   <unknown>/50%   1         10        1          3m3s
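
For reference, a sketch of that walkthrough's setup (commands as given in the docs of that Kubernetes generation; kubectl run generators have since been removed):

$ kubectl run php-apache --image=k8s.gcr.io/hpa-example --requests=cpu=200m --expose --port=80
$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10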

However, after some time, the <unknown> changes to the actual CPU utilization, even though metrics-server keeps logging No metrics for pod:

$ kubectl get hpa
NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   419%/50%   1         10        8          4m52s

I'm not very familiar with how metrics-server works internally, but I think the problem is that the CPU usage of the function pod is too low, so metrics-server can't collect enough data points to calculate the average CPU usage.

To trigger the autoscaling, you may need to:

  1. figure out some way to generate traffic to the function (see the sketch below)
  2. set --maxscale to a value larger than 1 when creating the function, see here
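
As a sketch of both steps (the endpoint, file name, and numbers here are placeholders; the flags follow the Fission CLI of that era):

# generate sustained HTTP load against the function (endpoint is a placeholder)
$ kubectl run -i --tty load-generator --image=busybox /bin/sh
/ # while true; do wget -q -O- http://<function-endpoint>; done

# recreate the function with room to scale and explicit CPU requests
$ fission fn create --name hello --env node-env --code hello.js \
    --executortype newdeploy --minscale 1 --maxscale 3 \
    --mincpu 100 --maxcpu 500 --targetcpu 80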

Since this issue is not related to Fission itself, I'm going to close it. Feel free to reopen it.