Sounds like the API must have changed. We will need to look at it and figure out the fix.
K8s 1.9 made v2beta1 of autoscaling the default version (https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md/#other-notable-changes-11), but kops didn't seem to do this: https://github.com/kubernetes/kops/blob/master/docs/horizontal_pod_autoscaling.md#support-for-multiple-metrics Since deis is creating an autoscaling/v1 definition, I'm confused as to why it would be failing. K8s shouldn't be making breaking changes to an existing api...
@itskingori I saw you wrote the kops horizontal pod autoscaler docs, do you have time to review and comment on this?
@dmcnaught If you're using kops 1.9.0, then the relevant flags should be set already. Make sure you have this though:
spec:
  kubeControllerManager:
    horizontalPodAutoscalerUseRestClients: true
  kubeAPIServer:
    runtimeConfig:
      autoscaling/v2beta1: "true"
That said, what version of metrics server do you have installed? I hope it's this guy.
@itskingori - thanks for taking a look.
Since we don't want to use multiple metrics, I thought we could just continue using the autoscaling/v1 and not set that section on our cluster spec. Is that correct?
Yes - we are adding kubectl apply -f https://raw.githubusercontent.com/kubernetes/kops/master/addons/metrics-server/v1.8.x.yaml - although the docs seem to indicate that we shouldn't need it - I opened a ticket for that: https://github.com/kubernetes/kops/issues/5033
> Since we don't want to use multiple metrics, I thought we could just continue using the autoscaling/v1 and not set that section on our cluster spec. Is that correct?
Right. Correct.
the HPA controller was unable to get the target's current scale: no matches for /, Kind=Deployment
This does seem like a strange error to me. 🤔
@dmcnaught Could you share the apiVersion of your deployment (i.e. test-cmd)? This part ...
apiVersion: apps/v1beta2
kind: Deployment
And the HPA, this part ...
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
And the API versions enabled on your cluster? In my case I have this ...
$ kubectl api-versions
apiextensions.k8s.io/v1beta1
apiregistration.k8s.io/v1beta1
apps/v1beta1
apps/v1beta2
authentication.k8s.io/v1
authentication.k8s.io/v1beta1
authorization.k8s.io/v1
authorization.k8s.io/v1beta1
autoscaling/v1
autoscaling/v2beta1
batch/v1
batch/v1beta1
batch/v2alpha1
certificates.k8s.io/v1beta1
extensions/v1beta1
metrics.k8s.io/v1beta1
networking.k8s.io/v1
policy/v1beta1
rbac.authorization.k8s.io/v1
rbac.authorization.k8s.io/v1alpha1
rbac.authorization.k8s.io/v1beta1
storage.k8s.io/v1
storage.k8s.io/v1beta1
v1
Make sure your deployment's apiVersion is enabled. 🤔
Sure.
>kubectl -n test get deployment -oyaml
apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: Deployment
kubectl -n test get hpa -oyaml
apiVersion: v1
items:
- apiVersion: autoscaling/v1
  kind: HorizontalPodAutoscaler
  metadata:
    annotations:
      autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"False","lastTransitionTime":"2018-04-20T14:34:45Z","reason":"FailedGetScale","message":"the
        HPA controller was unable to get the target''s current scale: no matches for
        /, Kind=Deployment"}]'
    creationTimestamp: 2018-04-20T14:34:15Z
    labels:
      app: test
      heritage: deis
      type: cmd
    name: test-cmd
    namespace: test
    resourceVersion: "322851"
    selfLink: /apis/autoscaling/v1/namespaces/test/horizontalpodautoscalers/test-cmd
    uid: e684fe85-44a7-11e8-bdc0-0eede6d2047e
  spec:
    maxReplicas: 4
    minReplicas: 1
    scaleTargetRef:
      kind: Deployment
      name: test-cmd
    targetCPUUtilizationPercentage: 10
kubectl api-versions
admissionregistration.k8s.io/v1beta1
apiextensions.k8s.io/v1beta1
apiregistration.k8s.io/v1beta1
apps/v1
apps/v1beta1
apps/v1beta2
authentication.k8s.io/v1
authentication.k8s.io/v1beta1
authorization.k8s.io/v1
authorization.k8s.io/v1beta1
autoscaling/v1
autoscaling/v2beta1
batch/v1
batch/v1beta1
certificates.k8s.io/v1beta1
events.k8s.io/v1beta1
extensions/v1beta1
metrics.k8s.io/v1beta1
monitoring.coreos.com/v1
networking.k8s.io/v1
policy/v1beta1
rbac.authorization.k8s.io/v1
rbac.authorization.k8s.io/v1beta1
storage.k8s.io/v1
storage.k8s.io/v1beta1
v1
@dmcnaught Your HPA's scaleTargetRef doesn't have an apiVersion for the Deployment, see:
- apiVersion: autoscaling/v1
  kind: HorizontalPodAutoscaler
  spec:
    scaleTargetRef:
      kind: Deployment
      name: test-cmd
I'm thinking that since you're using 1.9 the default API group/version for Deployment is not extensions/v1beta1 (which is what you have on your deployment). Maybe try one of these:
1. Change your deployment to apiVersion: apps/v1beta2 (because that's the API group/version that the HPA will look for).
2. Edit your HPA's scaleTargetRef to set apiVersion: extensions/v1beta1 so that the HPA looks for the deployment in that group/version (which is where your deployment is at the moment) - see the sketch below this list.
Also check these out:
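For option 2, the full HPA would look roughly like this (just a sketch reusing the test-cmd names from above, not necessarily what deis generates):
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: test-cmd
  namespace: test
spec:
  minReplicas: 1
  maxReplicas: 4
  targetCPUUtilizationPercentage: 10
  scaleTargetRef:
    # pin the reference to the group/version the Deployment is actually served under
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: test-cmd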
Hi @itskingori,
In my original post I mentioned that I added apiVersion: extensions/v1beta1 to scaleTargetRef and it didn't help. I tried it again though and this time it worked... maybe I didn't wait long enough last time - so that works.
For the first recommendation - do you mean change the deployment from apiVersion: extensions/v1beta1 to apiVersion: apps/v1beta2, or should I still keep it as extensions rather than apps...?
Thanks for figuring it out!
@dmcnaught ...
> For the first recommendation - do you mean change the deployment from apiVersion: extensions/v1beta1 to apiVersion: apps/v1beta2 ...
Either. Theoretically both should work. What we're seeking here is for the HPA to reference the Deployment under the same API group/version that the Deployment is defined in. The HPA cannot be looking for the Deployment in an API group/version that it is not in.
> ... - or should I still keep it as extensions rather than apps...?
I recommend changing your Deployment to the apiVersion that your Kubernetes version uses by default. For 1.9 it looks like that's actually apps/v1.
In 1.8, Deployments graduated to apps/v1beta2:
In the 1.8 release, we introduce the apps/v1beta2 API group and version. This beta version of the core Workloads API contains the Deployment, DaemonSet, ReplicaSet, and StatefulSet kinds, and it is the version we plan to promote to GA in the 1.9 release provided the feedback is positive.
In 1.9, Deployments graduated to the apps/v1 group version, but apps/v1beta2 is still supported:
In the 1.9 release, we plan to introduce the apps/v1 group version. We intend to promote the apps/v1beta2 group version in its entirety to apps/v1 and to deprecate apps/v1beta2 at that time.
Your extensions/v1beta1 Deployment is working because each version maintains backwards compatibility, for a time:
We realize that even after the release of apps/v1, users will need time to migrate their code from extensions/v1beta1, apps/v1beta1, and apps/v1beta2. It is important to remember that the minimum support durations listed in the deprecations guidelines are minimums. We will continue to support conversion between groups and versions until users have had sufficient time to migrate.
See https://v1-10.docs.kubernetes.io/docs/reference/workloads-18-19/ for details.
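To make that concrete: on 1.9 the end state would be the Deployment declared as apps/v1 and the HPA pointing at that same group/version. It's the same shape as the earlier sketch, only with both sides on the default group/version (again using the test-cmd names from above):
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: test-cmd
  namespace: test
spec:
  minReplicas: 1
  maxReplicas: 4
  targetCPUUtilizationPercentage: 10
  scaleTargetRef:
    # matches the group/version the Deployment itself is now declared in
    apiVersion: apps/v1
    kind: Deployment
    name: test-cmd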
Thanks! Since we've been using deis to deploy most of our apps, I haven't kept up with API versions very well. The interaction here between the HPA and Deployments means that backwards compatibility is a little more complex in this case.
@Cryptophobia I think this issue could open a can of worms that will need addressing sooner rather than later: many of the apiVersions used by deis are probably deprecated now (like in this issue's workaround), and hephy should be updated to use the latest k8s apiVersions... probably one of the highest priority tasks. @kingdonb
In the latest kubectl versions we've heard that kubectl api-versions returns just v1... as in literally api: map[v1:{}] -- no values in the map at all, just a single v1 key. This is what you should expect to find on k8s clusters at >v1.10.
This PR was an approach that didn't always work: https://github.com/teamhephy/controller/pull/72/files because GKE does not reliably report on KubeVersion in a semver compliant way.
Check out this one: https://github.com/teamhephy/controller/pull/73/files
This is what's in Hephy v2.19.4 now. It reads the contents of .Capabilities.APIVersions directly and uses the same approach we found in Prometheus Operator. This has worked on every cluster we've tested against.
https://github.com/coreos/prometheus-operator/issues/1714
This still seems to be the same approach they're using today: https://github.com/coreos/prometheus-operator/blob/master/helm/alertmanager/templates/psp-clusterrole.yaml
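In template terms the general pattern looks something like this (a rough sketch of a .Capabilities.APIVersions guard; this is not the exact Hephy or Prometheus Operator template, and the group/versions and names shown are only examples):
{{- if .Capabilities.APIVersions.Has "apps/v1" }}
apiVersion: apps/v1
{{- else }}
# fall back for clusters that don't serve apps/v1 yet
apiVersion: extensions/v1beta1
{{- end }}
kind: Deployment
metadata:
  name: example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: example
        image: nginx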
Does that help @dmcnaught ? 🥇
I'm still seeing the problem with deis autoscale not working in hephy 2.19.4 on K8s 1.10 - both fresh installs @kingdonb
So there may be more places that opt into or check for particular API versions than we've identified. I'm thinking they will be in the code, rather than inside of the chart templates.
@dmcnaught :+1:
Looks like the issue here was not that Hephy was using the wrong HPA API version, but that the Deployment API's compatibility has changed.
In the controller's HPA code we have this:
def api_version(self):
    # API location changes between versions
    # http://kubernetes.io/docs/user-guide/horizontal-pod-autoscaling/#api-object
    if self.version() >= parse("1.3.0"):
        return 'autoscaling/v1'

    # 1.2 and older
    return 'extensions/v1beta1'
HPA is already assuming autoscaling/v1.
Looks like in the controller Deployment code we have this:
class Deployment(Resource):
    api_prefix = 'apis'
    api_version = 'extensions/v1beta1'
Sounds like this will need to be migrated to apps/v1. What are the implications in terms of backwards compatibility and such? I am really not sure.
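One implication I'm aware of: apps/v1 is only served on Kubernetes 1.9+, and unlike extensions/v1beta1 it requires spec.selector to be set explicitly (and the selector is immutable), where the older group/version defaulted it from the pod template labels. So any Deployment the controller creates would need to look roughly like this (a hand-written sketch with made-up labels and image, not actual controller output):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-cmd
  namespace: test
spec:
  # required in apps/v1; extensions/v1beta1 defaulted this from the template labels
  selector:
    matchLabels:
      app: test
      type: cmd
  template:
    metadata:
      labels:
        app: test
        type: cmd
    spec:
      containers:
      - name: test-cmd
        image: example/test-cmd:latest   # placeholder image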
Sounds like a feature for v2.20 (it's a breaking change, I'm comfortable breaking compatibility with older than k8s v1.3 at this point, if that's what this means... unless anyone has strong objections...).
Alternatively, if there is someone who maintains a much older k8s cluster with horizontal pod autoscaling and wants to help us confirm that we haven't broken it with the new release, then we could do that. My oldest cluster is at v1.5, and actually it looks like we just need a cluster <v1.9, so that might work, since if I'm reading this right, that's when Deployment became apps/v1.
I don't have strong feelings about maintaining backwards compatibility forever. I'd much prefer that somebody forces me to upgrade my v1.5 cluster. The only reason I kept it at v1.5 was because I feared the implications of turning on RBAC... and at this time, even standard basic minikube installs come with RBAC enabled. (That v1.5 cluster wasn't running workflow either, so I wouldn't be harmed or feeling "put out" by such a change in the slightest.)
I think we would be better served by making the small breaking change and adding a prominent note about the minimum supported version being at v1.9, pushing our users toward the future. I consider that it is a relatively recent version, but on the other hand, the oldest version that you can even request to install on GKE today is v1.9.6.
> I think we would be better served by making the small breaking change and adding a prominent note about the minimum supported version being at v1.9, pushing our users toward the future. I consider that it is a relatively recent version, but on the other hand, the oldest version that you can even request to install on GKE today is v1.9.6.
I think this is good. We should make notes about what minimum version we support if we go forward and break backwards compatibility for sure.
This is fixed in https://github.com/teamhephy/controller/pull/106
When I create an hpa with deis autoscale I get this error:
When I set the autoscale with kubernetes, I don't get that error:
When I get the hpa -oyaml, the difference seems to be:
spec:
  maxReplicas: 4
  minReplicas: 1
  scaleTargetRef:
    apiVersion: extensions/v1beta1
and when I add that line to the deis-created hpa, it doesn't fix the problem...