kubernetes-sigs / external-dns

Configure external DNS servers (AWS Route53, Google CloudDNS and others) for Kubernetes Ingresses and Services
Apache License 2.0

ExternalDNS does not support K8s 1.22 #2168

Closed robbiezhang closed 3 years ago

robbiezhang commented 3 years ago

What would you like to be added: Support k8s 1.22

Why is this needed: k8s 1.22 stops serving a couple of deprecated APIs, including Ingress in extensions/v1beta1: https://kubernetes.io/docs/reference/using-api/deprecation-guide/#ingress-v122

I haven't verified this, but I suspect that external-dns will stop working, since it uses the IngressInformer from k8s.io/client-go/informers/extensions/v1beta1:

https://github.com/kubernetes-sigs/external-dns/blob/c3a28ecc65af1b1e597546a571e801954d827108/source/ingress.go#L33
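The change needed is essentially to move the informer from the extensions/v1beta1 group to networking.k8s.io/v1. A rough sketch of what that would look like — illustrative only, not the actual patch, and assuming the informer comes from a standard client-go `SharedInformerFactory` (the `informerFactory` name here is hypothetical):

```diff
 import (
-	extinformers "k8s.io/client-go/informers/extensions/v1beta1"
+	netinformers "k8s.io/client-go/informers/networking/v1"
 )

-// old: Ingress informer from the deprecated extensions/v1beta1 group
-ingressInformer := informerFactory.Extensions().V1beta1().Ingresses()
+// new: Ingress informer from networking.k8s.io/v1 (served since 1.19)
+ingressInformer := informerFactory.Networking().V1().Ingresses()
```

Every lister/typed reference to `v1beta1.Ingress` would need the corresponding bump as well.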

spirosoik commented 3 years ago

That's true, but 1.22 is still only a release candidate.

maybe-sybr commented 3 years ago

I can confirm that using a 1.22 k8s server breaks with the following output:

```
sh-5.1$ kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"archive", BuildDate:"2021-03-30T00:00:00Z", GoVersion:"go1.16", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.0-beta.0", GitCommit:"a3f24e8459465495738af1b9cc6c3db80696e3c1", GitTreeState:"clean", BuildDate:"2021-06-22T21:00:26Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
sh-5.1$ kubectl get pods -A | grep external-dns
kube-system   external-dns-5ffddbb8fc-l6ztj                     0/1     CrashLoopBackOff   3 (11s ago)   5m10s
sh-5.1$ kubectl logs -n kube-system   external-dns-5ffddbb8fc-l6ztj
time="2021-08-04T01:45:38Z" level=info msg="config: {APIServerURL: KubeConfig: RequestTimeout:30s ContourLoadBalancerService:heptio-contour/contour GlooNamespace:gloo-system SkipperRouteGroupVersion:zalando.org/v1 Sources:[service ingress] Namespace: AnnotationFilter: LabelFilter: FQDNTemplate: CombineFQDNAndAnnotation:false IgnoreHostnameAnnotation:false IgnoreIngressTLSSpec:false Compatibility: PublishInternal:false PublishHostIP:false AlwaysPublishNotReadyAddresses:false ConnectorSourceServer:localhost:8080 Provider:coredns GoogleProject: GoogleBatchChangeSize:1000 GoogleBatchChangeInterval:1s DomainFilter:[] ExcludeDomains:[] RegexDomainFilter: RegexDomainExclusion: ZoneNameFilter:[] ZoneIDFilter:[] AlibabaCloudConfigFile:/etc/kubernetes/alibaba-cloud.json AlibabaCloudZoneType: AWSZoneType: AWSZoneTagFilter:[] AWSAssumeRole: AWSBatchChangeSize:1000 AWSBatchChangeInterval:1s AWSEvaluateTargetHealth:true AWSAPIRetries:3 AWSPreferCNAME:false AWSZoneCacheDuration:0s AzureConfigFile:/etc/kubernetes/azure.json AzureResourceGroup: AzureSubscriptionID: AzureUserAssignedIdentityClientID: BluecatConfigFile:/etc/kubernetes/bluecat.json CloudflareProxied:false CloudflareZonesPerPage:50 CoreDNSPrefix:/skydns/ RcodezeroTXTEncrypt:false AkamaiServiceConsumerDomain: AkamaiClientToken: AkamaiClientSecret: AkamaiAccessToken: AkamaiEdgercPath: AkamaiEdgercSection: InfobloxGridHost: InfobloxWapiPort:443 InfobloxWapiUsername:admin InfobloxWapiPassword: InfobloxWapiVersion:2.3.1 InfobloxSSLVerify:true InfobloxView: InfobloxMaxResults:0 DynCustomerName: DynUsername: DynPassword: DynMinTTLSeconds:0 OCIConfigFile:/etc/kubernetes/oci.yaml InMemoryZones:[] OVHEndpoint:ovh-eu OVHApiRateLimit:20 PDNSServer:http://localhost:8081 PDNSAPIKey: PDNSTLSEnabled:false TLSCA: TLSClientCert: TLSClientCertKey: Policy:upsert-only Registry:txt TXTOwnerID:default TXTPrefix: TXTSuffix: Interval:1m0s MinEventSyncInterval:5s Once:false DryRun:false UpdateEvents:false LogFormat:text MetricsAddress::7979 LogLevel:debug TXTCacheInterval:0s TXTWildcardReplacement: ExoscaleEndpoint:https://api.exoscale.ch/dns ExoscaleAPIKey: ExoscaleAPISecret: CRDSourceAPIVersion:externaldns.k8s.io/v1alpha1 CRDSourceKind:DNSEndpoint ServiceTypeFilter:[] CFAPIEndpoint: CFUsername: CFPassword: RFC2136Host: RFC2136Port:0 RFC2136Zone: RFC2136Insecure:false RFC2136GSSTSIG:false RFC2136KerberosRealm: RFC2136KerberosUsername: RFC2136KerberosPassword: RFC2136TSIGKeyName: RFC2136TSIGSecret: RFC2136TSIGSecretAlg: RFC2136TAXFR:false RFC2136MinTTL:0s NS1Endpoint: NS1IgnoreSSL:false NS1MinTTLSeconds:0 TransIPAccountName: TransIPPrivateKeyFile: DigitalOceanAPIPageSize:50 ManagedDNSRecordTypes:[A CNAME] GoDaddyAPIKey: GoDaddySecretKey: GoDaddyTTL:0 GoDaddyOTE:false}"
time="2021-08-04T01:45:38Z" level=info msg="Instantiating new Kubernetes client"
time="2021-08-04T01:45:38Z" level=debug msg="apiServerURL: "
time="2021-08-04T01:45:38Z" level=debug msg="kubeConfig: "
time="2021-08-04T01:45:38Z" level=info msg="Using inCluster-config based on serviceaccount-token"
time="2021-08-04T01:45:38Z" level=info msg="Created Kubernetes client https://10.0.0.1:443"
time="2021-08-04T01:46:39Z" level=fatal msg="failed to sync cache: timed out waiting for the condition"
```
Deployed using a YAML doc from `helm template` using the bitnami chart:

```yaml
---
# Source: external-dns/templates/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
  namespace: kube-system
  labels:
    app.kubernetes.io/name: external-dns
    helm.sh/chart: external-dns-5.2.3
    app.kubernetes.io/instance: external-dns
    app.kubernetes.io/managed-by: Helm
automountServiceAccountToken: true
---
# Source: external-dns/templates/clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-dns
  labels:
    app.kubernetes.io/name: external-dns
    helm.sh/chart: external-dns-5.2.3
    app.kubernetes.io/instance: external-dns
    app.kubernetes.io/managed-by: Helm
rules:
  - apiGroups:
      - ""
    resources:
      - services
      - pods
      - nodes
      - endpoints
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
      - "networking.k8s.io"
      - getambassador.io
    resources:
      - ingresses
      - hosts
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - route.openshift.io
    resources:
      - routes
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - networking.istio.io
    resources:
      - gateways
      - virtualservices
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - zalando.org
    resources:
      - routegroups
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - zalando.org
    resources:
      - routegroups/status
    verbs:
      - patch
      - update
  - apiGroups:
      - projectcontour.io
    resources:
      - httpproxies
    verbs:
      - get
      - watch
      - list
  - apiGroups:
      - gloo.solo.io
      - gateway.solo.io
    resources:
      - proxies
      - virtualservices
    verbs:
      - get
      - list
      - watch
---
# Source: external-dns/templates/clusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: external-dns
  labels:
    app.kubernetes.io/name: external-dns
    helm.sh/chart: external-dns-5.2.3
    app.kubernetes.io/instance: external-dns
    app.kubernetes.io/managed-by: Helm
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
  - kind: ServiceAccount
    name: external-dns
    namespace: kube-system
---
# Source: external-dns/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: external-dns
  namespace: kube-system
  labels:
    app.kubernetes.io/name: external-dns
    helm.sh/chart: external-dns-5.2.3
    app.kubernetes.io/instance: external-dns
    app.kubernetes.io/managed-by: Helm
spec:
  ports:
    - name: http
      port: 7979
      protocol: TCP
      targetPort: http
  selector:
    app.kubernetes.io/name: external-dns
    app.kubernetes.io/instance: external-dns
  type: ClusterIP
---
# Source: external-dns/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  namespace: kube-system
  labels:
    app.kubernetes.io/name: external-dns
    helm.sh/chart: external-dns-5.2.3
    app.kubernetes.io/instance: external-dns
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: external-dns
      app.kubernetes.io/instance: external-dns
  template:
    metadata:
      labels:
        app.kubernetes.io/name: external-dns
        helm.sh/chart: external-dns-5.2.3
        app.kubernetes.io/instance: external-dns
        app.kubernetes.io/managed-by: Helm
      annotations:
    spec:
      securityContext:
        fsGroup: 1001
        runAsUser: 1001
      affinity:
        podAffinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/name: external-dns
                    app.kubernetes.io/instance: external-dns
                namespaces:
                  - "kube-system"
                topologyKey: kubernetes.io/hostname
              weight: 1
        nodeAffinity:
      serviceAccountName: external-dns
      containers:
        - name: external-dns
          image: "docker.io/bitnami/external-dns:0.8.0-debian-10-r73"
          imagePullPolicy: "IfNotPresent"
          args:
            # Generic arguments
            - --log-level=debug
            - --log-format=text
            - --policy=upsert-only
            - --provider=coredns
            - --registry=txt
            - --interval=1m
            - --source=service
            - --source=ingress
          env:
            # CoreDNS environment variables
            - name: ETCD_URLS
              value: "http://10.4.2.251:2379"
          ports:
            - name: http
              containerPort: 7979
          readinessProbe:
            failureThreshold: 6
            httpGet:
              path: /healthz
              port: http
            initialDelaySeconds: 5
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          livenessProbe:
            failureThreshold: 2
            httpGet:
              path: /healthz
              port: http
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          resources:
            limits: {}
            requests: {}
          volumeMounts:
      volumes:
```

Given that the 1.22 release is imminent (i.e. today) [0], it'd be great if this could be fixed. I'm also keeping an eye on some misbehaviours over in kubernetes/ingress-nginx, and they've made the choice to drop support for < 1.19 in their 1.0.0 ingress controller release. I imagine a similar choice will need to be considered here.

[0] https://github.com/kubernetes/sig-release/tree/master/releases/release-1.22

maybe-sybr commented 3 years ago

ping @seanmalloy @njuettner @Raffo, looks like you are active maintainers here.

There's potentially relevant previous discussion at #1931, which was closed since k3s was possibly removing stuff early. Not sure if that misbehaviour was related to this ingress informer issue, though. A handful of other changes preparing for 1.22 have been merged in #2001, #2012, #2120. Any chance of getting this one sorted and cutting a 0.9 or 1.0 release?

Raffo commented 3 years ago

I am not sure that this is related to the new API version actually. The error in your example seems permission related 🤔

scholzj commented 3 years ago

@Raffo I ran into the same problem with 0.8.0 (didn't try 0.9.0 yet, since its container image was not available). The same RBAC configuration worked fine for me on 1.21 but shows this error on 1.22.

alainvanhoof commented 3 years ago

After upgrading a working 1.21.2 cluster to 1.22, external-dns started showing the above-mentioned error and stopped working. It looks like source/ingress.go is still requesting the v1beta1 API while only v1 is available (`extinformers "k8s.io/client-go/informers/extensions/v1beta1"`).

maybe-sybr commented 3 years ago

> I am not sure that this is related to the new API version actually. The error in your example seems permission related 🤔

Any tips on how I could get some further information to help diagnose the issue, @Raffo? I've got the source but have been a bit reluctant to start trying to build it with extra debugging output. The current usage of the v1beta1 ingress interface is definitely going to cause issues, but it's entirely possible there are other things getting in the way as well.

My current suspicion is that this code is to blame: https://github.com/kubernetes-sigs/external-dns/blob/c88486e9c7c193b00d7560ff630b66e75f915164/source/ingress.go#L99-L104, given the informer set up here: https://github.com/kubernetes-sigs/external-dns/blob/c88486e9c7c193b00d7560ff630b66e75f915164/source/ingress.go#L85

However, that error message text is used by all source implementations (`grep -r 'failed to sync cache'` shows a bunch of files under source/). For reference, I'm using both the ingress and service sources for my external-dns deployment. That's the default for the bitnami chart, and I'm not overriding it:

https://github.com/bitnami/charts/blob/34f27c7d8dc3fd19ef88f04159ab4b92381a78b5/bitnami/external-dns/values.yaml#L61-L67

rjosephwright commented 3 years ago

I did a quick fix for this at https://github.com/rjosephwright/external-dns/commit/52823b41c1c448d071a8dd05a9d1dbc487899129. It's not perfect as it has introduced a failure in the tests for the gloo proxy source, which I didn't have time to troubleshoot. I did verify that I am not seeing the "failed to sync cache" issue with this change.

Raffo commented 3 years ago

@rjosephwright I am working on a general bump of dependencies and will look into this as well, thank you for providing more information. It will take a bit to be fixed (ExternalDNS isn't really my daily job 😅 ) so for now I recommend avoiding 1.22.

I will rename this issue and pin it for maximum visibility.

agilgur5 commented 3 years ago

> The error in your example seems permission related

For reference, this same error seems to get printed when a CRD doesn't exist. https://github.com/kubernetes-sigs/external-dns/issues/961#issuecomment-881967776 suggested this was the case when you add Istio sources but the Istio CRDs don't exist, and my team has been able to confirm it. (Mentioned this in https://github.com/kubernetes-sigs/external-dns/issues/2100#issuecomment-888365851 as well.)

So if that error covers both permission problems and non-existent resources, it would make sense for this issue as well, since Ingress under extensions/v1beta1 no longer exists in k8s v1.22.

mattmichal commented 3 years ago

Related to this is the lack of support for the ingressClassName field, mentioned in #1792. The annotation used today for finding eligible ingress resources has been deprecated since 1.18.
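For illustration, here is a hypothetical manifest (example names only) showing both mechanisms side by side — the deprecated annotation that external-dns currently matches on, and the `spec.ingressClassName` field that replaced it in networking.k8s.io/v1:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example                            # hypothetical name
  annotations:
    kubernetes.io/ingress.class: nginx     # deprecated since 1.18
spec:
  ingressClassName: nginx                  # the replacement field
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app
                port:
                  number: 80
```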

mstrohl commented 3 years ago

Hello, thanks to @rjosephwright I built an image (skipping the tests), since I saw that my source (OVH) still works despite the dependency errors. Thanks @Raffo for the time you're spending on this, because there are a lot of impacts. For those who can't wait, an image that could possibly work is available at smartdeploy/external-dns:0.0.1. It has been built from the @rjosephwright commit.

masterkain commented 3 years ago

@mstrohl I had some success with the smartdeploy/external-dns:0.0.1 image. I only get some slight spamming of

```
W0911 12:20:13.950956 1 transport.go:260] Unable to cancel request for *instrumented_http.Transport
```

but besides that I was able to update my records on Cloudflare, thanks.

alainvanhoof commented 3 years ago

Built the current PR https://github.com/kubernetes-sigs/external-dns/pull/2281, which indeed works with 1.22.1 and PowerDNS, but it has the same `W0911 12:20:13.950956 1 transport.go:260] Unable to cancel request for *instrumented_http.Transport` warning as smartdeploy/external-dns:0.0.1.

maybe-sybr commented 3 years ago

Thanks for getting #2281 merged, @Raffo and @andrewstuart

mstrohl commented 3 years ago

Updating the pod with this image is working well: `docker pull smartdeploy/external-dns:latest` (made from the https://github.com/kubernetes-sigs/external-dns/pull/2281 commit on master).

ghost commented 3 years ago

FYI for others in a similar situation: Ingresses using v1beta1 (including those still supported on <= 1.21, though deprecated) are no longer watched by external-dns due to this backwards-incompatible change.

I found this while spending some time troubleshooting an update of the external-dns image to 0.10.0; there were no notices, docs, or comments on the PR saying that it only supports v1.19+-compatible Ingresses from now on.

So in other words, the releases including this change (v0.10.0+) have a minimum required K8s version of v1.19 wherever Ingresses are used (including having to upgrade those Ingresses to the v1.19+ apiVersion), because external-dns now requires networking.k8s.io/v1 (only available in v1.19+) instead of networking.k8s.io/v1beta1, which is served up to v1.21 but no longer served as of v1.22.
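As a hypothetical before/after sketch of the apiVersion bump (example names only) — note the backend moving from `serviceName`/`servicePort` to a nested `service` object, and the now-required `pathType`:

```yaml
# Served up to v1.21, removed in v1.22
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: app
              servicePort: 80
---
# Available from v1.19, the only version served in v1.22+
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app
                port:
                  number: 80
```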

larivierec commented 3 years ago

The minimum required k8s version wasn't bumped in the docs? It should probably be mentioned somewhere.

tim-sendible commented 3 years ago

@tr-mdaize It's actually worse than that: if you want to create your Ingresses declaratively, this change doesn't work for 1.19+ anyway; see #2386.