aws / aws-application-networking-k8s

A Kubernetes controller for Amazon VPC Lattice
https://www.gateway-api-controller.eks.aws.dev/
Apache License 2.0

Update gateway api CRD versions? #638

Open vd-arnaud opened 6 months ago

vd-arnaud commented 6 months ago

The current version (v1.0.5) of the gateway-api-controller chart ships CRDs that are quite old:

These CRDs come from this release, which is more than one year old.

The latest release from kubernetes-sigs includes:

This leads to issues because other tools in the ecosystem use newer CRD versions; for example, the latest version of external-dns uses HTTPRoute v1. So one has to update this particular CRD to be able to use gateway-api-controller AND external-dns. Fortunately, the latest HTTPRoute CRD still includes v1beta1, but for how long?

It would be great if you could plan to update the CRDs in a future release 🙏

zijun726911 commented 6 months ago

If you install the v1 Gateway API CRDs in your cluster by following https://github.com/aws/aws-application-networking-k8s/blob/1862bef9b5f4956b08f34b80723464a99682f542/docs/contributing/developer.md?plain=1#L44-L47, then run the v1.0.5 controller and create a v1 Gateway and a v1 HTTPRoute, what happens? Our e2e test code already uses the v1 Gateway API resources, for example: https://github.com/aws/aws-application-networking-k8s/blob/e85369a9808835f4eab31c88edb9bcc920870e2c/test/suites/integration/httproute_path_match_test.go#L29 and that works for us.

But your suggestion really makes sense: we need to install the v1 CRDs by default in the Helm chart and use the v1 CRDs in the controller code.

vd-arnaud commented 6 months ago

I tested this by installing the latest release from kubernetes-sigs and then installing gateway-api-controller v1.0.5. I got errors in the aws-gateway-controller-chart pods, so I didn't go further.

Here is a sample of the errors I got:

{"level":"error","ts":"2024-05-16T17:13:48.717Z","logger":"runtime.controller-runtime.source.EventHandler","caller":"source/kind.go:68","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: gateway.networking.k8s.io/v1alpha2: the server could not find the requested resource"}

This makes sense: gateway-api-controller tries to fetch Gateway using v1alpha2, which is no longer available with the new CRDs. I "fixed" this error by downgrading the Gateway CRD, and then got a similar error with GatewayClass.

Thanks for your interest in this, please tell me if I can help 👍

vd-arnaud commented 6 months ago

Hello, a small update from our side: we are now using DNSEndpoint instead of HTTPRoute as the source for our external-dns configuration, and it kind of solved the issue.

I still think it would be nice to have updated CRDs to prevent similar issues in the future (the chance of having different pieces of software using those CRDs is far from zero!)

DingGGu commented 4 months ago

Any updates?

We're having trouble using other gateway controllers such as Istio.

seongpil0948 commented 4 months ago

> Any updates?
>
> We're having trouble using other gateway controllers such as Istio.

Same issue here.

erikfuller commented 3 months ago

@DingGGu or @seongpil0948 can you possibly share the error you're seeing and a little more about your setup?

I did some testing a while back to better understand how version mismatches are handled, using different configurations (alpha CRDs installed but YAMLs applied with v1, and vice versa), and it all seemed to "just work". From what I could tell, the Kubernetes plumbing takes care of translating the requested version to whatever the local process knows.
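One way to observe that translation from the command line is to request the same resources at different API versions. This is a minimal sketch, assuming kubectl is pointed at the cluster in question. `list_httproutes_as` is a hypothetical helper name, HTTPRoute is picked only as an example, and the helper relies solely on kubectl's fully qualified `<resource>.<version>.<group>` form:

```shell
# Hypothetical helper (not part of the controller): list HTTPRoutes as a
# specific API version, so the API server has to return them in that version.
list_httproutes_as() {
  version="$1"
  # kubectl accepts <resource>.<version>.<group> to pin the requested version.
  kubectl get "httproutes.${version}.gateway.networking.k8s.io" --all-namespaces
}
```

For example, running `list_httproutes_as v1beta1` and then `list_httproutes_as v1` would be expected to return the same objects when both versions are served, while a version that is not served should fail with an error instead.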

The only issue I noticed was that the "mock" Kubernetes client (mock_client.NewMockClient(c)) only handles explicit API versions, so unit tests don't translate objects across versions the way the actual API does.

erikfuller commented 3 months ago

What would really help are steps to repro this, if you have them. Agree that it would be ideal to move to latest versions, just need to ensure the upgrade path is smooth.

DingGGu commented 3 months ago

Hi @erikfuller!

  1. Install the Gateway API CRDs v1.1.0, as you would before using the Kubernetes Gateway API with Istio (https://istio.io/latest/docs/tasks/traffic-management/ingress/gateway-api/#setup):
    kubectl kustomize "github.com/kubernetes-sigs/gateway-api/config/crd?ref=v1.1.0" | kubectl apply -f -
  2. Install the Lattice controller via Helm:
    helm install -n kube-system gateway-api-controller aws-gateway-controller-chart \
    -f values.yaml \
    --version "1.0.6"

Error logs in controller:

{"level":"error","ts":"2024-07-29T00:52:28.215Z","logger":"runtime.controller-runtime.source.EventHandler","caller":"source/kind.go:63","msg":"if kind is a CRD, it should be installed before calling Start","kind":"GRPCRoute.gateway.networking.k8s.io","error":"no matches for kind \"GRPCRoute\" in version \"gateway.networking.k8s.io/v1alpha2\""}

However, the cluster has GRPCRoute v1.

$ kubectl api-resources | grep grpc
grpcroutes                                                             gateway.networking.k8s.io/v1              true         GRPCRoute

erikfuller commented 1 month ago

Thanks @DingGGu, that's super helpful. I'm going to look further into this one and hope to share some updates in the next week or so.

erikfuller commented 1 month ago

I was able to get a local repro going. It looks like the GRPCRoute CRD that includes v1 has served:false on the v1alpha2 version, which is why requests for v1alpha2 aren't automatically translated by the Kubernetes API. Will look at options to resolve this.
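To check this on another cluster, one option is to print each version's served and storage flags straight from the CRD. This is a sketch, assuming kubectl access to the cluster. `crd_versions` is a hypothetical helper name, and `grpcroutes.gateway.networking.k8s.io` is the CRD name that follows from the resource and group shown in the `kubectl api-resources` output above:

```shell
# Hypothetical helper: print "<version> <served> <storage>" for every version
# declared in a CRD's spec.versions list, one version per line.
crd_versions() {
  crd="$1"
  kubectl get crd "$crd" \
    -o jsonpath='{range .spec.versions[*]}{.name}{"\t"}{.served}{"\t"}{.storage}{"\n"}{end}'
}
```

Calling `crd_versions grpcroutes.gateway.networking.k8s.io` lists the versions the installed CRD declares. A version whose served column is false is not exposed by the API server, so a client that asks for it gets a "no matches for kind" style error like the one in the controller log.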

savealive commented 1 month ago

Why, for gods' sake, does this controller distribute other projects' CRDs at all? Neither External DNS nor the Kubernetes Gateway API has any relation to this project. Just don't install the Gateway API CRDs, please!