Open ellistarn opened 1 year ago
This may be due to #105. I'm working on a PR now to add the finalizer for GatewayClass.
Today, we expect the amazon-vpc-lattice GatewayClass to be an object managed by the infrastructure provider, something like a capability of an EKS cluster. We do not expect a cluster operator to be able to delete it.
In particular, we want to prevent it from being deleted (#105), which would accidentally wipe out all the VPC Lattice resources.
Hi @liwenwu-amazon, I'm not sure I understand how we would prevent a resource from being deleted. And as I mentioned in #105, the community is likely not moving forward with finalizers for GatewayClass, so we would want a mechanism to prevent a system hang in case someone deletes the GatewayClass.
One alternative is to have the controller re-create the GatewayClass if deleted?
I'm not sure what the semantic meaning of deleting the amazon-vpc-lattice GatewayClass is. Does it mean the EKS cluster no longer supports VPC Lattice?
I am hoping that in the long run the Lattice controller will be an automatic EKS cluster add-on, that an EKS cluster always supports AWS VPC Lattice, and that the AWS EKS infrastructure automatically creates the amazon-vpc-lattice GatewayClass.
The kubernetes pattern is typically to allow arbitrary resource application and deletion, and handle these issues at runtime. It's common for use cases to install or uninstall yaml in any order, and validating the existence of interdependent resources will break many flows (e.g. helm, flux). This is in strong contrast to AWS APIs, which are highly interdependent.
As linked by @aaroniscode, the upstream spec https://gateway-api.sigs.k8s.io/references/spec/#gateway.networking.k8s.io%2fv1beta1.GatewayClass states that the gateway implementation must put a finalizer on the gateway, to govern deletion of the gateway class. It's not clear to me how this is supposed to work in practice. e.g., what happens if the user uninstalls the gateway implementation? Who is responsible for removing the finalizer?
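To make the question concrete, here is a minimal sketch of finalizer-gated deletion as Kubernetes generally handles it: a delete request on an object that still carries finalizers only sets a deletion timestamp, and the object disappears once the last finalizer is removed. `Obj` and `Store` are hypothetical in-memory stand-ins, not real client types, and the finalizer name is the one that appears in the `kubectl describe` output further down this thread.

```python
from dataclasses import dataclass, field
from typing import Optional

# Finalizer name as seen in the `kubectl describe` output in this thread.
FINALIZER = "gateway-exists-finalizer.gateway.networking.k8s.io"


@dataclass
class Obj:
    """Hypothetical stand-in for an object's metadata (not a real client type)."""
    name: str
    finalizers: list = field(default_factory=list)
    deletion_timestamp: Optional[str] = None


class Store:
    """Hypothetical in-memory API server that mimics finalizer-gated deletion."""

    def __init__(self):
        self.objects = {}

    def create(self, obj):
        self.objects[obj.name] = obj

    def delete(self, name, now="2024-04-15T23:52:09Z"):
        obj = self.objects[name]
        if obj.finalizers:
            # Deletion is only requested; the object stays visible.
            obj.deletion_timestamp = now
        else:
            del self.objects[name]

    def remove_finalizer(self, name, finalizer):
        obj = self.objects[name]
        obj.finalizers = [f for f in obj.finalizers if f != finalizer]
        # Once the last finalizer is gone, a pending delete completes.
        if obj.deletion_timestamp and not obj.finalizers:
            del self.objects[name]


store = Store()
store.create(Obj("amazon-vpc-lattice", finalizers=[FINALIZER]))
store.delete("amazon-vpc-lattice")
print("amazon-vpc-lattice" in store.objects)  # object is still listed
store.remove_finalizer("amazon-vpc-lattice", FINALIZER)
print("amazon-vpc-lattice" in store.objects)
```

In this model, if the component that added the finalizer has been uninstalled, nothing ever calls `remove_finalizer`, and the object stays in the deleting state indefinitely.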
Maybe we can reach out to sig-networking to get an authoritative answer from the spec designers.
edit: just realized @aaroniscode already did this here: https://github.com/aws/aws-application-networking-k8s/issues/105#issuecomment-1455128641
Perhaps we should bundle the gateway class as part of the helm install, and never require a user to create one at runtime.
Discussed w/ @liwenwu-amazon. As a path forward, we will include the gateway class as part of the installation. Users will not need to create this manually. @aaroniscode, thoughts?
Sounds good to me @ellistarn. I think other projects have taken this approach as well. The question remains what to do if the GatewayClass is deleted. I mentioned here that the community is likely not going forward with a finalizer and no other projects appear to be implementing one.
If that's the case, to avoid a hung system with the GatewayClass deleted, I see two paths:
- [...] GatewayClass and puts it back if deleted
- [...] HTTPRoute and Gateway in scope if they have the VPC Lattice finalizer on them and they are deleted.
What do you think?
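The "put it back if deleted" idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the cluster is modeled as a plain dict, `ensure_gateway_class` is a hypothetical helper rather than anything in the controller, and the `v1beta1` API version is assumed from the spec link earlier in the thread. The name and controllerName are the ones mentioned in this thread.

```python
GATEWAY_CLASS_NAME = "amazon-vpc-lattice"
CONTROLLER_NAME = "application-networking.k8s.aws/gateway-api-controller"


def desired_gateway_class():
    """Build the GatewayClass manifest the controller would want to exist."""
    return {
        # Assumption: API version taken from the spec link in this thread.
        "apiVersion": "gateway.networking.k8s.io/v1beta1",
        "kind": "GatewayClass",
        "metadata": {"name": GATEWAY_CLASS_NAME},
        "spec": {"controllerName": CONTROLLER_NAME},
    }


def ensure_gateway_class(cluster):
    """Re-create the GatewayClass if it is missing.

    `cluster` is a hypothetical dict of name -> manifest standing in for
    the API server. Returns True when the object had to be created.
    """
    if GATEWAY_CLASS_NAME in cluster:
        return False
    cluster[GATEWAY_CLASS_NAME] = desired_gateway_class()
    return True


cluster = {}
print(ensure_gateway_class(cluster))  # first pass creates the object
print(ensure_gateway_class(cluster))  # later passes leave it alone
del cluster[GATEWAY_CLASS_NAME]       # someone deletes the GatewayClass
print(ensure_gateway_class(cluster))  # the next pass puts it back
```

Running this on every reconcile keeps the check idempotent, so install and uninstall order would not matter.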
Is a workaround known to remove the GatewayClass?
Did you hit any issue or get stuck when deleting the GatewayClass? The GatewayClass doesn't have any finalizer, so it should be deleted without a hitch.
In my case it gets stuck when deleting the GatewayClass, even after running "microk8s reset".
$ kubectl get gatewayclasses.gateway.networking.k8s.io
NAME CONTROLLER ACCEPTED AGE
eg gateway.envoyproxy.io/gatewayclass-controller True 91m
It still exists; it survives a reset and a reboot.
with
$ sudo microk8s kubectl describe gatewayclasses
Name: eg
Namespace:
Labels: <none>
Annotations: <none>
API Version: gateway.networking.k8s.io/v1
Kind: GatewayClass
Metadata:
  Creation Timestamp: 2024-04-15T23:38:42Z
  Deletion Grace Period Seconds: 0
  Deletion Timestamp: 2024-04-15T23:52:09Z
  Finalizers:
    gateway-exists-finalizer.gateway.networking.k8s.io
  Generation: 2
  Resource Version: 221422
  UID: 75afcac3-ce42-4f86-bbd8-39df6dcb8b8d
Spec:
  Controller Name: gateway.envoyproxy.io/gatewayclass-controller
Status:
  Conditions:
    Last Transition Time: 2024-04-15T23:38:42Z
    Message: Valid GatewayClass
    Observed Generation: 1
    Reason: Accepted
    Status: True
    Type: Accepted
Events: <none>
and
$ sudo microk8s kubectl delete gatewayclass eg
gatewayclass.gateway.networking.k8s.io "eg" deleted
[HANGS HERE]
This project, aws-application-networking-k8s (aws-gateway-api-controller), only manages GatewayClass objects with controllerName: application-networking.k8s.aws/gateway-api-controller.
But your GatewayClass has Controller Name: gateway.envoyproxy.io/gatewayclass-controller, which is not managed by the aws-gateway-api-controller. You could search for an answer or ask in the Envoy Gateway repo:
https://github.com/envoyproxy/gateway
https://gateway.envoyproxy.io/
But from my limited knowledge of k8s, you could probably try kubectl edit gatewayclass eg, delete these lines, and save:
Finalizers:
gateway-exists-finalizer.gateway.networking.k8s.io
Then try kubectl delete gatewayclass eg again to hard delete this GatewayClass. (I am not sure what the consequences of a hard delete are; you should search for an answer in https://github.com/envoyproxy/gateway.)
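As a non-interactive alternative to editing the object by hand, the same change can be expressed as a JSON merge patch that sets `metadata.finalizers` to null. The sketch below only models that patch on a plain dict so its effect is visible: `merge_patch` is a small local helper following JSON merge patch semantics, not a Kubernetes client call, and the object fields are copied from the describe output above. The equivalent `kubectl patch` invocation is noted in a comment and should be checked against your kubectl version before use.

```python
import json


def merge_patch(target, patch):
    """Apply a JSON merge patch (RFC 7386 semantics) and return the result."""
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # A null value in a merge patch removes the key.
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# The stuck object, reduced to the fields shown in the describe output.
stuck = {
    "metadata": {
        "name": "eg",
        "deletionTimestamp": "2024-04-15T23:52:09Z",
        "finalizers": ["gateway-exists-finalizer.gateway.networking.k8s.io"],
    },
    "spec": {"controllerName": "gateway.envoyproxy.io/gatewayclass-controller"},
}

patch = {"metadata": {"finalizers": None}}
patched = merge_patch(stuck, patch)

# Commonly used equivalent on a real cluster (verify before running):
#   kubectl patch gatewayclass eg --type=merge -p '<the JSON printed below>'
print(json.dumps(patch))
print("finalizers" in patched["metadata"])
```

With the finalizer list gone and the deletion timestamp already set, the API server would be free to finish the pending delete. Whatever cleanup the finalizer was guarding is skipped, which is the consequence to check with the Envoy Gateway project.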
Reproduction steps:
This log line looks like the culprit.