Closed by pmalek 3 months ago
Will this be implemented in the 1.3 release?
We recycle pods every week (because we recycle nodes every week), and sometimes we see AWS Karpenter evict aggressively because there is no PDB. They recommend a PDB to avoid too much disruption.
kubectl get events -n c1b32c25-8557-410c-9ea9-a3c2ca174835
LAST SEEN TYPE REASON OBJECT MESSAGE
13m Normal SuccessfulCreate replicaset/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf544c8f Created pod: dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5pd2p6
11m Normal SuccessfulCreate replicaset/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf544c8f Created pod: dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5kwl49
11m Normal Evicted pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf57rbg6 Evicted pod
11m Normal Killing pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf57rbg6 Stopping container proxy
9m33s Warning FailedPreStopHook pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf57rbg6 PreStopHook failed
11m Normal Scheduled pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5kwl49 Successfully assigned c1b32c25-8557-410c-9ea9-a3c2ca174835/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5kwl49 to ip-172-31-169-129.eu-west-2.compute.internal
10m Normal Pulling pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5kwl49 Pulling image "kong/kong-gateway:3.5-ubuntu"
10m Normal Pulled pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5kwl49 Successfully pulled image "kong/kong-gateway:3.5-ubuntu" in 5.448527909s (5.448537218s including waiting)
10m Normal Created pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5kwl49 Created container proxy
10m Normal Started pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5kwl49 Started container proxy
13m Normal Scheduled pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5pd2p6 Successfully assigned c1b32c25-8557-410c-9ea9-a3c2ca174835/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5pd2p6 to ip-172-31-117-74.eu-west-2.compute.internal
13m Normal Pulling pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5pd2p6 Pulling image "kong/kong-gateway:3.5-ubuntu"
13m Normal TaintManagerEviction pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5pd2p6 Cancelling deletion of Pod c1b32c25-8557-410c-9ea9-a3c2ca174835/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5pd2p6
13m Normal Pulled pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5pd2p6 Successfully pulled image "kong/kong-gateway:3.5-ubuntu" in 5.349953019s (5.349973311s including waiting)
13m Normal Created pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5pd2p6 Created container proxy
13m Normal Started pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5pd2p6 Started container proxy
13m Normal Evicted pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5vhcqp Evicted pod
13m Normal Killing pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5vhcqp Stopping container proxy
11m Warning FailedPreStopHook pod/dataplane-7ae39c6e-3999-4b72-be5c-f996c01c2c3b-5qfvv-7bcf5vhcqp PreStopHook failed
7m37s Normal SuccessfullyReconciled targetgroupbinding/k8s-c1b32c25-dataplan-1e62c79245 Successfully reconciled
7m37s Normal SuccessfullyReconciled targetgroupbinding/k8s-c1b32c25-dataplan-9bdadb86a4 Successfully reconciled
kubectl get events -n 04332fa3-6402-403e-9229-478e806b7d17
LAST SEEN TYPE REASON OBJECT MESSAGE
15m Normal SuccessfulCreate replicaset/dataplane-c495bef5-76b1-4f6d-ac55-0e5b7e5245ec-tmmgc-558696fd57 Created pod: dataplane-c495bef5-76b1-4f6d-ac55-0e5b7e5245ec-tmmgc-55869lcbdb
15m Normal Scheduled pod/dataplane-c495bef5-76b1-4f6d-ac55-0e5b7e5245ec-tmmgc-55869lcbdb Successfully assigned 04332fa3-6402-403e-9229-478e806b7d17/dataplane-c495bef5-76b1-4f6d-ac55-0e5b7e5245ec-tmmgc-55869lcbdb to ip-172-31-66-158.eu-west-2.compute.internal
15m Normal Pulling pod/dataplane-c495bef5-76b1-4f6d-ac55-0e5b7e5245ec-tmmgc-55869lcbdb Pulling image "kong/kong-gateway:3.5-ubuntu"
15m Normal Pulled pod/dataplane-c495bef5-76b1-4f6d-ac55-0e5b7e5245ec-tmmgc-55869lcbdb Successfully pulled image "kong/kong-gateway:3.5-ubuntu" in 6.267026563s (6.267046123s including waiting)
15m Normal Created pod/dataplane-c495bef5-76b1-4f6d-ac55-0e5b7e5245ec-tmmgc-55869lcbdb Created container proxy
15m Normal Started pod/dataplane-c495bef5-76b1-4f6d-ac55-0e5b7e5245ec-tmmgc-55869lcbdb Started container proxy
15m Normal Evicted pod/dataplane-c495bef5-76b1-4f6d-ac55-0e5b7e5245ec-tmmgc-55869smrgh Evicted pod
15m Normal Killing pod/dataplane-c495bef5-76b1-4f6d-ac55-0e5b7e5245ec-tmmgc-55869smrgh Stopping container proxy
14m Warning FailedPreStopHook pod/dataplane-c495bef5-76b1-4f6d-ac55-0e5b7e5245ec-tmmgc-55869smrgh PreStopHook failed
12m Normal SuccessfullyReconciled targetgroupbinding/k8s-04332fa3-dataplan-9019bdead5 Successfully reconciled
12m Normal SuccessfullyReconciled targetgroupbinding/k8s-04332fa3-dataplan-e740979ad0 Successfully reconciled
CloudGateways would ideally have this for April's GA. Hence we either ship this in 1.2 or right after in 1.3.
Slack thread: https://kongstrong.slack.com/archives/C04D2Q757RU/p1707998741598919
@sentinelleader The PR with proposed API changes has been created: https://github.com/Kong/gateway-operator/pull/441.
The implementation will follow after this is merged.
Problem statement
Users might want to specify the allowed disruption budget for their DataPlane workloads to configure, e.g., how many replicas can be down during an upgrade.
Proposed
Support enabling a spec field in the DataPlane API which will deploy and manage a PodDisruptionBudget targeting the DataPlane instances.
Acceptance criteria
As a user, I can set a DataPlane field and expect a PodDisruptionBudget to be created and managed for me accordingly.
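To make the proposal concrete, here is a minimal sketch of what a managed PodDisruptionBudget could look like and how its budget limits voluntary evictions. It is not the operator's implementation: the function names, the PDB name, the namespace, and the `app: dataplane` selector label are all hypothetical, and only integer budgets are handled (percentages are left out). The manifest fields themselves (`policy/v1`, `minAvailable`, `maxUnavailable`, `selector`) are the standard Kubernetes ones.

```python
from typing import Optional


def build_pdb(name: str, namespace: str, selector_labels: dict,
              min_available: Optional[int] = None,
              max_unavailable: Optional[int] = None) -> dict:
    """Return a policy/v1 PodDisruptionBudget manifest as a plain dict.

    Hypothetical helper: a real operator would build this from the
    DataPlane spec and its own pod labels.
    """
    # Kubernetes accepts only one of minAvailable / maxUnavailable per PDB.
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    spec = {"selector": {"matchLabels": dict(selector_labels)}}
    if min_available is not None:
        spec["minAvailable"] = min_available
    else:
        spec["maxUnavailable"] = max_unavailable
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


def allowed_disruptions(pdb: dict, expected_pods: int, healthy_pods: int) -> int:
    """Voluntary evictions permitted right now (integer budgets only)."""
    spec = pdb["spec"]
    if "minAvailable" in spec:
        desired_healthy = spec["minAvailable"]
    else:
        desired_healthy = max(expected_pods - spec["maxUnavailable"], 0)
    # Evictions are allowed only while healthy pods exceed the desired floor.
    return max(healthy_pods - desired_healthy, 0)


if __name__ == "__main__":
    # Hypothetical values: 3 DataPlane replicas, at most 1 may be down.
    pdb = build_pdb("dataplane-pdb", "example-namespace",
                    {"app": "dataplane"}, max_unavailable=1)
    print(allowed_disruptions(pdb, expected_pods=3, healthy_pods=3))
```

With a budget like this in place, a node recycler that uses the eviction API can take down one DataPlane pod at a time and has to wait for a replacement to become healthy before evicting the next one.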