rainest opened this issue 8 months ago
Kong/gateway-operator-archive#1489 has started the "Use strategicpatch
differently" approach.
Started a slack thread in #sig-api-machinery once again (now with more context): https://kubernetes.slack.com/archives/C0EG7JC6T/p1709303582824119
Notably, you apparently can do SSA client-side, which is a bit of an oxymoron, but:
Alternatives: merging using SSA's logic client-side (that would require you to have the OpenAPI for your schema), or SSAing twice.
Generate OpenAPI from CRD definitions: the easiest way is to json.Unmarshal the OpenAPIV3Schema field of the CRD into a kube-openapi spec.Schema. How I could do this SSA client-side:
1. Create an SSA type converter from the spec.Schema from the step above: https://github.com/kubernetes/kubernetes/blob/4f3859ce911bd308dbf6f617a8c9e2bdf3a1883b/staging/src/k8s.io/apimachinery/pkg/util/managedfields/typeconverter.go#L36-L47
2. Apply the patch using something like this: https://gist.github.com/alexzielenski/580d76934ca6e1e4b0d128f651d70b95
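Putting those two steps together, a rough Go sketch of the client-side merge (assuming the CRD's structural schema has already been unmarshalled into kube-openapi spec.Schema values; the helper name is hypothetical and nothing here is taken from the operator's code):

```go
// Rough sketch of the client-side SSA-style merge described above. Assumes the
// CRD's OpenAPIV3Schema has already been parsed into spec.Schema values keyed
// by their canonical names. Names are illustrative, not the operator's code.
package clientsideapply

import (
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/util/managedfields"
	"k8s.io/kube-openapi/pkg/validation/spec"
)

// mergeWithSSALogic merges a user patch into a generated object using the same
// structured-merge-diff machinery that powers server-side apply, entirely
// client-side.
func mergeWithSSALogic(
	schemas map[string]*spec.Schema, // parsed from the CRD's OpenAPIV3Schema
	generated, patch *unstructured.Unstructured,
) (*unstructured.Unstructured, error) {
	// Step 1: build a TypeConverter from the schemas.
	tc, err := managedfields.NewTypeConverter(schemas, false /* preserveUnknownFields */)
	if err != nil {
		return nil, err
	}

	// Step 2: convert both objects to typed values and merge them. Merge uses
	// the schema's list/map semantics (e.g. merge keys), so list entries are
	// matched by key rather than by index.
	generatedTyped, err := tc.ObjectToTyped(generated)
	if err != nil {
		return nil, err
	}
	patchTyped, err := tc.ObjectToTyped(patch)
	if err != nil {
		return nil, err
	}
	mergedTyped, err := generatedTyped.Merge(patchTyped)
	if err != nil {
		return nil, err
	}

	merged, err := tc.TypedToObject(mergedTyped)
	if err != nil {
		return nil, err
	}
	return merged.(*unstructured.Unstructured), nil
}
```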
It's happening for field volumes too.
@programmer04 can you provide concrete examples where it does not work as expected?
Applying something like this
kind: GatewayConfiguration
apiVersion: gateway-operator.konghq.com/v1beta1
metadata:
  name: kong
  namespace: default
spec:
  dataPlaneOptions:
    deployment:
      replicas: 2
      podTemplateSpec:
        spec:
          containers:
          - name: proxy
            image: kong/kong-gateway:3.7
            readinessProbe:
              initialDelaySeconds: 1
              periodSeconds: 1
            volumeMounts:
            - name: example-cm
              mountPath: /opt/kong/files
          volumes:
          - name: example-cm
            configMap:
              name: example-cm
  controlPlaneOptions:
    deployment:
      podTemplateSpec:
        spec:
          containers:
          - name: controller
            # renovate: datasource=docker versioning=docker
            image: kong/kubernetes-ingress-controller:3.2.3
            readinessProbe:
              initialDelaySeconds: 1
              periodSeconds: 1
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-cm
  namespace: default
data:
  test.txt: |
    Hello World!!!
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: kong
spec:
  controllerName: konghq.com/gateway-operator
  parametersRef:
    group: gateway-operator.konghq.com
    kind: GatewayConfiguration
    name: kong
    namespace: default
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: kong
  namespace: default
spec:
  gatewayClassName: kong
  listeners:
  - name: http
    protocol: HTTP
    port: 80
results in
[manager] 2024-08-12T13:31:14.339Z - ERROR - Reconciler error - {"controller": "dataplane", "controllerGroup": "gateway-operator.konghq.com", "controllerKind": "DataPlane", "DataPlane": {"name":"kong-m22gb","namespace":"default"}, "namespace": "default", "name": "kong-m22gb", "reconcileID": "d6c0120f-0d82-43e5-b73a-c56083155eb9", "error": "could not build Deployment for DataPlane default/kong-m22gb: failed creating Deployment for DataPlane kong-m22gb: Deployment.apps \"dataplane-kong-m22gb-w87ps\" is invalid: [spec.template.spec.volumes[0].configMap: Forbidden: may not specify more than 1 volume type, spec.template.spec.containers[0].volumeMounts[0].name: Not found: \"example-cm\"]"}
because the Deployment that the operator tries to submit looks like this
{
"metadata": {
"generateName": "dataplane-kong-m22gb-",
"namespace": "default",
"creationTimestamp": null,
"labels": {
"app": "kong-m22gb",
"gateway-operator.konghq.com/dataplane-deployment-state": "live",
"gateway-operator.konghq.com/managed-by": "dataplane",
"gateway-operator.konghq.com/selector": "02c83d63-658c-4929-808c-6d19a25fa4db",
"konghq.com/gateway-operator": "dataplane"
},
"ownerReferences": [
{
"apiVersion": "gateway-operator.konghq.com/v1beta1",
"kind": "DataPlane",
"name": "kong-m22gb",
"uid": "aaff6e89-e4ea-4380-8c58-cf64c57a5b99",
"controller": true
}
],
"finalizers": [
"gateway-operator.konghq.com/wait-for-owner"
]
},
"spec": {
"replicas": 2,
"selector": {
"matchLabels": {
"app": "kong-m22gb",
"gateway-operator.konghq.com/selector": "02c83d63-658c-4929-808c-6d19a25fa4db"
}
},
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"app": "kong-m22gb",
"gateway-operator.konghq.com/selector": "02c83d63-658c-4929-808c-6d19a25fa4db"
}
},
"spec": {
"volumes": [
{
"name": "example-cm",
"secret": {
"secretName": "dataplane-kong-m22gb-2sldc",
"items": [
{
"key": "tls.crt",
"path": "tls.crt"
},
{
"key": "tls.key",
"path": "tls.key"
},
{
"key": "ca.crt",
"path": "ca.crt"
}
]
},
"configMap": {
"name": "example-cm",
"defaultMode": 420
}
},
{
"name": "cluster-certificate",
"secret": {
"secretName": "dataplane-kong-m22gb-2sldc",
"items": [
{
"key": "tls.crt",
"path": "tls.crt"
},
{
"key": "tls.key",
"path": "tls.key"
},
{
"key": "ca.crt",
"path": "ca.crt"
}
]
}
}
],
"containers": [
{
"name": "proxy",
"image": "kong/kong-gateway:3.7",
"ports": [
{
"name": "proxy",
"containerPort": 8000,
"protocol": "TCP"
},
{
"name": "proxy-ssl",
"containerPort": 8443,
"protocol": "TCP"
},
{
"name": "metrics",
"containerPort": 8100,
"protocol": "TCP"
},
{
"name": "admin-ssl",
"containerPort": 8444,
"protocol": "TCP"
}
],
"env": [
{
"name": "KONG_ADMIN_ACCESS_LOG",
"value": "/dev/stdout"
},
{
"name": "KONG_ADMIN_ERROR_LOG",
"value": "/dev/stderr"
},
{
"name": "KONG_ADMIN_GUI_ACCESS_LOG",
"value": "/dev/stdout"
},
{
"name": "KONG_ADMIN_GUI_ERROR_LOG",
"value": "/dev/stderr"
},
{
"name": "KONG_ADMIN_LISTEN",
"value": "0.0.0.0:8444 ssl reuseport backlog=16384"
},
{
"name": "KONG_ADMIN_SSL_CERT",
"value": "/var/cluster-certificate/tls.crt"
},
{
"name": "KONG_ADMIN_SSL_CERT_KEY",
"value": "/var/cluster-certificate/tls.key"
},
{
"name": "KONG_CLUSTER_CERT",
"value": "/var/cluster-certificate/tls.crt"
},
{
"name": "KONG_CLUSTER_CERT_KEY",
"value": "/var/cluster-certificate/tls.key"
},
{
"name": "KONG_CLUSTER_LISTEN",
"value": "off"
},
{
"name": "KONG_DATABASE",
"value": "off"
},
{
"name": "KONG_NGINX_ADMIN_SSL_CLIENT_CERTIFICATE",
"value": "/var/cluster-certificate/ca.crt"
},
{
"name": "KONG_NGINX_ADMIN_SSL_VERIFY_CLIENT",
"value": "on"
},
{
"name": "KONG_NGINX_ADMIN_SSL_VERIFY_DEPTH",
"value": "3"
},
{
"name": "KONG_NGINX_WORKER_PROCESSES",
"value": "2"
},
{
"name": "KONG_PLUGINS",
"value": "bundled"
},
{
"name": "KONG_PORTAL_API_ACCESS_LOG",
"value": "/dev/stdout"
},
{
"name": "KONG_PORTAL_API_ERROR_LOG",
"value": "/dev/stderr"
},
{
"name": "KONG_PORT_MAPS",
"value": "80:8000, 443:8443"
},
{
"name": "KONG_PROXY_ACCESS_LOG",
"value": "/dev/stdout"
},
{
"name": "KONG_PROXY_ERROR_LOG",
"value": "/dev/stderr"
},
{
"name": "KONG_PROXY_LISTEN",
"value": "0.0.0.0:8000 reuseport backlog=16384, 0.0.0.0:8443 http2 ssl reuseport backlog=16384"
},
{
"name": "KONG_STATUS_LISTEN",
"value": "0.0.0.0:8100"
}
],
"resources": {
"limits": {
"cpu": "1",
"memory": "1000Mi"
},
"requests": {
"cpu": "100m",
"memory": "20Mi"
}
},
"volumeMounts": [
{
"name": "example-cm",
"readOnly": true,
"mountPath": "/opt/kong/files"
},
{
"name": "cluster-certificate",
"readOnly": true,
"mountPath": "/var/cluster-certificate"
}
],
"readinessProbe": {
"httpGet": {
"path": "/status/ready",
"port": 8100,
"scheme": "HTTP"
},
"initialDelaySeconds": 5,
"timeoutSeconds": 1,
"periodSeconds": 10,
"successThreshold": 1,
"failureThreshold": 3
},
"lifecycle": {
"preStop": {
"exec": {
"command": [
"/bin/sh",
"-c",
"kong quit"
]
}
}
},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent"
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"securityContext": {},
"schedulerName": "default-scheduler"
}
},
"strategy": {
"type": "RollingUpdate",
"rollingUpdate": {
"maxUnavailable": 0,
"maxSurge": 1
}
},
"revisionHistoryLimit": 10,
"progressDeadlineSeconds": 600
},
"status": {}
}
The clobbered merge result (excerpt):
...
"spec": {
"volumes": [
{
"name": "example-cm",
"secret": {
"secretName": "dataplane-kong-m22gb-2sldc",
"items": [
{
"key": "tls.crt",
"path": "tls.crt"
},
{
"key": "tls.key",
"path": "tls.key"
},
{
"key": "ca.crt",
"path": "ca.crt"
}
]
},
"configMap": {
"name": "example-cm",
"defaultMode": 420
}
},
...
example-cm should have its own entry in the spec.volumes array - see related upstream issue
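For illustration, a minimal sketch (with simplified names, not the operator's actual code) of the behavior we expect: when the typed struct is passed to apimachinery's strategicpatch, volumes are merged on their name merge key, so the user's configMap volume becomes its own entry alongside the generated secret volume instead of being merged into it by index.

```go
// Illustrative sketch of a type-aware strategic merge of spec.volumes.
// StrategicMergePatch reads the patchMergeKey ("name" for volumes) from the
// typed struct, so entries with different names stay separate.
package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
)

func main() {
	// Simplified stand-ins for the generated PodSpec and the user patch.
	generated := corev1.PodSpec{
		Volumes: []corev1.Volume{{
			Name: "cluster-certificate",
			VolumeSource: corev1.VolumeSource{
				Secret: &corev1.SecretVolumeSource{SecretName: "dataplane-cert"},
			},
		}},
	}
	userPatch := corev1.PodSpec{
		Volumes: []corev1.Volume{{
			Name: "example-cm",
			VolumeSource: corev1.VolumeSource{
				ConfigMap: &corev1.ConfigMapVolumeSource{
					LocalObjectReference: corev1.LocalObjectReference{Name: "example-cm"},
				},
			},
		}},
	}

	originalJSON, _ := json.Marshal(generated)
	patchJSON, _ := json.Marshal(userPatch)

	// Passing the typed struct gives the library access to merge-key metadata;
	// without it, lists degrade to positional merging.
	merged, err := strategicpatch.StrategicMergePatch(originalJSON, patchJSON, corev1.PodSpec{})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(merged)) // both volumes present, keyed by name
}
```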
I see. I believe #255 could help here, but I'd have to double check. That PR has been targeted for 1.4, but no one has worked on it since I proposed this (draft) change in May.
I'm happy to talk about it, though.
I skimmed through #255 and it looks reasonable. We should definitely fix this.
Problem
We've found that the PodTemplateSpec merge behavior clobbers generated fields it shouldn't (more detail in Background). We want to overwrite generated fields only if the user patch specifies a value for that particular field.
Possible approaches
Use strategicpatch differently
As far as we know, other upstream Kubernetes tools use the strategicpatch library we currently use without the problems we've observed. We could review their implementations to try and determine what we need to do differently. Previous discussion with API Machinery suggests that using strategicpatch in our code may not be viable, though they do not go into detail why.
Use server-side apply
The API Machinery recommendation is to use server-side apply instead of patching client side. This should avoid the clobbering issues, but presumably requires applying the base resource and patch in separate calls to the Kubernetes API. This in turn presumably means that we'll end up with extra Deployment generations and unnecessary temporary ReplicaSets/Pods. We do not know of a way to tell the upstream controller to wait for a patch before spawning a new generation ReplicaSet.
https://kubernetes.io/blog/2021/08/06/server-side-apply-ga/#using-server-side-apply-in-a-controller discusses controller use of SSA further.
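For reference, a minimal sketch of what server-side apply could look like from a controller using controller-runtime (the field owner string and helper are illustrative, not the operator's actual code). The user patch would presumably be submitted in a separate Patch call under a different field owner, which is where the concern about extra Deployment generations comes from.

```go
// Minimal sketch of applying a Deployment with server-side apply via
// controller-runtime. The field owner string is illustrative.
package ssa

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// applyDeployment submits the desired Deployment with SSA. The API server
// performs the schema-aware merge, so list entries such as spec.volumes are
// matched on their merge keys instead of being clobbered by index.
func applyDeployment(ctx context.Context, c client.Client, desired *appsv1.Deployment) error {
	// With client.Apply, only the fields set in desired are claimed by this
	// field owner; fields owned by other managers are left untouched.
	return c.Patch(ctx, desired, client.Apply,
		client.FieldOwner("gateway-operator"),
		client.ForceOwnership,
	)
}
```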
Use JSONPatch
JSONPatch is an alternative patch format (RFC 6902). It defines explicit operations (e.g. add, replace, remove) for each patch entry, which avoids clobbering, but as far as we know it is limited when it comes to overriding existing values for a specific key. It is also somewhat harder to write than strategic merge patches.
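As a sketch (illustrative names, assuming controller-runtime), appending a volume with a JSON Patch could look like this; note the positional path, which is the by-index limitation mentioned above.

```go
// Sketch of appending a volume to a Deployment with a JSON Patch (RFC 6902).
// JSON Patch addresses array entries by index rather than by merge key, and
// the "-" append form assumes spec.template.spec.volumes already exists.
package jsonpatchsketch

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func addVolumeWithJSONPatch(ctx context.Context, c client.Client, deployment *appsv1.Deployment) error {
	patch := []byte(`[
	  {
	    "op": "add",
	    "path": "/spec/template/spec/volumes/-",
	    "value": {"name": "example-cm", "configMap": {"name": "example-cm"}}
	  }
	]`)
	return c.Patch(ctx, deployment, client.RawPatch(types.JSONPatchType, patch))
}
```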
Invoke Kustomize
Kustomize supports strategic merge patches and manages to apply them client-side without clobbering. We could construct the equivalent of a Kustomize directory containing the base (generated) Deployment and a patch config with the user patch, then invoke Kustomize on it.
Having to construct a fake directory in memory and build the supporting metadata is annoying, but appears to be necessary for this approach based on cursory review of the Kustomize client code. There doesn't appear to be an immediately obvious way to just ask it to apply a single patch to a single resource.
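A rough sketch of what that could look like with Kustomize's Go API and an in-memory filesystem (the file names, layout, and helper are illustrative assumptions):

```go
// Sketch of driving Kustomize programmatically with an in-memory filesystem,
// so no real directory has to be created on disk.
package kustomizemerge

import (
	"sigs.k8s.io/kustomize/api/krusty"
	"sigs.k8s.io/kustomize/kyaml/filesys"
)

func kustomizePatch(baseDeploymentYAML, userPatchYAML string) (string, error) {
	fSys := filesys.MakeFsInMemory()

	// Lay out a minimal kustomization "directory" in memory: the generated
	// Deployment as the base resource plus the user patch as a strategic
	// merge patch.
	if err := fSys.WriteFile("/app/deployment.yaml", []byte(baseDeploymentYAML)); err != nil {
		return "", err
	}
	if err := fSys.WriteFile("/app/patch.yaml", []byte(userPatchYAML)); err != nil {
		return "", err
	}
	// patchesStrategicMerge is deprecated in newer Kustomize versions in
	// favor of patches, but both express the same idea.
	kustomization := `
resources:
- deployment.yaml
patchesStrategicMerge:
- patch.yaml
`
	if err := fSys.WriteFile("/app/kustomization.yaml", []byte(kustomization)); err != nil {
		return "", err
	}

	k := krusty.MakeKustomizer(krusty.MakeDefaultOptions())
	resMap, err := k.Run(fSys, "/app")
	if err != nil {
		return "", err
	}
	out, err := resMap.AsYaml()
	if err != nil {
		return "", err
	}
	return string(out), nil
}
```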
Background
We use a wrapper around the API Machinery strategicpatch package to handle applying user patches for the ControlPlane Deployment and for the DataPlane Deployment. This feature takes a user-provided PodTemplateSpec and merges it with the controller-generated PodTemplateSpec, with the goal of letting users add extra fields or override specific generated ones, while leaving other generated fields as-is.
We've found that these patches do not quite work the same way when applied in our code as they do when applied via kubectl. Merge will sometimes obliterate generated configuration that it should not have. Arrays in particular are apparently merged via index, whereas kubectl will match on a merge key (for example, the name field of an EnvVar) and only replace entries if that key matches.
Workarounds
On the UX end, including generated values in the patch provides a crude workaround for this, but it's not really suitable outside of tests. Users should not need to retrieve generated values and keep track of changes to them to build their patches.
In code, we can perform our own approximation of the type-aware merge by extracting generated fields before merge and backfilling them in if we don't find an override. This adds a fair degree of complexity and does not scale well, since we need to find every problem field and add a backfill handler for it.
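For illustration, a backfill handler for a single field such as spec.volumes might look roughly like this (a sketch of the general idea, not our actual implementation):

```go
// Sketch of the backfill approach for one problem field (spec.volumes): after
// the merge, any generated volume whose name no longer appears in the merged
// result is re-added. Every field that can be clobbered needs its own handler
// like this, which is why the approach scales poorly.
package backfill

import corev1 "k8s.io/api/core/v1"

func backfillVolumes(generated, merged []corev1.Volume) []corev1.Volume {
	present := make(map[string]struct{}, len(merged))
	for _, v := range merged {
		present[v.Name] = struct{}{}
	}
	for _, v := range generated {
		// Name matching is the best approximation we have for deciding
		// whether the user patch intentionally overrode a generated volume.
		if _, ok := present[v.Name]; !ok {
			merged = append(merged, v)
		}
	}
	return merged
}
```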