Open ananace opened 9 months ago
@mloiseleur The issue is entirely unrelated to either of those links. The problem here is that the Kubernetes API server munges the Helm-generated patch request (as described in the linked issue), merging the port definitions in the deployment for websecure/websecure-http3 into a single one - which causes either the websecure or websecure-http3 port definition to disappear.
The result on that is the the Service object can't route any HTTPS/HTTP3 traffic anymore, since it ends up referring to a port by a name that doesn't exist on the container definition in the Pod.
This will happen regardless if it's a single service or split service. (But it fails much harder on the single service, as both protocols stop routing in that case)
Edit:
As a side note; due to when/how the issue happens, it'll likely only appear if you perform certain helm upgrades which change the list of ports.
It's possible that using a JSON patch with a kubectl patch
call could also let you re-add the port without having it merged out of existence, but Helm doesn't work that way.
Edit 2:
To try to better explain what's going on, here's an example of how the update fails - here with the TCP port disappearing into the ether after the merge;
$ kubectl --namespace traefik get deploy traefik-2 -o jsonpath='{.spec.template.spec.containers[0].ports}' | jq
[
{
"containerPort": 8000,
"name": "web",
"protocol": "TCP"
},
{
"containerPort": 8443,
"name": "websecure",
"protocol": "TCP"
}
]
$ cat /tmp/example-patch.yaml
spec:
template:
spec:
containers:
- name: traefik-2
ports:
- containerPort: 8000
name: web
protocol: TCP
- containerPort: 8443
name: websecure
protocol: TCP
- containerPort: 8443
name: websecure-http3
protocol: UDP
$ kubectl --namespace traefik patch deploy traefik-2 --patch-file /tmp/example-patch.yaml
deployment.apps/traefik-2 patched
$ kubectl --namespace traefik get deploy traefik-2 -o jsonpath='{.spec.template.spec.containers[0].ports}' | jq
[
{
"containerPort": 8000,
"name": "web",
"protocol": "TCP"
},
{
"containerPort": 8443,
"name": "websecure-http3",
"protocol": "UDP"
}
]
\o I get it.
There are two issues with TCP + UDP on same port , one (already documented) on the Service
and this one on the Deployment
.
The one on the Deployment
will happen only when trying to patch. Aouch.
It seems to be somewhat related to the Kubernetes environment, Helm, and probably configuration as well.
My thought is that just offering the option to render the Service
with numeric target port definitions should work - maybe with a values flag for choosing it.
Ah yeah, this seems likes exactly the kind of problems I encountered while working on #656 and tried to avoid there by not introducing any additional named ports with the same numbers. Basically this only happens when using patch (i.e. helm upgrade
) and not on initial install, which makes it even harder to track, due to the aforementioned issues with the merging behavior of kubectl
.
(There is some kind of choice to be made here, as it is possible to go back to ensuring each number is used in a single pors only in all places, to avoid these kind of bugs. Altough this looks a bit ugly, as the UDP version is not declared, it does work as the container port list declaration is considered to be primarily informational; i.e. if a service on the container listens to a port, it is exposed regardless if it is listed or not. However this is quite confusing to users. However, I do agree this is semantically weird and not preferable.)
As for how to work around this particular issue in practice, I do the following:
when using the --server-side
flag with kubectl
, the patching issue does not occur; therefore you can obtain the "correct" resulting manifest in your cluster by:
helm upgrade
, helm template
, take out the correct template and save it to some local file kubectl apply --server-side -f <file>.yaml
.
This in my experience works fine and should be more or less save; this workaround is only ever needed when you update some kind of port list, any future upgrades that do not touch those should be fine.Also worth mentioning: there is an open proposal for Helm to add Server Side Apply support (see https://github.com/helm/community/pull/312). If that is ever implemented, this issue would be solved by Helm itself.
The helm template
and kubectl apply --server-side
workaround worked for me.
Welcome!
What version of the Traefik's Helm Chart are you using?
26.0.0
What version of Traefik are you using?
2.10.6
What did you do?
Installed the Helm chart with;
This results in a Deployment configured as;
And a service like;
What did you see instead?
This configuration unfortunately seems to run into https://github.com/kubernetes/kubernetes/issues/105610 on
helm upgrade
s, causing the two ports to merge into one - with the first websecure winning and removing the UDP port from existence.The resulting deployment looks like;
And the service ends up as;
This causes the incoming service traffic to fail as there's no
websecure-http3
port to referenceWhat is your environment & configuration?
Bare-metal Kubernetes. Traefik installed from Helm chart using FluxCD.
Additional Information
To keep the deployment working for now, I've added a post-renderer which replaces the service definition for the HTTP3 port to point directly to the port number instead of referring to it by name.
As a note, the template that the Helm Chart renders is correct. The issue is that Kubernetes API server unfortunately messes up the port list when applying upgrades - for at least certain setups, and since the service object the Helm chart renders refers to the ports by name this means that one of the HTTPS/HTTP3 ports ends up unable to be routed to. (Depending on which one gets lost by the API server merge)