Closed M4t7e closed 1 month ago
Thanks for your contribution :heart:
commitlint has detected that all commit messages in this PR follow the conventional commit format :tada:
Hi @M4t7e, thank you very much for this PR!
Could you please add a DCO to your commit? (https://github.com/hcloud-talos/terraform-hcloud-talos/pull/34/checks?check_run_id=26549766936)
And maybe we should use best-effort for loadBalancer.acceleration?
acceleration is the option to accelerate service handling via XDP. Applicable values can be: disabled (do not use XDP), native (XDP BPF program is run directly out of the networking driver's early receive path), or best-effort (use native mode XDP acceleration on devices that support it).
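For reference, a minimal sketch of how this option might be set through Helm values. It is not taken from this PR: the file name cilium-lb-values.yaml is hypothetical, and the commented-out helm command assumes a release named cilium in the kube-system namespace.

```shell
# Write a small Helm values override for Cilium (hypothetical file name).
cat > cilium-lb-values.yaml <<'EOF'
loadBalancer:
  # One of: disabled, native, best-effort
  acceleration: best-effort
EOF

# Assumed release name and chart reference; adjust to your setup before running:
# helm upgrade cilium cilium/cilium -n kube-system --reuse-values -f cilium-lb-values.yaml
```

With best-effort, nodes whose network driver lacks native XDP support fall back to the non-accelerated path, while native would require XDP support on every device.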
I played around with it a bit today. Unfortunately, bpf.masquerade=true meant that coredns could no longer connect to external DNS servers... i/o timeout. Do you have any experience with this?
Hi @mrclrchtr, thanks for the review! :slightly_smiling_face:
And maybe we should use best-effort for loadBalancer.acceleration?
XDP should always be supported by Talos and Hetzner VMs, but it would also not hurt to use best-effort instead. If you prefer that, I can change it.
I played around with it a bit today. Unfortunately, bpf.masquerade=true meant that coredns could no longer connect to external DNS servers... i/o timeout. Do you have any experience with this?
Good catch! It seems the Talos forwardKubeDNSToHost feature does not work together with Cilium BPF-based routing (enabled by bpf.masquerade=true). See: https://github.com/siderolabs/talos/issues/8836
I would rather deactivate forwardKubeDNSToHost, as it brings only very limited benefits, whereas BPF-based routing is the number one selling point and performance boost of Cilium.
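A rough sketch of what deactivating the feature could look like as a Talos machine config patch. The file name and the node IP placeholder are hypothetical, and the patch assumes the hostDNS feature block available in Talos 1.7; it is not the change made in this PR.

```shell
# Write a machine config patch that keeps host DNS enabled but stops
# forwarding kube-dns traffic to the host (hypothetical file name).
cat > disable-forward-kube-dns.yaml <<'EOF'
machine:
  features:
    hostDNS:
      enabled: true
      forwardKubeDNSToHost: false
EOF

# Placeholder node address; replace <node-ip> before running:
# talosctl patch machineconfig --nodes <node-ip> --patch @disable-forward-kube-dns.yaml
```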
Could you please add a DCO to your commit? (https://github.com/hcloud-talos/terraform-hcloud-talos/pull/34/checks?check_run_id=26549766936)
I'm not a fan of the DCO, as I find it a bit redundant and, tbh, annoying here on GitHub. I would also not put a real name or working email address into it. Additionally, many projects interpret it differently, and I have seen no definition in this project. From a legal/license perspective, it means nothing more than that there is a sign-off in the commit message, whatever that means.
XDP should always be supported by Talos and Hetzner VMs, but it would also not hurt to use best-effort instead. If you prefer that, I can change it.
Yes, you are right, we can leave it as is.
Good catch! It seems the Talos forwardKubeDNSToHost feature does not work together with Cilium BPF-based routing (enabled by bpf.masquerade=true). See: siderolabs/talos#8836
I would rather deactivate forwardKubeDNSToHost, as it brings only very limited benefits, whereas BPF-based routing is the number one selling point and performance boost of Cilium.
Hmm.. yesterday I also disabled forwardKubeDNSToHost because I already suspected that was the reason. But unfortunately there were then also i/o timeouts towards the Hetzner DNS servers. I could see from the IPs that these were being queried. I then even switched to the Cloudflare DNS servers. There were also i/o timeouts.
Maybe I have to try again and restart coredns.
I'm not a fan of the DCO, as I find it a bit redundant and, tbh, annoying here on GitHub. I would also not put a real name or working email address into it. Additionally, many projects interpret it differently, and I have seen no definition in this project. From a legal/license perspective, it means nothing more than that there is a sign-off in the commit message, whatever that means.
Ok, I think you are right. Maybe I should disable the check.
Hmm.. yesterday I also disabled forwardKubeDNSToHost because I already suspected that was the reason. But unfortunately there were then also i/o timeouts towards the Hetzner DNS servers. I could see from the IPs that these were being queried. I then even switched to the Cloudflare DNS servers. There were also i/o timeouts.
How did you find the timeouts? I tried to reproduce them, but so far everything looks fine. By the way, are you using a K8s version compatible with Cilium? I think the current default with Talos 1.7 is K8s 1.30, and Cilium 1.15 is not compatible with K8s 1.30. They will add support in the upcoming 1.16 release.
How did you find the timeouts? I tried to reproduce them, but so far everything looks fine.
I had timeout error logs directly in CoreDNS.
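For anyone trying to reproduce this, a small sketch of how such log lines might be surfaced. The helper name coredns_timeouts is hypothetical, and the label selector k8s-app=kube-dns is an assumption about how the CoreDNS pods are labeled in the cluster.

```shell
# Hypothetical helper: print recent CoreDNS log lines that mention an i/o timeout.
# Assumes the CoreDNS pods carry the label k8s-app=kube-dns.
coredns_timeouts() {
  kubectl -n kube-system logs -l k8s-app=kube-dns --tail=500 --prefix \
    | grep -i 'i/o timeout'
}
```

Calling coredns_timeouts prints the matching lines; no output means no timeouts were found in the last 500 log lines per pod.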
By the way, are you using a K8s version compatible with Cilium? I think the current default with Talos 1.7 is K8s 1.30, and Cilium 1.15 is not compatible with K8s 1.30. They will add support in the upcoming 1.16 release.
That's a good point! I wasn't aware of that. Yes, I have an incompatible version... I will test the RC.
I definitely want to merge this PR, but I would like to test further to get more certainty that everything is working.
With Cilium 1.16 and K8s 1.30 everything looks good to me. forwardKubeDNSToHost seems to work, too. @M4t7e do you have any complaints? Otherwise we can merge.
This configuration has been running for 2 weeks without any problems. That's why I think it looks good. Thanks again!
:tada: This PR is included in version 2.10.0 :tada:
The release is available on GitHub release
Your semantic-release bot :package::rocket:
There were problems again... unfortunately I can't figure out why or how to solve them... I'm reverting to masquerade: false
What problems did you observe @mrclrchtr? I tried a similar configuration and can still see the timeout logs in coreDNS.
Yes, I observed the timeout logs.
This PR adds a few Cilium features to improve the network performance and security:
Before:
After:
Info:
endpointRoutes is excluded here due to https://github.com/cilium/cilium/issues/28812. Additionally, this configuration will automatically become the default in one of the upcoming releases of Cilium (https://github.com/cilium/cilium/issues/14955).
Force Cilium to apply changes:
kubectl -n kube-system rollout restart ds/cilium
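After the restart, it may help to wait until the DaemonSet has finished rolling out before testing. A minimal sketch, where the helper name wait_for_cilium and the default timeout of 5m are assumptions rather than part of this PR:

```shell
# Hypothetical helper: block until the cilium DaemonSet rollout completes,
# or fail after the given timeout (defaults to 5m).
wait_for_cilium() {
  kubectl -n kube-system rollout status ds/cilium --timeout="${1:-5m}"
}
```

For example, wait_for_cilium 2m returns non-zero if the rollout has not completed within two minutes.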