kubernetes / node-problem-detector

This is a place for various problem detectors running on the Kubernetes nodes.

custom iptables version monitor plugin #844

Closed aojea closed 6 months ago

aojea commented 7 months ago

iptables has two kernel backends: legacy (x_tables) and nft (nf_tables).

Quoting https://developers.redhat.com/blog/2020/08/18/iptables-the-two-variants-and-their-relationship-with-nftables

It is also important to note that while iptables-nft can supplant iptables-legacy, you should never use them simultaneously.

However, we don't want to block node operations for this reason, as there is not enough evidence that this is causing big issues in the wild, so we just signal and warn about the situation.

Once we have more information we can revisit this decision and either keep it as is or promote it to a permanent condition.

The plugin runs only once a day to avoid adding overhead on large systems.

https://github.com/kubernetes-sigs/iptables-wrappers/blob/97b01f43a8e8db07840fc4b95e833a37c0d36b12/iptables-wrapper-installer.sh
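For reference, here is a rough sketch of how such a custom-plugin script could detect the situation, loosely following the rule-counting idea from the iptables-wrappers script linked above. It assumes the NPD custom-plugin convention of exit code 0 for "OK" and 1 for "problem"; the check logic and messages are illustrative, not the exact plugin in this PR.

```bash
#!/bin/bash
# Illustrative sketch only: detect whether both iptables backends have rules
# loaded on the node, in the spirit of the iptables-wrappers heuristic.
# NPD custom plugins report "OK" with exit 0 and a "problem" with exit 1.
set -o nounset
set -o pipefail

readonly OK=0
readonly NONOK=1

# Count rule lines (lines starting with "-") reported by each backend.
# Missing binaries or empty rule sets simply yield a count of 0.
legacy_rules=$( { iptables-legacy-save; ip6tables-legacy-save; } 2>/dev/null | grep -c '^-' || true)
nft_rules=$( { iptables-nft-save; ip6tables-nft-save; } 2>/dev/null | grep -c '^-' || true)

if [[ "${legacy_rules}" -gt 0 && "${nft_rules}" -gt 0 ]]; then
  echo "Found rules in both backends: legacy=${legacy_rules} nft=${nft_rules}"
  exit "${NONOK}"
fi

echo "iptables backends are not mixed (legacy=${legacy_rules} nft=${nft_rules})"
exit "${OK}"
```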

aojea commented 7 months ago

/assign @thockin @danwinship @SergeyKanzhelev

danwinship commented 7 months ago

hosts and containers using the host can have different iptables versions; these versions are incompatible and can cause problems if both are present in the kernel.

"incompatible" and "can cause problems" are overstating things. As long as your rules are self-contained and aren't trying to interact with or override another component's rules, it generally doesn't matter whether you're using the same iptables mode as anyone else. (Unless the system only has kernel support for one of the two modes, but this PR doesn't help people running into that.)

I think we got a little bit confused about this back when iptables-1.8 first came out, because (a) we were also dealing with various bugs in iptables-1.8.[0-2] at the same time, without fully understanding what was going on; (b) at the time, kube-proxy depended on iptables chains that were created by kubelet, so kube-proxy and kubelet specifically needed to be using the same iptables mode; (c) kube-proxy purports to poke holes in the system firewall, which can only work if kube-proxy and the firewall both use the same iptables mode.

The bugs eventually got fixed, KEP-3178 made kube-proxy no longer depend on kubelet's rules, and the "poking holes in the system firewall" use case turns out to be slightly dubious and probably unnecessary. (Some distros have moved to nft-based firewalls now, which kube-proxy's firewall-hole-poking code won't work with even if you're using iptables-nft, and no one has filed any bugs about this.)

So I'm not sure it really makes sense to call this out as a "problem"...

aojea commented 7 months ago

Well, at least if there are two systems using different versions they can override each other's rules ... I do not know exactly what the problems are today, but I remember having segfaults because of mixed versions https://bugzilla.netfilter.org/show_bug.cgi?id=1476

We want this plugin to run just to prevent any possible problems, as it is impossible to say that there is zero risk.

danwinship commented 7 months ago

segfaults because of mixed versions https://bugzilla.netfilter.org/show_bug.cgi?id=1476

That's not "segfaults because of mixed versions", that's "segfaults because of a bug in the iptables binaries that you happened to run into on a node with mixed versions".

danwinship commented 7 months ago

We want this plugin to run just to prevent any possible problems, as it is impossible to say that there is zero risk.

It seems to me that this is more likely to generate false positives and warn people that there is a problem when everything is actually working fine.

aojea commented 7 months ago

It seems to me that this is more likely to generate false positives and warn people that there is a problem when everything is actually working fine.

Interesting, should I add a threshold on the number of rule lines? At least for GKE COS clusters I don't expect to see rules from both modes on a node.
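As an illustration of what such a threshold could look like, the fragment below would replace the final check in the earlier sketch; the cutoff of 10 is an arbitrary example value, not taken from this PR or from iptables-wrappers.

```bash
# Illustrative only: require both backends to carry more than a handful of
# rules before reporting, so a few stray rules from a base image are ignored.
# The cutoff of 10 is an arbitrary example value.
readonly RULE_THRESHOLD=10

if [[ "${legacy_rules}" -gt "${RULE_THRESHOLD}" && "${nft_rules}" -gt "${RULE_THRESHOLD}" ]]; then
  echo "Both backends exceed ${RULE_THRESHOLD} rules: legacy=${legacy_rules} nft=${nft_rules}"
  exit "${NONOK}"
fi
exit "${OK}"
```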

aojea commented 7 months ago

Based on Dan's feedback and the NPD docs, I'm going to update this to only report Events and metrics to get signal. With that signal we can assess the criticality of the problem with real data and promote it to a NodeCondition if necessary.

node-problem-detector uses Event and NodeCondition to report problems to apiserver.

- NodeCondition: Permanent problem that makes the node unavailable for pods should be reported as NodeCondition.
- Event: Temporary problem that has limited impact on pod but is informative should be reported as Event.
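For context, an NPD custom-plugin monitor is described by a JSON config whose rules have a type of either "temporary" (reported as an Event) or "permanent" (reported as a NodeCondition). Below is a rough sketch of what an Event-only monitor definition could look like; the source name, reason, script path, and interval are hypothetical placeholders, not the values used in this PR.

```bash
# Hypothetical monitor definition; the field names follow NPD's custom-plugin
# monitor config format, but the source, reason, path, and interval values
# here are placeholders, not taken from this PR.
cat > /config/iptables-mode-monitor.json <<'EOF'
{
  "plugin": "custom",
  "pluginConfig": {
    "invoke_interval": "24h",
    "timeout": "10s",
    "max_output_length": 80,
    "concurrency": 1
  },
  "source": "iptables-mode-monitor",
  "metricsReporting": true,
  "conditions": [],
  "rules": [
    {
      "type": "temporary",
      "reason": "IptablesBackendsMixed",
      "path": "/config/plugin/check_iptables_mode.sh",
      "timeout": "10s"
    }
  ]
}
EOF
```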

aojea commented 7 months ago

```
6: Error fetching NPD metrics: error fetching NPD metrics: {prow 35.188.211.187 curl http://localhost:20257/metrics
curl: (7) Failed to connect to localhost port 20257 after 0 ms: Couldn't connect to server
```

/test pull-npd-e2e-test

At first sight it does not look related to my PR.

danwinship commented 7 months ago

Based on Dan's feedback and the NPD docs, I'm going to update this to only report Events and metrics to get signal. With that signal we can assess the criticality of the problem with real data and promote it to a NodeCondition if necessary.

OK. I don't really know much about node-problem-detector...

hakman commented 6 months ago

/lgtm /cc @vteratipally @mmiranda96 @MartinForReal

vteratipally commented 6 months ago

/lgtm /approve

k8s-ci-robot commented 6 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, hakman, vteratipally

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/kubernetes/node-problem-detector/blob/master/OWNERS)~~ [vteratipally]

Approvers can indicate their approval by writing `/approve` in a comment. Approvers can cancel approval by writing `/approve cancel` in a comment.