containernetworking / plugins

Some reference and example networking plugins, maintained by the CNI team.
Apache License 2.0
2.17k stars 778 forks source link

nftables support for ipmasq and portmap #935

Open danwinship opened 1 year ago

danwinship commented 1 year ago

This adds nftables backends to the two remaining iptables-only bits in containernetworking/plugins; ipmasq (used by ptp and bridge) and portmap. For now, they are both set up to use iptables by default, unless you explicitly request nftables via the network config, or you run the plugin on a host that has nft installed but not iptables (eg, future RHEL 10 hosts).

The firewall plugin currently has "iptables" and "firewalld" backends. In theory it could have a separate explicit "nftables" backend, but I didn't do that here.

The ipmasq and portmap code implement nftables via https://github.com/danwinship/nftables, which is a library I started for the planned kube-proxy nftables backend, and also plan to use for various other nftables ports (eg, cri-o). It might move from danwinship/ to somewhere more generic at some point, but it might not.

After I started working on this branch, I discovered that the bridge plugin was actually already using nftables via pkg/link/spoofcheck.go, using the https://github.com/networkplumbing/go-nft library. Using two separate nftables libraries is probably not great, so we may want to fix that eventually. The libraries have somewhat similar APIs up to a point, then diverge completely because go-nft uses the JSON API to /sbin/nft and therefore has to use the (poorly-documented) JSON rule schema, while danwinship/nftables uses the plain text API to /sbin/nft and so uses "normal" nft rules. eg, compare:

func (sc *SpoofChecker) matchIfaceJumpToChainRule(chain, toChain string) *schema.Rule {
        return &schema.Rule{
                Family: schema.FamilyBridge,
                Table:  natTableName,
                Chain:  chain,
                Expr: []schema.Statement{
                        {Match: &schema.Match{
                                Op:    schema.OperEQ,
                                Left:  schema.Expression{RowData: []byte(`{"meta":{"key":"iifname"}}`)
},
                                Right: schema.Expression{String: &sc.iface},
                        }},
                        {Verdict: schema.Verdict{Jump: &schema.ToTarget{Target: toChain}}},
                },
                Comment: ruleComment(sc.refID),
        }
}

which in danwinship/nftables syntax would become:

func (sc *SpoofChecker) matchIfaceJumpToChainRule(chain, toChain string) *nftables.Rule {
        return &nftables.Rule{
                Chain: chain,
                Rule: nftables.Concat(
                        "iifname", sc.iface,
                        "jump", toChain,
                ),
                Comment: ruleComment(sc.refID),
        }
}

("Draft" PR because not fully-tested yet...)

danwinship commented 1 year ago
  [FAILED] Unexpected error:
      <*errors.errorString | 0xc0003bdb20>: 
      running [/usr/sbin/iptables -t nat -D POSTROUTING -s 10.1.2.2 -j CNI-43a5a67926c1a665ff4c21b7 -m comment --comment name: "testConfig" id: "dummy-0" --wait]: exit status 2: iptables v1.8.7 (nf_tables): Chain 'CNI-43a5a67926c1a665ff4c21b7' does not exist

sigh, this is what https://github.com/coreos/go-iptables/pull/107 was supposed to fix but I guess I fixed it for iptables-legacy but not iptables-nft?

(Was this not failing before? I'm not sure why this change would make the race condition suddenly start happening.)

aojea commented 3 months ago

linter fail, I see is still in draft, let me know if you want a review,

danwinship commented 1 month ago

OK, finally updated and tested:

(I guess this will need a docs update to the cni.dev repo?)

danwinship commented 1 month ago

/assign @squeed

danwinship commented 1 month ago

oh, and I could port this to use the same nftables library that spoofcheck uses, if you preferred.

or port the spoofcheck code to use knftables like kube-proxy (and Calico and soon some other stuff)

danwinship commented 2 weeks ago

or port the spoofcheck code to use knftables like kube-proxy

https://github.com/danwinship/plugins/commit/17bb5355

squeed commented 1 week ago

Looks basically good. I'd like to see GC support if possible, since it informs the design the API. Could you add that, even if the top-end API glue isn't there yet.