containernetworking / cni

Container Network Interface - networking for Linux containers
https://cni.dev
Apache License 2.0

[Proposal]: Filter plugin #151

Open jessfraz opened 8 years ago

jessfraz commented 8 years ago

Use Case: To be able to define iptables rules for a container network.

Ideally, this plugin would handle the following aspects of configuration for the CNI network:

{
    "name": "mynet",
    "type": "bridge",
    "bridge": "cni0",
    "isGateway": true,
    "ipMasq": true,
    "ipam": {
        "type": "host-local",
        "subnet": "10.22.0.0/16",
        "routes": [
            { "dst": "0.0.0.0/0" }
        ]
    },
    "filter": [
        {
            "dst": "10.10.10.100/32",
            "src": "0.0.0.0/0",
            "dport": 80,
            "connstate": "new",
            "index": 20,
            "target": "ALLOW"
        },
        {
            "dst": "0.0.0.0/0",
            "src": "172.64.0.0/13",
            "index": 10,
            "target": "DROP"
        }
    ]
}
jessfraz commented 8 years ago

ping @brianredbeard

brianredbeard commented 8 years ago

+1

I think that this would allow for a lot of novel configurations. Additionally, this removes certain operations requiring CAP_NET_ADMIN from the realm of container execution and properly puts them in the hands of a host administrator.

brianredbeard commented 8 years ago

Affects #138

jessfraz commented 8 years ago

Also, this would need to delegate IPv4 or IPv6 handling.

brianredbeard commented 8 years ago

Possibly the following additions:

  • Table (optional, defaults to FILTER)

Likely the originally proposed "Target" should be changed to "Chain" to more closely align with the configuration terminology from iptables.

For clarification, the combination of chain and table can be used to create both filter rules and relevant redirect rules at (seemingly) almost any stage of packet routing through the kernel [1]. While this allows for maximum flexibility, sane defaults should still be adopted, focused on the simple filtering use case, as that is likely the most common need.

[1]

                               XXXXXXXXXXXXXXXXXX
                             XXX     Network    XXX
                               XXXXXXXXXXXXXXXXXX
                                       +
                                       |
                                       v
 +-------------+              +------------------+
 |table: filter| <---+        | table: nat       |
 |chain: INPUT |     |        | chain: PREROUTING|
 +-----+-------+     |        +--------+---------+
       |             |                 |
       v             |                 v
 [local process]     |           ****************          +--------------+
       |             +---------+ Routing decision +------> |table: filter |
       v                         ****************          |chain: FORWARD|
****************                                           +------+-------+
Routing decision                                                  |
****************                                                  |
       |                                                          |
       v                        ****************                  |
+-------------+       +------>  Routing decision  <---------------+
|table: nat   |       |         ****************
|chain: OUTPUT|       |               +
+-----+-------+       |               |
      |               |               v
      v               |      +-------------------+
+--------------+      |      | table: nat        |
|table: filter | +----+      | chain: POSTROUTING|
|chain: OUTPUT |             +--------+----------+
+--------------+                      |
                                      v
                               XXXXXXXXXXXXXXXXXX
                             XXX    Network     XXX
                               XXXXXXXXXXXXXXXXXX
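
With the suggested optional table/chain keys, a filter entry might look like the following (a hypothetical sketch; neither key is part of any released spec, and the REDIRECT example simply illustrates a non-filter-table rule):

```json
{
    "dst": "10.10.10.100/32",
    "src": "0.0.0.0/0",
    "dport": 80,
    "index": 20,
    "table": "nat",
    "chain": "PREROUTING",
    "target": "REDIRECT"
}
```

With "table" omitted, the plugin would presumably fall back to the filter table, per the proposed default above.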
jessfraz commented 8 years ago

+1, and cool ascii chart :)

squaremo commented 8 years ago

This looks like something that's already been discussed a fair bit. May we know what prompted it? Context-establishing links gratefully welcomed :)

philips commented 8 years ago

@squaremo This is part of the context, I think: https://groups.google.com/forum/#!topic/cni-dev/FW3BCFJwAxY

steveej commented 8 years ago

@jfrazelle first, thanks for this detailed proposal!

As I understand this, your design would require adding a generic filter field to the spec, and we would need to figure out what a filter-type plugin is, which attributes are generic to all filter plugins, and which are specific to your proposed iptables plugin. In addition, main/meta plugins would have to implement filter-specific code, because as of now we do not have a proper mechanism for chaining plugins and adding global hooks/functionality.

In https://github.com/appc/cni/pull/138#issuecomment-197335704 it was already pointed out that adding a new generic attribute does not scale well, because every plugin then needs to add code and support for it. As an alternative, the functionality would be easier to integrate if it were a plugin of its own that could be run against a current network namespace.

Could you see this approach working here too?

tomdee commented 8 years ago

I second @steveeJ's point about this being implemented as a separate plugin. But I don't really understand the use case - is this just a way of encoding iptables rules in CNI config? Why not just run those rules directly?

This is also going to present problems with some of the backends - AFAIK iptables doesn't work with ipvlan.

brianredbeard commented 8 years ago

@tomdee / @squaremo I'll also leave this open for additional commentary by @jfrazelle, but yes, this is a way of encoding iptables rules in a CNI configuration. One motivator is the ability to predictably specify limitations on a network outside of a particular container (without requiring the container to have CAP_NET_ADMIN for its namespace). In another case, I want to be able to specify the exact networks, their filters, and topologies at the time of host provisioning. Saying "Why not run those rules directly?" is like saying "Why not just run a bash script to configure all of the bridges, bonds, VLANs, etc. for a host so that the containers can use them?" There's nothing in the bridge plugin (that I'm aware of) which couldn't be accomplished with sufficient bash-fu... that being said, it didn't stop @squaremo or @eyakubovich from working on it. Let's walk through an example:

I have a container that I want to block all egress traffic from unless it goes through a SOCKS5 proxy. Today this would require manual intervention by an administrator after the definition of a network in /etc/cni/net.d/malicious_network.conf to operate on the filter rules within that network namespace. It is desirable that these rules be defined as a part of the network so as to make the definition portable across Linux hosts.

In another example, let's say I'm defining a network (named: protected) for my LDAP service. I only want to allow inbound traffic from a defined DMZ network (named: dmz), where I will run a bastion process that will proxy access. In this case, defining both the dmz and protected networks, as well as their filtering relationship, would be ideal and would simplify the deployment of this service.

lxpollitt commented 8 years ago

@brianredbeard: I'm not sure I'm grokking how your example use cases would map to the proposed filter plug-in. Is the intention that the filter plugin provides the networks (e.g. dmz & protected) itself? Or does the filter plug-in chain on to another network plugin that is providing the networks?

(@jfrazelle: It would be good to chat to you about how we approach this kind of filtering in the Calico CNI plugin sometime - but this PR is probably not the best forum for that!)

squaremo commented 8 years ago

I'm not sure I'm grokking how your example use cases would map to the proposed filter plug-in. Is the intention that the filter plugin provides the networks (e.g. dmz & protected) itself?

I think the idea would be to compose a filter plugin "network" with other configuration that provides the actual network. (This composition is possible today just by adding more than one "network" to a namespace, but it could be made more convenient. See #147.)

Additionally this removes the need for certain operations requiring CAP_NET_ADMIN from the realm of container execution and properly puts it in the hands of a host administrator.

I like this point, and I think it strongly supports the case for a filter plugin.
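
Concretely, the composition described above might be expressed as a filter-only network definition that a runtime applies to the same namespace after the bridge network (a hypothetical sketch; no standalone "filter" plugin type exists, and the "rules" key name is an assumption):

```json
{
    "name": "mynet-filter",
    "type": "filter",
    "rules": [
        { "src": "172.64.0.0/13", "dst": "0.0.0.0/0", "index": 10, "target": "DROP" }
    ]
}
```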

lxpollitt commented 8 years ago

I would love to see CNI supporting networking policy (at a minimum filtering, but other policy too in time) in a way that is implementation neutral. I would also like to see it being dynamic and not just at container creation time. I am not sure that defining a new plug-in type to manipulate iptables achieves either of those goals. I'm struggling to see how it wouldn't introduce significant assumptions / dependencies on the underlying networking plugin implementation, which I think is a bad thing.

(I think that one of the key reasons why CNI is gaining such good traction with the networking community is because it makes very few assumptions about the underlying network solution implementation. This has allowed a broad range of network solution vendors to map to CNI relatively easily. Orchestration systems such as Kubernetes and Mesos are adopting CNI because it supports this broad range of network solutions. I personally would advocate that any new CNI plug-in types follow a similar philosophy as much as is practically possible.)

philips commented 8 years ago

I agree with @lxpollitt. For now I want to focus on CNI being awesome at setting up connectivity for a container and tearing that connectivity down when it dies.

If we are going to start doing policy, we need to get a few folks to put a proposal together to define the goals and non-goals. For example, in nearly every non-trivial use case a container is going to need dynamic egress and ingress rules as its neighbors (load balancers, dependent services) die and are rescheduled. And I don't think the current CNI plugin model is ideal for this sort of dynamic configuration.

salv-orlando commented 8 years ago

I am also interested in seeing how the CNI interface will allow for shaping container ACLs. @lxpollitt and @philips correctly pointed out their dynamic nature - unless the plan is to have a static set of ACLs for the whole container's lifetime.

Unlike the IPAM plugin, which is only invoked from the "main" plugin, could there in this case be a hook for doing a "filter refresh" - maybe CNI_FILTER_REFRESH? The container runtime might invoke it whenever there's a change in the filter rules that need to be applied to the container interface.

steveej commented 8 years ago

@salv-orlando

could there be a case for a hook for doing "filter refresh" - maybe CNI_FILTER_REFRESH? The container runtime might invoke it whenever there's a change in the filter rules that need to be applied to the container interface.

This is one of the use-cases for the UPDATE command we've been discussing in #89.

eyakubovich commented 8 years ago

The point of having CNI is to allow for implementation choice of an abstract action: to be able to choose how to connect to a particular network (a network being a set of endpoints able to talk to each other). A CNI plugin is not meant to run some arbitrary piece of code, even if it is networking related.

The OP proposes to add machinery to install iptables rules. I agree that this might be useful for the operator to do. But there are not multiple ways of doing this -- it's a very specific action, not an abstract one. It's the same as the "tuning" plugin (https://github.com/containernetworking/cni/blob/master/plugins/meta/tuning/tuning.go) that made its way in. There's no doubt that it's useful to set sysctls prior to a container being launched, and a container runtime/orchestrator can implement some generic customization hook mechanism for that purpose. But I would argue that CNI is not it.

Another issue is that there is probably container networking code that would be useful to reuse across multiple container projects. If that is the case, it should be factored out into libraries and submitted under the github.com/containernetworking org.

philips commented 8 years ago

+1, I don't think we should be doing firewall configuration here. And I agree the tuning plugin seems strange too. I think this issue should be closed.
