deis / controller

Deis Workflow Controller (API)
https://deis.com
MIT License

Container network segregation / firewalling #1216

Open deis-admin opened 7 years ago

deis-admin commented 7 years ago

From @carmstrong on May 16, 2014 1:8

Application containers don't need access to etcd, for example.

Copied from original issue: deis/deis#986

deis-admin commented 7 years ago

From @gabrtv on May 16, 2014 20:5

As it stands today, deployed containers can access anything on the network including unrestricted services running on the CoreOS host (like etcd). To fix this, we need to deploy a unit file that configures iptables on all Deis machines.

Here is a first stab at network security requirements for containers:

Deis containers are a special case as they require access to etcd. How we accomplish this is an interesting problem. We may be able to launch a sidekick proxy or leverage some more advanced Docker networking.

I see 4 steps:

  1. Think through attack vectors from malicious containers and agree on the requirements
  2. Prototype an iptables unit that locks down traffic according to those requirements
  3. Prototype a mechanism for allowing certain containers restricted access to etcd (proxy?)
  4. Deploy the new iptables unit file across all Deis hosts using custom user-data
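
Step 2 above could start from something like the following sketch. It only builds the rule strings and does not apply them. It assumes containers sit behind the default Docker bridge `docker0` and that etcd listens on its legacy ports 4001 (client) and 7001 (peer); the helper name `build_drop_rules` and the choice of `DROP` are illustrative, not anything Deis ships.

```python
# Hypothetical helper: build iptables commands that stop bridged containers
# from reaching host-local services. Nothing here touches the firewall.

ETCD_PORTS = (4001, 7001)  # assumed etcd client and peer ports


def build_drop_rules(bridge="docker0", ports=ETCD_PORTS):
    """Return one iptables command (as an argv list) per blocked port."""
    rules = []
    for port in ports:
        rules.append([
            "iptables", "-I", "INPUT",   # insert at the top of the INPUT chain
            "-i", bridge,                # traffic arriving from the container bridge
            "-p", "tcp",
            "--dport", str(port),
            "-j", "DROP",
        ])
    return rules


if __name__ == "__main__":
    for rule in build_drop_rules():
        print(" ".join(rule))
```

This only covers traffic addressed to the host itself. The restricted etcd access for Deis containers from step 3 would still need a separate exception.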

Thoughts?

deis-admin commented 7 years ago

From @moretea on July 10, 2014 15:38

What about starting by binding critical services, such as Docker and etcd, to $COREOS_PRIVATE_IPV4?

deis-admin commented 7 years ago

From @bacongobbler on July 11, 2014 8:49

@moretea could you please expand on that a bit? I'm not quite sure I understand the question.

deis-admin commented 7 years ago

From @gust1n on July 11, 2014 10:47

We heavily use etcd for watching the /services path for discovering new nodes and connecting over zeroMQ (not using your router). Restricting access to this would make us poll the DNS router instead which would make life harder for us...

deis-admin commented 7 years ago

From @benmccann on July 16, 2014 14:44

Wouldn't some applications want access to etcd for their own use? It seems like use of etcd authentication and ACLs may be a solution which allows applications access to etcd without being able to access Deis's internally used etcd keys.

deis-admin commented 7 years ago

From @carmstrong on July 22, 2014 17:39

As far as I know, there is no way to support keyspacing in etcd. They support client SSL certs, but no way to restrict keys based on different clients. ACL support is tracked in coreos/etcd#91, but there haven't been any updates in quite some time...

If anyone has a better way to handle this before etcd implements ACL support, we're definitely interested!

deis-admin commented 7 years ago

From @paulczar on August 10, 2014 21:57

The most important step here is to bind etcd to the private network only on the hosts. This way external entities cannot access it.

You could also run etcd in a container that also runs iptables. That way you can restrict access regardless of whether or not the host (CoreOS) supports firewalling.
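
A quick sanity check for the "bind to the private network only" idea might look like the sketch below. It uses Python's standard `ipaddress` module; the function name `is_private_bind` is hypothetical, and rejecting the wildcard address is an assumption about what "private only" should mean here.

```python
import ipaddress


def is_private_bind(address):
    """True if `address` is a concrete private or loopback IP, not a wildcard."""
    ip = ipaddress.ip_address(address)
    if ip.is_unspecified:        # 0.0.0.0 or :: listens on every interface
        return False
    return ip.is_private or ip.is_loopback


if __name__ == "__main__":
    for candidate in ("10.132.0.4", "0.0.0.0", "8.8.8.8"):
        print(candidate, is_private_bind(candidate))
```

Note that this only says whether the listen address is private. Containers on the same host can still reach a privately bound service, which is the gap the iptables approach is meant to close.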

deis-admin commented 7 years ago

From @davedoesdev on September 11, 2014 8:15

Is there a way to stop one application from accessing another application?

deis-admin commented 7 years ago

From @gabrtv on September 11, 2014 14:44

@davedoesdev not currently. In fact, that goes a level beyond what we're proposing here.

If you have time I would love to have you write up the use-case in a separate GitHub issue so we can come up with some proposals for addressing it.

deis-admin commented 7 years ago

From @iamveen on September 24, 2014 7:7

Perhaps some inspiration can be drawn from the geard implementation.

I think it would be pretty awesome if fleet had something like this built-in, rather than relying on docker linking. Something about seeing all those env vars just gets under my skin :\

deis-admin commented 7 years ago

From @davedoesdev on September 24, 2014 8:33

It'd be great to have support for https://github.com/zettio/weave/

deis-admin commented 7 years ago

From @intellix on September 24, 2014 8:53

Also confused by this issue. I thought etcd was primarily for discovering services in your application. Like my API asking etcd for the host/port of my database. If it can't access etcd then how do I know where the services exist?

deis-admin commented 7 years ago

From @carmstrong on September 24, 2014 17:14

"I thought etcd was primarily for discovering services in your application."

etcd is for the Deis control plane's services to coordinate with each other. Applications shouldn't be concerning themselves with etcd, and instead should be using environment variables to configure things like database endpoints.
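
As a rough illustration of the environment-variable approach, an application could read its database endpoint like this. The variable name `DATABASE_URL` and the example URL are assumptions made for the sketch, not something this thread specifies.

```python
import os
from urllib.parse import urlsplit


def database_endpoint(environ=os.environ):
    """Return (host, port) parsed from the assumed DATABASE_URL variable."""
    url = urlsplit(environ["DATABASE_URL"])
    return url.hostname, url.port


if __name__ == "__main__":
    # Hypothetical value; a real one would be set in the app's configuration.
    example = {"DATABASE_URL": "postgres://user:secret@db.example.com:5432/app"}
    print(database_endpoint(example))
```

With this pattern the application never talks to etcd directly; whoever operates the platform decides what value the variable carries.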

deis-admin commented 7 years ago

From @davidillsley on November 16, 2014 13:27

I'm wary of the statement "In practice, this is really only a concern when clusters are running untrusted applications.". It's also a concern when you have multiple applications, one of which has a security vulnerability allowing the application access to etcd to be exploited.

Is this the appropriate forum to discuss changing that text, or should I raise another issue?

deis-admin commented 7 years ago

From @carmstrong on November 16, 2014 20:50

@davidillsley That's a great point. I think this issue is a great place to discuss appropriate changes.

deis-admin commented 7 years ago

From @blaggacao on January 20, 2015 9:27

Out of curiosity, has any thought been given to swapping etcd for Consul as the K/V backend? It comes with ACLs. It might be sensible to continue using etcd for the control layer until the guys at CoreOS catch up, and meanwhile introduce Consul for the application panel.

@carmstrong wouldn't environment variables cause trouble when they are changing and dynamic in the context of the application runtime? I'm not sure what that would be good for, but my best guess is some low-level multitenancy that is managed by the application itself (or better: some intermediate controller code). Let's say dynamic database creation with automated credentials creation and propagation. There we are still on the hosting level.

Update: Maybe we could conceptually think of a platform controller panel, with its highly segregated, lightweight, autochthonous backing services and platform-wide first-level backing services (e.g. ceph, dns, networking), and a more flexible application controller panel which implements opinionated second-level backing services (e.g. DB, K/V, monitoring, etc.), but in the sense of "opinionated by the operator towards the developer", not so much "by the publisher towards the operator"...

deis-admin commented 7 years ago

From @gabrtv on January 21, 2015 17:58

"Out of curiosity, has any thought been given to swapping etcd for Consul as the K/V backend?"

Yes, we have actively explored Consul as it can potentially add a lot of value around health monitoring and service discovery. Unfortunately, it's also a lot of duplicated infrastructure (another raft cluster) and a significant engineering effort. As a result, it's not a high priority at this moment.

In the interest of providing better network segregation and security, how do folks feel about:

  1. Separate k/v stores (and raft clusters) for platform-level and application-level concerns?
  2. Enforcing an overlay network to isolate application traffic from platform traffic?

deis-admin commented 7 years ago

From @blaggacao on January 22, 2015 3:44

:+1:

  1. As a "soft" design rule / documentation: make the better part of the application file system(s) read-only? I'm not sure whether this is really an effective measure, but it seems to me that in 12-factor apps this becomes an option. Or did I conceptually miss something?

deis-admin commented 7 years ago

From @bacongobbler on January 22, 2015 6:37

The filesystem as per Heroku's cedar stack is ephemeral, so that's been the design we've been following. Same goes for any other PaaS out there*. Is there a specific reason that a read-only filesystem would be necessary?

Note: Heroku used to have a read-only filesystem in their bamboo stack which you could only write to ./tmp and ./logs, but they deprecated that in favour of documenting their dynos as ephemeral when they migrated over to the cedar stack.

*: Stackato, Cloud Foundry, Flynn, Dokku, etc.

deis-admin commented 7 years ago

From @blaggacao on January 22, 2015 7:49

Beware, this is not an in-depth opinion of mine, and it may be in the wrong tradeoff scope.

The intention was to lock down the runtime app rather completely against injection of malicious tools, as it's probably the app itself which is the vulnerable point. I think this should never be enforced in any way; I just thought of building my containers this way and deploying them as container images directly.

I'm just thinking out loud; I don't know much about actual attack patterns. However, not being able to write to a filesystem sounds secure. Extra compromising tools would have to be loaded and executed directly in RAM, and the persistent store can be specifically monitored for, or closed down against, improper write access. It would just make any attack pattern involving fs writes more difficult.

Maybe this could be an interesting pattern for (some parts of) the control plane as well, reducing the attack surface.

deis-admin commented 7 years ago

From @apps4u on February 5, 2015 0:19

I'm not an expert in Deis or CoreOS, so this might be a bad idea, but would there be a way to create VLANs, then have Deis containers on a VLAN that can access the required services, but have all other containers on a different VLAN that is locked out from accessing things like etcd? This seems like an easy solution, so I'm guessing this won't work or someone would have thought of it, but I thought I might as well ask.

deis-admin commented 7 years ago

From @wenzowski on February 24, 2015 5:20

Seems to me that #3072 must be considered when undertaking this if it is to be implemented in iptables like geard does. In terms of etcd access, what about binding keys to environment variables on the container and SIGHUP'ing it whenever the environment variables change?

deis-admin commented 7 years ago

From @azurewraith on February 25, 2015 19:39

In the interest of providing better network segregation and security, how do folks feel about:

  • Separate k/v stores (and raft clusters) for platform-level and application-level concerns?
  • Enforcing an overlay network to isolate application traffic from platform traffic?

Seems reasonable to me. What is the game plan for driving this home?

deis-admin commented 7 years ago

From @carmstrong on February 25, 2015 21:34

"Seems reasonable to me. What is the game plan for driving this home?"

For a change like this, we'd like the larger Deis community's input. Next step would be to open a proposal PR (see #2911 for a good example) adding documentation around the segregation, as if it had already been implemented. Once everyone agrees on that, implementation can commence with a high degree of confidence that it will be merged without significant changes.

deis-admin commented 7 years ago

From @Brandl on May 1, 2015 17:15

Dear Deis Developers,

I came across your software today and read through your documentation. It seems really promising and close to what I am looking for. I was really scared when I came across the security section:

"Deis is not suitable for multi-tenant environments or hosting untrusted code."

If this is actually the case, I would call this a security-critical bug, because if there actually is a scenario where getting a web app hacked poses a threat to all other containers and the actual host, that would be terrifying.

Maybe I'm just misinterpreting this issue, so what is the actual impact of this?

deis-admin commented 7 years ago

From @apps4u on May 2, 2015 2:44

Just don't run this if you are letting in users whom you don't trust, like random online users. It's not designed to work like that. It's safe when, say, a company runs all its containers on the platform.

deis-admin commented 7 years ago

From @azurewraith on May 2, 2015 15:50

@Brandl has a point: even if you trust all your users, a security breach (after all, we are on the cloud) can jeopardize the security of the other containers / Deis core.

deis-admin commented 7 years ago

From @gabrtv on May 2, 2015 18:56

@Brandl you are correct that a compromised container can pose a threat to other containers and anything accessible over the network (including the underlying host).

Ideally containers would run on an isolated-by-default network segment with explicit access grants to endpoints as permitted (e.g. other containers, control plane infrastructure and external/third-party services). Unfortunately, achieving this level of network security is non-trivial and we want to be open about that, hence the disclosure in our documentation.

I do want to be clear that there is a big difference between running untrusted code by design and running it by accident via application compromise. While the net result is the same (untrusted code running inside the cluster), the latter scenario is no different in Deis than what you'll find on comparable container platforms or orchestration systems like Mesos, Kubernetes, etc.

We do intend to support multi-tenant environments eventually. We are exploring technologies along the lines of ambassadord and Apcera's semantic pipelines (not open source) to help us get there. We also need help from upstream projects like Docker and etcd. However, until we undergo an audit by a 3rd-party security vendor, it's unlikely we will remove the disclaimer re: true multi-tenancy.

deis-admin commented 7 years ago

From @apps4u on May 3, 2015 4:38

That is correct, but they are two different things: having your application hacked is different from running code that is designed to be unsafe. Running code in a container that is designed to bring down your cluster could happen if you run untrusted code. But if your application was hacked and someone got access to the underlying system, that is not a Deis issue; it would also be an issue if you ran a CoreOS cluster or a Docker cluster.
So yes, the platform is as secure as any other similar system, but if you let someone run a container that was designed to cause you issues, then that could be a problem. Until Deis is truly safe to run in a multi-tenant environment, you should make sure you run trusted code. Then you will be safe unless you leave a hole in your application, but that risk is there whether you run Deis or not.

deis-admin commented 7 years ago

From @bacongobbler on July 16, 2015 17:44

related: #3812

deis-admin commented 7 years ago

From @nlsrchtr on September 24, 2015 8:6

Maybe the Authentication and Authorization features introduced in etcd 2.1 could help solve part of the problem. Combining them with network-layer security, as drafted in #3812, would make it even better.

deis-admin commented 7 years ago

From @jokeyrhyme on November 10, 2015 0:24

Regarding @nlsrchtr's point, it does seem as though key prefixes can now be used to restrict access in etcd: https://github.com/coreos/etcd/issues/2384. From my naive glance at the Godeps, it seems etcd 2.0 is being used. Is there work underway to upgrade to etcd 2.1 or newer?
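
To make the key-prefix idea concrete, here is a deliberately simplified model of prefix-based permissions. It is not etcd's API or its actual matching code; the role layout, the trailing-`*` convention as modelled here, and the `/apps/foo/` and `/deis/` paths are assumptions for illustration only.

```python
def path_matches(pattern, key):
    """Match `key` against `pattern`; a trailing '*' matches any suffix."""
    if pattern.endswith("*"):
        return key.startswith(pattern[:-1])
    return key == pattern


def is_allowed(role, key, write=False):
    """Check a key against a role's read or write patterns."""
    patterns = role["write"] if write else role["read"]
    return any(path_matches(p, key) for p in patterns)


# Hypothetical role for an application that may read its own subtree
# and write only one key inside it.
APP_ROLE = {"read": ["/apps/foo/*"], "write": ["/apps/foo/state"]}

if __name__ == "__main__":
    print(is_allowed(APP_ROLE, "/apps/foo/config"))
    print(is_allowed(APP_ROLE, "/deis/controller/secret"))
```

Under a scheme like this, an application could keep using etcd for its own keys while the platform's internal keys stay out of reach.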

deis-admin commented 7 years ago

From @bacongobbler on May 26, 2016 18:0

bumping this thread now with v2-related discussion. To bring us all up to speed on the release candidate:

The last thing to resolve would be true network segregation based on which namespace the application is deployed. The application should be able to communicate with any other apps deployed in the same namespace to facilitate #4173, but not with the kubernetes API server, the Workflow API server, or any other application outside of its own namespace (internally, at least). Applications should still be able to go surf the public web, as well as target applications through the router. For example, application foo can talk to application bar via http://bar.example.com (through the router), but not through the pod's IP address (10.247.144.126) or through DNS ($ ping bar) unless bar happened to be deployed in the same namespace as foo.
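
The rules in the paragraph above can be written down as a small decision function. This is only a sketch of the intended policy, not an implementation of any Kubernetes or Workflow mechanism; the `Target` type, its fields, and the control-plane labels are hypothetical names.

```python
from dataclasses import dataclass

# Hypothetical labels for control-plane endpoints that apps must not reach
# from inside the cluster.
CONTROL_PLANE = {"kubernetes-api", "workflow-api"}


@dataclass(frozen=True)
class Target:
    name: str
    namespace: str = ""       # empty for anything outside the cluster
    via_router: bool = False  # reached through the router's public hostname
    external: bool = False    # public internet


def may_connect(source_namespace, target):
    """Apply the segregation rules described above to one connection."""
    if target.external or target.via_router:
        return True           # public web and router traffic stay open
    if target.name in CONTROL_PLANE:
        return False          # no direct access to the API servers
    return target.namespace == source_namespace


if __name__ == "__main__":
    bar_direct = Target("bar", namespace="bar-ns")
    bar_routed = Target("bar", namespace="bar-ns", via_router=True)
    print(may_connect("foo-ns", bar_direct), may_connect("foo-ns", bar_routed))
```

In the foo/bar example, the routed request to http://bar.example.com is allowed, while the direct pod-IP or DNS path is allowed only when both apps share a namespace.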