30x / project-management

Tasks not specific to a given project, exploratory stuff and project management

Separate the router namespace from the build-deploy namespace #135

Closed by mpnally 8 years ago

mpnally commented 8 years ago

The routers are a fundamental part of the cluster that are used by every app in every namespace in the cluster. The deployment (enrober) and build (shipyard) applications are Apigee applications — for example they use Apigee SSO. I do not think they should be in the same namespace as the routers.

whitlockjc commented 8 years ago

We need to figure out our strategy for using namespaces to organize deployments. Regardless, this is what this effort entails:

  1. Updating the shipyard-deployment scripts
  2. Updating (potentially) the authz and NetworkPolicy rules

noahdietz commented 8 years ago

What do we want to do here? Most likely what will happen is that I'll need to add another NetworkPolicy to open traffic from whatever the k8s-router namespace will be. This will mean a new congress image version, so if we can have a quick decision, that'd be great :) @whitlockjc @mpnally

mpnally commented 8 years ago

Here is a simplified proposal we could implement before beta:

whitlockjc commented 8 years ago

To me, the routers and congress are cluster-wide features that are not specific to Shipyard. So unless we have a better place to put them, I'd suggest they stay in apigee. That means that enrober and kiln need to be in the same namespace called shipyard as that will be where we deploy any Shipyard-specific stuff. Thoughts?

mpnally commented 8 years ago

I'm trying to get you to give up the name Apigee. I think that the namespace 'Apigee' should be reserved for applications that are deployed by Apigee on top of shipyard/kubernetes, not for applications that are used to implement shipyard/kubernetes. I don't really care what namespaces the shipyard team uses to implement routers, congress, kiln, and enrober, so long as it is not Apigee. The name Apigee should be reserved for the broader Apigee community to deploy their apps, not land-grabbed by the shipyard team.

Also, we separated kiln from enrober because kiln needed to have special privileges in order to create Docker images. We wanted to minimize the amount of code running with those special privileges, so we separated kiln into its own namespace. I think this was a good decision that we should keep.

On Tue, Sep 6, 2016 at 1:54 PM, Jeremy Whitlock notifications@github.com wrote:

To me, the routers and congress are cluster-wide features that are not specific to Shipyard. So unless we have a better place to put them, I'd suggest they stay in apigee. That means that enrober and kiln need to be in the same namespace called shipyard as that will be where we deploy any Shipyard-specific stuff. Thoughts?


Martin Nally, Apigee Engineering

whitlockjc commented 8 years ago

So unless we have a better place to put them, I'd suggest they stay in apigee.

I've already agreed that apigee is not the place for congress and k8s-router but we are yet to make a better suggestion as to where they should be. Basically, congress and k8s-router should be deployed somewhere where the deployments for that namespace are treated as cluster-wide, infrastructure deployments.

As for enrober and kiln, I don't think we have the security issue we thought we did. With network isolation in place, and the DenyEscalatingExec admission controller, we should not need to do this separation. All we have to do is enable this at the cluster level, something we should be doing based on best practices anyways, and update our kiln deployment.

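To avoid continuing the loop, here are my suggestions for where our infrastructure deployments should be:

  • apigee-infrastructure
  • apigee-system
  • cluster-infrastructure
  • cluster-system
  • global-infrastructure

Those are the best I can come up with off the top of my head.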

mpnally commented 8 years ago

Suggest we write a design for this. What was the 'special privilege' that kiln needed? I think it had something to do with the ability to access the Docker daemon on the host, but I'm not sure. I assume this privilege was granted at the namespace level, and therefore the goal is to have as little software as possible running in the namespace that has this privilege.

shipyard is not the place for these cluster-wide features

That probably depends on what you think Shipyard means. I'm not sure I know, but you seem to have a strong opinion. Can you explain what it means to you?

On Tue, Sep 6, 2016 at 3:55 PM, Jeremy Whitlock notifications@github.com wrote:

So unless we have a better place to put them, I'd suggest they stay in apigee.

I've already agreed that apigee is not the place for congress and k8s-router but we are yet to make a better suggestion as to where they should be. Basically, congress and k8s-router should be deployed somewhere where the deployments for that namespace are treated as cluster-wide, infrastructure deployments. I agree that the aforementioned deployments should not be in apigee but we are yet to come up with a suggestion and shipyard is not the place for these cluster-wide features.

As for enrober and kiln, I don't think we have the security issue we thought we did. With network isolation in place, and the DenyEscalatingExec admission controller, we should not need to do this separation. All we have to do is enable this at the cluster level, something we should be doing based on best practices anyways, and update our kiln deployment.

To avoid continuing the loop, here are my suggestions for where our infrastructure deployments should be:

  • apigee-infrastructure
  • apigee-system
  • cluster-infrastructure
  • cluster-system
  • global-infrastructure

Those are the best I can come up with off the top of my head.


Martin Nally, Apigee Engineering

mpnally commented 8 years ago

Maybe the confusion is mostly about labels. If we thought of the system in layers, we might have:

  1. Layer 0 - Kubernetes
  2. Layer 1 - Routers and congress. [I would be tempted to call this layer 'shipyard', but I think you have a different view of what that word means.] There could be a competing layer 1 on the same cluster that used a different 'ingress' on a different port and a different set of isolation policies.
  3. Layer 2 - Enrober and Kiln. It remains to be seen whether Enrober is needed long-term or whether it can be reimplemented as a set of extensions/constraints on kubectl. Kiln is not used in every scenario.

Does that match your view (naming apart)?

mpnally commented 8 years ago

Naming is hard. I think the list of names you gave is good. I can think of a few others, but I'm not sure they are better. Some of the things this does are routing, isolation, multi-tenancy. I don't know if that gives you any ideas.

On a somewhat-related topic, is there a design document for Congress that describes exactly what it does and why? I was a bit surprised by Noah's comment that a namespace name change would require a code change in Congress.

whitlockjc commented 8 years ago

I'm guessing here, and will investigate officially, but congress creates NetworkPolicy objects and part of that is allowing the namespace where the routers exist to reach all namespaces. We should make this a configuration item. I'll get back to you.

whitlockjc commented 8 years ago

So, here is where congress comes into play with this change. Since the routers cross the namespace boundary, congress creates a bridge NetworkPolicy that allows all communication to/from the apigee namespace. So if we move the routers out of apigee, we'd need to update congress. We should probably expose the router namespace as an environment variable instead of hard-coding it.
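
As a rough illustration of the environment-variable idea, here is a minimal Python sketch. It is not congress's actual code: the variable name ROUTER_NAMESPACE, the policy name allow-routers, and the namespace label key "name" are all hypothetical, and the manifest uses the current networking.k8s.io/v1 schema, which may differ from what congress emitted at the time. It also covers only ingress from the router namespace, not traffic in both directions.

```python
import json
import os

# Hypothetical names: the environment variable, the default, and the label key
# are assumptions for illustration, not taken from congress itself.
ROUTER_NAMESPACE_ENV = "ROUTER_NAMESPACE"
DEFAULT_ROUTER_NAMESPACE = "apigee"
NAMESPACE_LABEL_KEY = "name"


def router_namespace(environ=os.environ):
    """Return the namespace the routers run in, falling back to the default."""
    value = environ.get(ROUTER_NAMESPACE_ENV, "").strip()
    return value or DEFAULT_ROUTER_NAMESPACE


def bridge_policy(target_namespace, environ=os.environ):
    """Build a NetworkPolicy manifest admitting ingress from the router namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-routers", "namespace": target_namespace},
        "spec": {
            "podSelector": {},  # empty selector: applies to every pod in the namespace
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [
                        {
                            "namespaceSelector": {
                                "matchLabels": {
                                    NAMESPACE_LABEL_KEY: router_namespace(environ)
                                }
                            }
                        }
                    ]
                }
            ],
        },
    }


if __name__ == "__main__":
    # Prints the manifest for a hypothetical tenant namespace called "my-team".
    print(json.dumps(bridge_policy("my-team"), indent=2))
```

With something along these lines, moving the routers would mean changing the deployment's environment rather than shipping a new image. Note that a namespaceSelector matches namespace labels, so the router namespace would need the assumed label applied.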

Since this is not something we are doing right now, we don't need to address this. As of now, enrober and kiln have been put into the shipyard namespace and all communication has been verified. We have also set up our clusters to use the DenyEscalatingExec admission controller, so we should be able to mark the kiln container as privileged without the concern that you could escalate privileges from that container, which was the original reason enrober and kiln were in separate namespaces.
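
To make the DenyEscalatingExec reasoning concrete, the following Python sketch approximates the documented behaviour of that admission controller: exec and attach requests are refused for pods that run privileged containers or share the host PID or IPC namespace. It is a simplified model for illustration, not the Kubernetes implementation, and the kiln pod spec shown (including its image name) is hypothetical.

```python
def allows_host_access(pod):
    """Return True if the pod spec runs with privileges that allow host access."""
    spec = pod.get("spec", {})
    # Sharing the host PID or IPC namespace counts as escalated access.
    if spec.get("hostPID") or spec.get("hostIPC"):
        return True
    # Any privileged container in the pod also counts.
    for container in spec.get("containers", []):
        security = container.get("securityContext") or {}
        if security.get("privileged"):
            return True
    return False


def exec_permitted(pod):
    """Model the admission decision: deny exec/attach into escalated pods."""
    return not allows_host_access(pod)


# Hypothetical pod specs, for illustration only.
kiln_pod = {
    "metadata": {"name": "kiln", "namespace": "shipyard"},
    "spec": {
        "containers": [
            {
                "name": "kiln",
                "image": "example/kiln",  # placeholder image name
                "securityContext": {"privileged": True},
            }
        ]
    },
}
enrober_pod = {
    "metadata": {"name": "enrober", "namespace": "shipyard"},
    "spec": {"containers": [{"name": "enrober", "image": "example/enrober"}]},
}

if __name__ == "__main__":
    print("exec into kiln permitted:", exec_permitted(kiln_pod))
    print("exec into enrober permitted:", exec_permitted(enrober_pod))
```

Under this model the privileged kiln container can share a namespace with enrober, because nobody can exec into it to borrow its privileges, while ordinary pods remain reachable with exec as usual.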