kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

Cross-namespace Ingress #17088

Closed paralin closed 8 years ago

paralin commented 8 years ago

As far as I can tell right now it's only possible to create an ingress to address services inside the namespace in which the Ingress resides. It would be good to be able to address services in any namespace.

It's possible that I'm missing something and this is already possible - if so it'd be great if this was documented.

bprashanth commented 8 years ago

Nope, not allowing this was a conscious decision (but one that I can be convinced against). Can you describe your use case? The beta model partitions users on namespace boundaries and disallows service sharing across namespaces. You might argue that you want a single load balancer for the entire cluster, to which I ask: what is in the namespaces? i.e. why not use one namespace if you want to allow sharing?

paralin commented 8 years ago

@bprashanth I'm running multiple projects on a cluster - kubernetes tests, blog, API for a project. I want to address these as subdomains on my domain using a single ingress controller because load balancers and IP addresses are expensive on GCE.

liggitt commented 8 years ago

It would be good to be able to address services in any namespace.

It was intentionally avoided. Cross namespace references would be a prime source of privilege escalation attacks.

cc @kubernetes/kube-iam

paralin commented 8 years ago

I'll close this for now, makes sense.

thockin commented 8 years ago

FWIW you can set up a Service in namespace X with no selector and a manual Endpoints that just lists another Service's IP. It's yet another bounce, but it seems to work. :)
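A minimal sketch of that trick, assuming hypothetical names and a hypothetical ClusterIP (the Service has no selector, so Kubernetes creates no Endpoints for it and you maintain the Endpoints object yourself):

  # Selector-less Service in namespace x
  apiVersion: v1
  kind: Service
  metadata:
    name: other-api          # hypothetical name
    namespace: x
  spec:
    ports:
    - port: 80
  ---
  # Manually maintained Endpoints object with the same name as the Service,
  # listing the ClusterIP of the Service in the other namespace
  apiVersion: v1
  kind: Endpoints
  metadata:
    name: other-api          # must match the Service name
    namespace: x
  subsets:
  - addresses:
    - ip: 10.0.0.42          # hypothetical ClusterIP of the target Service
    ports:
    - port: 80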

krancour commented 8 years ago

I would tend to imagine the use case that @paralin described is common. I'm looking at an ingress controller as a system component and a means of reflecting any service in the cluster to the outside world. Running one (perhaps even in the kube-system namespace) that can handle ingress for all services just seems to make a lot of sense.

bprashanth commented 8 years ago

Cross namespace references would be a prime source of privilege escalation attacks.

That depends on what you have in your namespaces, right (which is why I asked for clarification)? Isn't it only risky iff you're partitioning users across a namespace security boundary?

bprashanth commented 8 years ago

There seems to be demand for cross-namespace Ingress-to-Service resolution. We should at least reconsider.

erictune commented 8 years ago

I think we want some kind of admission controller which does:

if req.Kind != "Ingress" { return }
ingress := req.AsIngress()
for each serviceRef field in ingress {
  if req.User is not authorized to modify the Service pointed to by serviceRef {
    reject the request
  }
}

Then, to modify an Ingress you have to have an owner-like permission on all the services it targets.

paralin commented 8 years ago

Might be good to revisit this now in 2016 :)

bprashanth commented 8 years ago

@kubernetes/kube-iam thoughts/volunteers to implement an admission controller? Do we authorize based on the user field of a request today or is that unprecedented?

liggitt commented 8 years ago

to modify an Ingress you have to have an owner-like permission on all the services it targets.

I think I'd want some record or indication of the cross-namespace relationship to exist, so the targeted service could know it was exposed. I want to avoid the scenario where someone had access to a service (legitimately or otherwise), set up ingress from other namespaces, then had their access removed and continued accessing the services without the service owner's awareness.

Do we authorize based on the user field of a request today or is that unprecedented?

The authorization layer is based on the user info on a request. This would be the first objectref authorization I know of.

erictune commented 8 years ago

@liggitt raises some good concerns. I broke them down into two cases when thinking about them.

  1. Assuming everyone is trustworthy, it might still be hard to reason about the network security of a service just by looking at the Service object (or just by looking at objects in the same namespace). It might be misconfigured.
    • I agree with this, to a point.
    • However, creating an object that represents a connection between two services seems like it would scale poorly.
    • We need a solution that scales with the number of services, not the number of interconnections, I think.
  2. Assuming there is someone untrustworthy, they can misconfigure the network in a way where the misconfiguration persists after some of their access is revoked.
    • Yes. But we have this problem worse with pods, configmaps, etc. The bad actor might have run pods that are doing the wrong thing, and auditing this is very hard.

thockin commented 8 years ago

Is this moving into the topic of micro-segmentation and access policy?

erictune commented 8 years ago

yes.

ghost commented 8 years ago

The two main models proposed for network segmentation are:

1) decorate Services (and maybe Pods) with a field indicating "allow-from". This basically allows one to draw the directed graph of an application, sort of.

2) implement a "policy group" object which selects Pods to which to apply policy, and includes some simple policy statements like "allow-from"

erictune commented 8 years ago

@thockin which issue does one go to in order to learn more and comment?

thockin commented 8 years ago

It's being discussed on the network SIG mailing list as we haggle over a multitude of ideas and whittle them down to a few viable ones.

Start here:

https://docs.google.com/document/d/1_w77-zG_Xj0zYvEMfQZTQ-wPP4kXkpGD8smVtW_qqWM/edit

One proposal:

https://docs.google.com/document/d/1_w77-zG_Xj0zYvEMfQZTQ-wPP4kXkpGD8smVtW_qqWM/edit

Another is in email:

https://groups.google.com/forum/#!topic/kubernetes-sig-network/Zcxl0lfGYLY

erictune commented 8 years ago

Talked to @thockin and @bprashanth. It sounds like the Ingress resource may undergo some refactoring this quarter, possibly splitting into two objects. We should revisit ingress security when those discussions happen.

wstrange commented 8 years ago

This would be a nice feature to have. For example, if you want a pseudo multi-tenant solution, with each tenant running in a separate namespace, the ingress could do hostname-based routing to the right backend namespace: ${tenant}.example.com -> service "foo" in namespace ${tenant}.

I suppose one can do this today on GKE, but I gather you end up with one HTTP load balancer per namespace - which could get quite expensive and seems unnecessary.

jimmycuadra commented 8 years ago

This limitation throws a big wrench in how my company was planning to use ingresses. Our use case is running multiple copies of the same application stack at different versions, and to keep the stacks isolated from each other, we use namespaces. We'd planned to run a single ingress controller that knows how to determine which application, and which version of it, to route to based on the subdomain of the incoming request.

The reasons for using namespaces to isolate these stacks are:

  1. To be extra safe about not having applications interfere with each other based on what else happens to be running in the cluster or from similarly named services.
  2. To get around name collisions for services. It's not possible to have two services in the same namespace with the same name, so application dependencies like "redis" or "mysql" need to be in different namespaces to use those simple names without faking a namespace by changing the name of the service.

See my unanswered Stack Overflow question, Kubernetes services for different application tracks, for more details on our use case.

Our ingress controller is exposed to the outside world via NodePort (80 and 443), and we have an ELB in AWS pointing at the whole cluster. With the namespace restriction for ingresses, we'd need one ingress controller per namespace and there would be no way to have a single ELB forwarding ports 80 and 443 to the cluster.

paralin commented 8 years ago

@jimmycuadra You should use the approach that our group is using for Ingresses.

Think of an Ingress not so much as a load balancer, but just as a document specifying some mappings between URLs and services within the same namespace.

An example, from a real document we use:

  apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    name: ingress
    namespace: dev-1
  spec:
    rules:
    - host: api-gateway-dev-1.faceit.com
      http:
        paths:
        - backend:
            serviceName: api-gateway
            servicePort: 80
          path: /
    - host: api-shop-dev-1.faceit.com
      http:
        paths:
        - backend:
            serviceName: api-shop
            servicePort: 80
          path: /
    - host: api-search-dev-1.faceit.com
      http:
        paths:
        - backend:
            serviceName: api-search
            servicePort: 8080
          path: /
    tls:
    - hosts:
      - api-gateway-dev-1.faceit.com
      - api-search-dev-1.faceit.com
      - api-shop-dev-1.faceit.com
      secretName: faceitssl

We make one of these for each of our namespaces for each track.

Then, we have a single namespace with an Ingress Controller which runs automatically configured NGINX pods. Another AWS load balancer points to these pods, which are exposed on a NodePort; a DaemonSet ensures exactly one runs on every node in our cluster.

As such, the traffic is then routed:

Internet -> AWS ELB -> NGINX (on node) -> Pod
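
For reference, a rough sketch of that layout, assuming made-up names, labels, namespace, and image version (not our exact manifests): a DaemonSet running the nginx ingress controller, plus a NodePort Service for the ELB to target on every node.

  apiVersion: extensions/v1beta1
  kind: DaemonSet
  metadata:
    name: nginx-ingress-controller
    namespace: ingress                 # hypothetical dedicated namespace
  spec:
    template:
      metadata:
        labels:
          app: nginx-ingress
      spec:
        containers:
        - name: nginx-ingress-controller
          image: gcr.io/google_containers/nginx-ingress-controller:0.8.3   # hypothetical version
          args:
          - /nginx-ingress-controller
          - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
          env:
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          ports:
          - containerPort: 80
          - containerPort: 443
  ---
  # NodePort Service the AWS ELB forwards ports 80/443 to on every node
  apiVersion: v1
  kind: Service
  metadata:
    name: nginx-ingress
    namespace: ingress
  spec:
    type: NodePort
    selector:
      app: nginx-ingress
    ports:
    - name: http
      port: 80
      nodePort: 30080                  # hypothetical port
    - name: https
      port: 443
      nodePort: 30443                  # hypothetical port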

We keep the isolation between namespaces while using Ingresses as they were intended. It's not correct or even sensible to use one ingress to hit multiple namespaces; it just doesn't make sense, given how they are designed. The solution is to use one ingress per namespace, with a cluster-scoped ingress controller which actually does the routing.

All an Ingress is to Kubernetes is an object with some data on it. It's up to the Ingress Controller to do the routing.

See the document here for more info on Ingress Controllers.

With this post I will close this issue because I think it's actually a non-issue - Ingresses work fine even for cross-namespace routing.

Eventually the ingress object might be refactored / split. That would be a redesign of this concept. But as of now, this is how Ingresses are designed and meant to be used, so it only makes sense to use them the "right" way :)

paralin commented 8 years ago

I might open up another issue about properly documenting this with examples. There seems to be some confusion around Ingresses. They're really powerful when applied correctly.

jimmycuadra commented 8 years ago

I think your @-mention was the wrong person. :P

I understand the difference between ingress resources and the ingress controller, but I'm not sure I understand why my use case is not appropriate here, or how the setup you describe does the same thing we're trying to do.

In our imagined setup, the ingress controller and all ingress resources exist in one namespace. However, we create ingress resources that map host names to services in more than one namespace. This way we can access apps from outside the cluster as "my-app.app-version.example.com" and the request follows the path ELB --> random node --> ingress controller --> my-app in the my-app-my-version namespace.

paralin commented 8 years ago

Yeah whoops. I just assumed the first jimmy in the @ list was you.

There's absolutely no reason to have your ingress resources in one namespace, as far as I can tell. What's keeping you from putting the routes for your app into a single Ingress object and then replicating this object for each namespace you want to run? It just makes sense from a symmetry perspective - if every namespace has the same set of objects, including the ingress object, then everything will work properly...

jimmycuadra commented 8 years ago

I don't think duplication of ingress objects would be a big deal, but needing an ingress controller for each namespace would be a problem, as we want to have a single entrypoint for requests going into the cluster. If each namespace had its own ingress controller, we'd need a separate NodePort allocation for each controller, and an ELB with its own DNS records for each namespace, which is not something we want.

paralin commented 8 years ago

@jimmycuadra Read through my comment again. I said you would have a single ingress controller for the entire cluster, exactly as you want...

jimmycuadra commented 8 years ago

It does look like I misunderstood what this issue was about! It sounds like ingress controllers do work across namespaces, but ingress resources cannot see outside of their own namespaces, which was the subject of this issue. Apologies for the confusion and thanks for your responses!

krancour commented 8 years ago

@jimmycuadra I had the same confusion up to a point.

gramic commented 7 years ago

I have read this discussion many times and I still do not understand the recommended way to achieve the desired goal of serving multiple subdomains/domains from different namespaces.

My first workaround is to use a single namespace with an Ingress for everything that should be exposed via domain name to the world. The second is to not use Ingress at all, but a simple Nginx as a proxy to my apps in different namespaces.

Isn't the goal of Ingress to simplify this scenario? There are mentions of security implications when crossing namespaces; however, there is no simple explanation of them.

@paralin Could you please share more details about those pods that reside with the Ingress Controller?

robhaswell commented 7 years ago

I think I need to be able to reference services in different namespaces from a single ingress controller. My use-case is doing blue/green deployments, with a "blue" namespace and a "green" namespace. My application's public traffic comes via an ingress controller. I would like to be able to route that traffic to blue or green services via making a small change to the ingress controller (i.e. the service namespace).

paralin commented 7 years ago

You're both still missing the point: you can use a single ingress controller for as many namespaces full of ingresses as you want.

gramic commented 7 years ago

@paralin Yes, I did understand that this is possible. What I am missing is how you do that?

Does the controller receive events for ingress resources in every namespace, no matter which namespace the controller resides in?

krancour commented 7 years ago

you can use a single ingress controller for as many namespaces full of ingresses you want

And I was not understanding that when I weighed in on this way back when. Given my current understanding, I actually have come around to believing there's no issue or truly damning limitation here.

robhaswell commented 7 years ago

@paralin I believe I am the same as @gramic in that I do not understand how to achieve my goal. Please let me explain. I am trying to implement the blue/green deployment pattern using "blue" and "green" as my namespaces - each namespace hosts my application stack at different versions. My desire is to have a single Ingress resource routing traffic for https://myapp/ to, e.g. the blue namespace, and then at the point of release, make a change to that Ingress so that now traffic is being routed to my green namespace (without the public IP of that Ingress changing).

Unfortunately I believe all the solutions mentioned above require interaction with some entity outside of Kubernetes, e.g. an external load balancer, which is not desirable. If it helps, I'm on Google Container Engine.

I hope now we are on the same page? My problem is that I believe that without a service namespace selector, I can't achieve what I want to achieve.

paralin commented 7 years ago

@robhaswell Nope, still doable without a namespace selector. Remember that an ingress controller does not change IP between different ingress resources. It just combines them together and serves them.

Try running the nginx controller for example, and setting up what you want. You can either use a different URL in one namespace, and then change it to the main URL when you want to enable that namespace (kubectl edit it) or you can write a little bash script that deletes the ingress object in one namespace and immediately recreates it in the other.

The main thing I think you're missing is that the ingress objects are just data. Deleting them doesn't actually cause kubernetes to delete any resources. The ingress controller however is watching for changes to these resources and uses them to update its configuration. For the nginx controller this means updating the nginx.conf which does not change anything about how you chose to route traffic to the nginx pods. The IP remains the same.
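
To illustrate the second option, a hedged sketch (host, names, and namespaces are made up): deleting the object below from the "blue" namespace and immediately recreating it in "green" only rewrites the controller's nginx.conf; the IP you route traffic to never changes.

  apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    name: myapp                     # hypothetical name
    namespace: green                # was "blue" before the swap
  spec:
    rules:
    - host: myapp.example.com       # hypothetical production host
      http:
        paths:
        - backend:
            serviceName: frontend   # hypothetical service, present in both namespaces
            servicePort: 80
          path: /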

paralin commented 7 years ago

@gramic Yes. It depends on their implementation, but most of the controllers monitor all the namespaces by default (which is changeable).

robhaswell commented 7 years ago

@paralin thanks for your help, however I believe that it is not possible to change the namespace of an ingress object using kubectl edit:

$ kubectl edit ingress/tick-tock
A copy of your changes has been stored to "/var/folders/jv/_p33nwxx0jd8b7gr2qgx0mrc0000gr/T/kubectl-edit-fplln.yaml"
error: the namespace from the provided object "snowflakes" does not match the namespace "default". You must pass '--namespace=snowflakes' to perform this operation.

In this operation I attempted to change the namespace of the Ingress resource to snowflakes, where it was previously default.

Additionally if I delete this (my only) Ingress resource, I lose the IP address which Google Cloud load balancing has provisioned for me. Perhaps this behaviour would change if I had another Ingress resource.

stephanlindauer commented 7 years ago

One thing I don't seem to be able to understand is: why insist on ingresses not being able to use services from other namespaces, when there aren't any security measures in place (as far as I know) to prevent any pod from just using the <service-name>.<another-namespace-name>.svc.cluster.local FQDN to get data from another namespace? Or is the fact that this is possible a security flaw that will be fixed in future versions?

Update: this is what I mean: https://github.com/kubernetes/kubernetes/issues/17088#issuecomment-157187876

wstrange commented 7 years ago

@stephanlindauer I believe the intent is that cross namespace service access will not be allowed in the future.

stephanlindauer commented 7 years ago

But then the question still stands: why not have a certain kind of flag, or even a dedicated kind, to allow cross-namespace access to certain pods, or let a service definition declare which namespaces it can be accessed from? Without this there would be an n:n (namespace:ingress-host) relation between namespaces and the hosts that expose them, while it would be nicer to have an n:1 relation to cut down on costs for additional public-facing static-IP hosts.

paralin commented 7 years ago

@robhaswell I'm not familiar with how the Google Cloud router works with Ingress right now.

However, if you use the nginx ingress controller and point a LoadBalancer at the ingress controller, you will get what you want - a single stable IP address: a TCP load balancer created by the Service, pointing to a horizontally scalable nginx layer. Moving Ingress objects between namespaces just updates the nginx configuration and does not release any IP addresses. It'll also give you more direct control over how the router works, and might save money on Google Cloud resources.

The Google Cloud implementation SHOULD keep a single IP address and make HTTP rules in that balancer to point to however many Ingress objects exist in the cluster - that might not be the case right now.
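
A minimal sketch of that LoadBalancer suggestion, assuming the controller pods carry a hypothetical app: nginx-ingress label (as in the DaemonSet sketch earlier in this thread):

  apiVersion: v1
  kind: Service
  metadata:
    name: nginx-ingress-lb
    namespace: ingress              # hypothetical namespace holding the controller
  spec:
    type: LoadBalancer              # the cloud provider allocates one stable external IP
    selector:
      app: nginx-ingress
    ports:
    - name: http
      port: 80
    - name: https
      port: 443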

robhaswell commented 7 years ago

@paralin thanks for the suggestion, I haven't tested it, however I am leaning towards a similar solution for my specific requirement by using a LoadBalancer as you suggest to point to my own nginx-based router's service. This approach uncovered a bug in kube-proxy, see #40870.

I'm afraid that I have currently put this issue down while I am working on other priorities, but I will be able to return to this problem and conduct some conclusive tests of this suggestion and other suggestions in this thread. I apologise that I don't have time to do this right now. I also think that, so far, efforts in achieving my goal should be documented in a new thread entitled "Support blue/green deployments", as this issue is very clearly "support cross-namespace ingress" and the answer seems to be a firm "no, we won't".

robhaswell commented 7 years ago

You might argue that you want a single loadbalancer for the entire cluster, to which I ask, what is in the namespaces? i.e why not use 1 namespace if you want to allow sharing.

@bprashanth in my instance, my frontend app connects to its component services using DNS as service discovery. They connect to a static name, e.g. postgres or redis. By deploying different versions in different namespaces, we get 100% satisfactory service discovery. We are unwilling to implement a different method of service discovery for the following reasons:

  1. More complexity brings more opportunity for failure.
  2. More per-deployment environmental configurations brings more opportunity for failure through less deployment parity.
  3. Our local dev setup is using pure Docker networking which is compatible with this DNS-based approach (which is a design feature of kube-dns, I believe).

We are very satisfied with this approach in every way, with the exception that cross-namespace load balancing (switching) is impossible with pure-k8s at the moment.

robhaswell commented 7 years ago

FWIW you can set up a Service in namespace X with no selector and a manual Endpoints that just lists another Service's IP. It's yet another bounce, but it seems to work. :)

@thockin at some point between you making this comment and today, this functionality regressed. See #40870. I would appreciate it if you could accelerate @taimir's request for a second opinion on that thread please!

paralin commented 7 years ago

If you need a huge design change in something like Kubernetes to do a relatively common deployment pattern, you're doing it wrong.

I rest my case, but urge you to rethink your approach. Reread the docs on how DNS and pod IPs work, particularly around multiple / cross-namespace communication. Namespaces are usually used in a one-namespace-per-deployment fashion - one for prod, one for dev, etc. - although the isolation they provide can be used for other things too.

robhaswell commented 7 years ago

@paralin please re-read my response to your suggestions. You suggested using a LoadBalancer service to direct traffic to the correct endpoint. I have tried this and this functionality does not work as described in the documentation. I tried your suggestion of multiple Ingress resources, and that didn't work. I also took inspiration from this, and tried having multiple Ingress resources in different namespaces, and changing which HTTP host they were configured for. This also didn't work - traffic was never re-routed, and I could confirm that the underlying Google Cloud routing rules had not been updated. I apologise for not reporting this, however at that point I had already given up on using Ingress for my solution - I understand and accept that this is not what it's for. I have merely been replying to your suggestions. That is why I intend to open a new issue, as I have not yet seen any workable solutions that function as advertised.

I'm not sure why you have now taken this hostile attitude, but I am disappointed that you seem to think I am stubbornly ignoring your help. Trust me, I am VERY eager to solve this problem. Thank you for continuing to help me, but please could I ask that if you think you have suggested workable solution, could you re-iterate it so there is less confusion between what you think works, and what I think works?

paralin commented 7 years ago

You've still not read and understood my suggestion, which means at this point you're probably skipping reading anything I'm posting at all. Remember that I'm the one that made this issue in the first place, and yes, I was in the exact same position with the exact same deployment model as you.

I suggested using an Nginx Ingress Controller from the official Kubernetes ingress controller implementations, and directing a load balancer at that ingress controller to get a stable IP. From what you've said above it seems you misunderstood.

This is exactly what ingress is for, so don't give up yet.

robhaswell commented 7 years ago

OK, I think I understand your suggestion, but let me play that back to you:

All good so far.

Now we're serving the newest release with a small loss of service.

If this is what you're suggesting, then I'm not really inclined to proceed with Ingress, and would rather pursue the service-to-service approach as described in the documentation and suggested by thockin. However, I can see that it would just about fit our use-case, so thank you.

paralin commented 7 years ago

Ingress is designed to do http level routing of traffic while services do TCP level routing. For your use case I would use a subdomain to route to the dev environment for sure.

Usually people don't switch entire namespaces to do deployments, so kubernetes isn't designed towards that model as heavily. However, I can understand why you're doing it, and I definitely think you'll do fine with this approach.

You shouldn't see a service interruption by the way if you do it with a script that swaps them around immediately. The nginx config should be updated immediately and nginx doesn't have to restart to apply the change and start rerouting traffic.