Kong / kubernetes-ingress-controller

:gorilla: Kong for Kubernetes: The official Ingress Controller for Kubernetes.
https://docs.konghq.com/kubernetes-ingress-controller/
Apache License 2.0

Common cache should only cache resources within watch-namespace #2409

Closed mkgrei closed 2 years ago

mkgrei commented 2 years ago

The watch-namespace option specifies which namespaces KIC should watch.

https://github.com/Kong/kubernetes-ingress-controller/blob/ab7a79428d927de82e25f2a9b1f778f17ebd957a/internal/manager/config.go#L150-L152

However, due to the following implementation, the namespace of the publish-service is implicitly added to watch-namespace. This change was made to implement the Gateway API. https://github.com/Kong/kubernetes-ingress-controller/pull/1998

https://github.com/Kong/kubernetes-ingress-controller/blob/ab7a79428d927de82e25f2a9b1f778f17ebd957a/internal/manager/setup.go#L63

https://github.com/Kong/kubernetes-ingress-controller/blob/ab7a79428d927de82e25f2a9b1f778f17ebd957a/internal/manager/setup.go#L98

https://github.com/Kong/kubernetes-ingress-controller/blob/ab7a79428d927de82e25f2a9b1f778f17ebd957a/internal/manager/config.go#L155-L156

This behavior is unexpected, as KIC then needs many extra permissions to access resources in the publish-service namespace as well.

In particular, the permission to access Secret resources is too broad. Since kong-proxy needs to serve TLS configurations, the role for kong-proxy needs permission to access Secret resources across namespaces. Furthermore, since the publish-service must be in the same namespace as kong-proxy, the implementation above implies that the KIC service account also automatically gains permission to the Secret resources that kong-proxy can access. This is undesirable in a multi-tenancy environment: with the service account token of KIC, the service account token of kong-proxy can be obtained, and with the service account token of kong-proxy, a wide range of Secret resources can be read.

In conclusion, the namespace of the publish-service should not be included in watch-namespace. To implement the Gateway API, only permission to access the Service resource in the publish-service namespace is needed. The current implementation, however, also requires access to Ingress/Service/Endpoints/Secret/UDPIngress/TCPIngress/KongIngress/KongPlugin/KongConsumer in that namespace, which is insecure. The gateway controller should use its own cache to satisfy the principle of least privilege.

shaneutt commented 2 years ago

Thanks for the report @mkgrei.

Just wanting to confirm: in your multi-tenant environment with multiple deployments of the KIC are you able to mitigate this problem by using different namespaces for the different ingress controller deployments?

Also, as far as impact goes, I expect the impact you're anticipating here is that if someone were to wrest control of the controller/gateway pods they would be able to read resources (Secrets, namely) from other deployments. Are there any other impacts you foresee here that you want to enumerate?

mkgrei commented 2 years ago

Thank you for your response @shaneutt .

We are using "workspaces" in Kong Gateway to realize multi-tenancy (https://docs.konghq.com/gateway/2.8.x/configure/auth/kong-manager/workspaces/#main). Different users use different (Kong) workspaces and different KICs in different (k8s) namespaces, while sharing the same kong-proxy.

The problem above cannot be mitigated by using different k8s namespaces, but it can be mitigated by using the publish-status-address option. https://github.com/Kong/kubernetes-ingress-controller/blob/ab7a79428d927de82e25f2a9b1f778f17ebd957a/internal/manager/config.go#L157-L159

As we have been using KIC since 1.x, the behavioral change in the publish-service option is confusing. In a multi-cluster environment especially, publish-service can be configured identically across clusters by provisioning the same namespace and service, while a service IP address cannot. (Not to mention that publish-service is needed in order to use the Gateway API.)
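The workaround described above can be sketched as a container-args fragment for the KIC Deployment (a hedged illustration; the namespace, service name, and address below are placeholders, not values from this thread):

```yaml
# Illustrative KIC container args for the workaround: instead of
# --publish-service (which pulls the proxy's namespace into the cache),
# advertise a fixed address with --publish-status-address.
args:
  - --watch-namespace=tenant-a
  # - --publish-service=kong/kong-proxy   # would implicitly watch "kong" too
  - --publish-status-address=203.0.113.10
```

The trade-off noted above still applies: the address must be known per cluster, whereas a publish-service reference can be identical across clusters.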

Basically, we leave multi-tenant users to deploy KIC by themselves, because KIC needs the user's Kong account token to use their workspace. Thus, with the current implementation and the use of publish-service, users can gain extra permission to access Secrets across the cluster, as most ingress proxy products are permitted to. The permission to view Ingress/Service etc. is not problematic; only Secrets are our concern.

shaneutt commented 2 years ago

So I may have gotten confused and I want to confirm: are you experiencing this problem on KIC v1? At first I thought you were on v2 because you mentioned Gateway API support. I also see that the original issue report doesn't use the issue template; could you please add the following information:

It sounds like "Steps to reproduce" are basically just "deploy multiple instances of the KIC, they will all have excessive secret permissions", but if you have any additional details or steps to add here please do.

mkgrei commented 2 years ago

I created the issue from the code view, so I missed the issue template. Sorry about that.

We are using,

> It sounds like "Steps to reproduce" are basically just "deploy multiple instances of the KIC, they will all have excessive secret permissions", but if you have any additional details or steps to add here please do.

Yes, that is the case.

Also, I might have found another problem. I have not tested this yet; it is just from reading the code. If there are Ingress resources in the kong-proxy k8s namespace, all KIC instances will try to create corresponding Kong Routes, so there will be hostname collisions. In my understanding, Kong cannot create Routes with the same hostname in different workspaces. This problem can be avoided by not creating Ingresses in the kong-proxy namespace, but it is still a limitation.

shaneutt commented 2 years ago

Collisions of this sort are somewhat expected when using the same namespace in a multi-tenant setup like this. For v2.x.x+ we currently advise deploying each instance/tenant of the ingress controller in its own namespace, and using --watch-namespaces to define the boundaries of the namespaces each individual tenant is responsible for. If I'm not mistaken, making such a change would alleviate the problems you originally reported? Note that we would still consider future changes to support a restricted multi-tenant deployment using a single namespace, but as I currently don't have a sense of how long that would take to get released, I would like to provide an interim solution. Let me know your thoughts.
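The interim layout suggested above can be sketched as per-tenant controller Deployments (a hedged illustration; the namespaces and flag values are placeholders, and the flag spelling follows this thread's usage):

```yaml
# Illustrative sketch: one KIC Deployment per tenant, each in its own
# namespace, each scoped so tenants cannot see each other's resources.
#
# Controller for tenant A (Deployment in namespace tenant-a-system):
args:
  - --watch-namespace=tenant-a
#
# Controller for tenant B (Deployment in namespace tenant-b-system):
# args:
#   - --watch-namespace=tenant-b
```

Each controller's RBAC (Roles/RoleBindings) would then be granted only in its own namespace plus the tenant namespace it watches, which is the boundary being discussed.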

mkgrei commented 2 years ago

Yes, I agree. The watch-namespace option should set the boundary of a tenant's responsibility.

The issue I was raising is that publish-service implicitly expands watch-namespace when kong-proxy does not belong to the watched namespaces.

We will use publish-status-address instead of publish-service to mitigate this problem for now.

As publish-service is expected to be set when using Gateway API resources, I hope the problem will be fixed before that release.

shaneutt commented 2 years ago

Understood, glad you have a workaround for the moment. I'm going to queue this for review by maintainers in one of our upcoming (internal) grooming sessions, and we'll look at getting this into a sprint for an upcoming release.

rainest commented 2 years ago

Can you clarify the issue/which Secrets you would not want to grant KIC access to?

There is no separate kong-proxy ServiceAccount, only the single KIC ServiceAccount (it's typically accessible to both since it's bound at the Pod level rather than per-container, but only KIC uses it).

The proxy does not talk to the Kubernetes API at all. KIC is responsible for reading Kubernetes resources and translating them into proxy configuration on the proxy's behalf: it reads the TLS Secrets referenced by Ingress/HTTPRoute/etc. resources, not the proxy.

You may have a ServiceAccount for the Deployment that runs the proxy because we create one by default (assuming the more common configuration where there's a single proxy and controller instance in the same namespace), but if you run only the proxy in that namespace and KIC instances in other namespaces, you can remove the ServiceAccount and associated Roles attached to the proxy deployment.

Normally the only Secrets I'd expect to live in the proxy Deployment's namespace are for the default proxy certificates (the certificates served when there's no matching SNI/certificate combination), the image pull/license Secrets, and the certificate Secrets for the Manager/Portal/admin API Ingresses if in use. Access to the proxy Service namespace would grant access to those only, not other Secrets used by the proxy--the proxy does not pull those from the API, the various controller instances push them to the proxy.
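If KIC access to the proxy namespace is unavoidable, the exposure described above could be narrowed with a namespaced Role restricted to the specific Secrets. This is a hedged sketch, not from the thread: the namespace and Secret names are illustrative, and note that `resourceNames` restrictions only apply cleanly to `get`, not `list`/`watch`.

```yaml
# Illustrative Role limiting KIC's Secret access in the proxy namespace to
# the handful of Secrets expected to live there (names are placeholders).
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kic-proxy-ns-secrets
  namespace: kong   # assumed proxy / publish-service namespace
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]   # resourceNames cannot restrict list/watch
    resourceNames: ["kong-default-cert", "kong-enterprise-license"]
```

Whether KIC functions with `get`-only Secret access in that namespace is an assumption to verify against the controller's actual cache requirements.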

I wouldn't expect the default certificates to be in use for most multi-tenant environments (on the assumption that they wouldn't match any of the tenants' domains), though you could bake them into the image rather than mounting from a Secret if needed. For the rest, hybrid mode would allow you to isolate the other Secrets in a namespace (where the control plane instances reside) other than the publish Service namespace (where the data plane instances reside), though you'd still need to bake the cluster certificate into the dataplane images.

mkgrei commented 2 years ago

Thank you for your explanation, @rainest.

I see that there were two misunderstandings on my side. First, I thought kong-proxy (the data plane) was the component that needed to list Secret resources, which is not the case. Second, what I thought was the ServiceAccount for kong-proxy (data plane) was actually the ServiceAccount for Kong (control plane).

So the permission to list Secrets is not necessary for the kong-proxy (data plane) ServiceAccount. As for Kong (control plane), we are using hybrid mode at the moment, and have deployed Kong (control plane) and kong-proxy (data plane) in the same namespace.

We will consider separating the namespaces for the control plane and data plane. I would still like the publish-service namespace to be excluded from watch-namespace.

rainest commented 2 years ago

From my end, there are two considerations for changing that current implementation:

  1. It's probably cumbersome to rework the controllers to only watch Services in the publish namespace. The watched namespace set is a manager-level setting (we currently only instantiate one manager, and I don't think it's easy to make multiple managers talk to one another), and I'm not sure how well the controllers work without homogeneous permissions. I don't expect there's a one-size-fits-all approach in the code for handling both configurations that want to watch the publish namespace (probably most single-tenant installs) and those that do not.
  2. We intend to rework the way multi-tenancy works, with a view towards supporting it within a single controller instance that manages multiple data plane groups, with each group only receiving a subset of the visible configuration (e.g. the controller would read configuration for all tenants and then only deliver tenantA's configuration to tenantA's data planes, etc.).

Centrally managing the controller should obsolete this issue, though I'd note from there that the change is more one of convenience. There shouldn't be a reason tenants need to manage their controller instances currently: there's nothing I would expect in the controller-level settings that they'd need to manage. A provider-managed controller can still ingest tenant-created Ingresses and such. Do you know of a reason that requires direct user management of the controller Deployment, and if not, does managing those centrally address the security concerns?

mkgrei commented 2 years ago

As the WebUI is not capable of creating an RBAC user for KIC to use, an Admin user is needed to provision KIC; at least that was the workaround from the Kong Gateway support team. This is a problem because an RBAC user belongs to one workspace, while KIC needs to access some resources (tags etc.) in the default workspace. It is possible in theory to bind permissions from another workspace to an RBAC user; however, the WebUI is implemented in a way that disallows it. I will proceed assuming the Admin user's access token is used for KIC.

An Admin user needs an email address and has its own access token. This means that in our current configuration, the access token of an Admin user is needed in order to deploy KIC. An Admin user is created per tenant, and the tenant user uses this Admin user to access the WebUI of Kong Gateway. In this situation, centrally managing the controller would require tenant users to share their access tokens with the provider (us). As a provider, we also want least privilege when touching our tenants' systems; it is not desirable for us to know the users' access tokens.

I understand that with the current implementation, taking the publish namespace out of watch-namespace needs some rework.

rainest commented 2 years ago

Do you have the ticket number in question? That's not correct regarding our RBAC requirements: KIC only needs access to the workspace it's configured to use, so long as that workspace already exists. IIRC there were some bugs in the past related to improper handling of workspaced requests that incorrectly sent some requests to default, but I'm not aware of any at present. Single-workspace RBAC users are fine for it.

You can create an admin invite for your users to use the GUI, where only they would know the token, and a separate RBAC user for the controller that you manage. You already have de facto access to their system by virtue of managing, and having access to, the central database where it resides, as well as having the super admin credentials you use to create new users. Guarding that is a matter of internal access controls, and you'd apply the same controls to the individual instance RBAC users as you apply to the higher-level credentials that grant access to the entire system.

mkgrei commented 2 years ago

I see. I confirmed that an RBAC account can be used for KIC (2.2.1).

Thank you so much! It solved many problems.

shaneutt commented 2 years ago

It seems as though this is resolved, let us know if this is not the case and we will re-open it. :+1: