kubernetes-sigs / gateway-api

Repository for the next iteration of composite service (e.g. Ingress) and load balancing APIs.
https://gateway-api.sigs.k8s.io
Apache License 2.0

GEP: Gateway/HTTPRoute level authentication #1494

Open justinsb opened 1 year ago

justinsb commented 1 year ago

What would you like to be added:

I would like to be able to enforce authentication & (limited) authorization at the Gateway or HTTPRoute level, so that I can safely route traffic to apps.

Why this is needed:

I'd like to be able to launch Kubernetes apps, set up an HTTPRoute for them, and enforce at the gateway level that the user must be logged in (with some restrictions, e.g. *@example.com). Apps can still perform authentication themselves, but simple read-only apps might not need to. Broadly, I'm trying to follow the approach in the BeyondCorp paper. Doing this greatly improves the security surface of the system (at least when there's less concern over attacks from authenticated users).
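To make the "logged in, with some restrictions" requirement concrete, here is a minimal sketch of the kind of check a gateway could run once it knows the authenticated user's email. The function name `is_allowed` and the pattern list are hypothetical and not part of any Gateway API resource; they only illustrate the `*@example.com` style of restriction mentioned above.

```python
from fnmatch import fnmatchcase
from typing import Optional, Sequence


def is_allowed(email: Optional[str], allowed_patterns: Sequence[str]) -> bool:
    """Return True if an authenticated email matches any allowed pattern.

    `email` is None when the user is not logged in (hypothetical convention).
    """
    if email is None:
        return False
    normalized = email.strip().lower()
    # Require exactly one "@" so odd values such as "a@evil.org@example.com"
    # cannot slip through a "*@example.com" pattern.
    if normalized.count("@") != 1:
        return False
    # fnmatchcase matches the whole string, so a suffix like
    # "@example.com.evil.org" does not match "*@example.com".
    return any(fnmatchcase(normalized, p.lower()) for p in allowed_patterns)
```

For example, `is_allowed("alice@example.com", ["*@example.com"])` is accepted, while an unauthenticated request (`None`) is rejected.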

Possible mechanisms (mostly to clarify the problem):

Ideally this would be at the Gateway level, but I can understand that this might need to be duplicated at the HTTPRoute level if it's considered an HTTP concern.

(I'm new to the project, and not really sure whether this is the right place to raise this. It's not clear whether I should instead - somehow - choose an implementation and then pursue it there, and then the gateway API standardizes features once enough implementations have support. If there's a particular implementation that is a better path for me here, please LMK!)

shaneutt commented 1 year ago

Would you please link the paper that you're referring to? Thank you.

tokers commented 1 year ago

Gateway-scoped authentication support makes sense; otherwise users have to add authentication to every route if they want global API protection.

justinsb commented 1 year ago

@shaneutt BeyondCorp has its own mini-site here https://cloud.google.com/beyondcorp and generally describes an approach where an organization puts services that would otherwise be on a private network / corporate VPN on the public internet, but with a gateway enforcing authentication (and more). I say "and more" because the authentication can include things like machine certificates. All the papers are good, but https://storage.googleapis.com/pub-tools-public-publication-data/pdf/45728.pdf is probably the most directly relevant here.

In general though, I'm just trying to find a mechanism in the API to enforce auth; for example, ingress-nginx had auth-url and auth-signin annotations: https://kubernetes.github.io/ingress-nginx/examples/auth/oauth-external-auth/ .

(It is possible I'm just missing it in the API - I'm not sure if you're saying it does exist at the HTTP route level today @tokers ?)

youngnick commented 1 year ago

I think that we've always intended to support some method of configuring auth, but haven't got around to specifying it yet.

But when I look, we've also never logged an issue to track that. Thanks @justinsb, I guess this is now the tracking issue. 😄

This one definitely requires a GEP specifying how it will work.

The strawman idea I've had (once I finish fixing up Policy Attachment to allow it as part of moving it to Standard in #713) is that Authentication settings should be specified with an HTTPFilter that can be defaulted or overridden using an AuthenticationPolicy. That will allow you to:

Which I think should cover most of the use cases I can think of.

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

maleck13 commented 1 year ago

Not sure this is the right issue to add this, happy to be redirected or open something separate if needed. I wanted to chime in on the subject of Auth and policy attachment. Would love to learn more about any potential improvement to policy attachment.

One of the values I see with policy attachment for this kind of thing (auth), is it allows 3rd parties to implement the solution as a policy and provide the underlying integration and wiring with the gateway providers of their choice behind the scenes. Filters as I understand them would be provider specific?

We have a project that is defining APIs that use some of the concepts in policy attachment: https://github.com/Kuadrant/kuadrant-operator

AuthPolicy: https://github.com/Kuadrant/kuadrant-operator/blob/main/config/samples/kuadrant_v1beta1_authpolicy.yaml
RateLimitPolicy: https://github.com/Kuadrant/kuadrant-operator/blob/main/config/samples/kuadrant_v1beta1_ratelimitpolicy.yaml

These don't fully implement policy attachment yet. With complex policies we are struggling and wrestling with how to implement overrides and defaults, but we see the power in allowing a GW admin to define a sane default and have that overridden lower down the hierarchy where there is more knowledge. I hadn't considered an option where a policy would override a filter as you describe it, @youngnick. I would definitely be interested in learning more about that, and whether there has been any consideration given to allowing filters provided by non-gateway providers.

To piggyback on this comment/issue and the area of policy attachment: one additional difficulty we find with policy attachment, especially when targeting an HTTPRoute, is the desire to target a particular path or method with the policy. I would love to know your thoughts on how that should be achieved, @youngnick. Would we expect multiple HTTPRoutes, each with their own policy, or a policy to define a mechanism to select a subsection of the route?

youngnick commented 1 year ago

In general, I personally have been expecting that it would be Gateway controllers reconciling associated Policy objects that change its own data plane. I hadn't considered the idea of using Policy objects as a coordination point between a data plane and other plugins, but I suppose it could be done. The good part is that the Gateway API shouldn't really care about how the Policy is implemented, just that it is.

Having a Policy influence the configuration of a Filter is part of #1565. I see this as allowing a Gateway owner to do things like say "every HTTPRoute attached to this Gateway gets this custom filter with these settings, that is overridable by the HTTPRoute owner" (for the defaults use case).

For targeting some part of a resource with a Policy, we've already got https://gateway-api.sigs.k8s.io/geps/gep-713/#apply-policies-to-sections-of-a-resource-future-extension, which talks about how we could attach a policy only to some subsection of an object. Obviously, that's still future work, but I think something like that is quite likely, see https://github.com/kubernetes-sigs/gateway-api/discussions/1489 for some more discussion (and please feel free to put your comments on there as well).
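As a rough illustration of attaching a policy to only a subsection of an object, the sketch below selects which rules of a route a policy would apply to. It assumes, hypothetically, that route rules carry a `name` and that the policy's target reference has an optional `sectionName`; the linked GEP section describes this only as a possible future extension, so none of these shapes should be read as the actual API.

```python
from typing import Any, Dict, List


def rules_targeted(target_ref: Dict[str, Any], route: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the route rules a policy applies to (hypothetical semantics).

    - No sectionName: the policy applies to every rule of the targeted route.
    - With sectionName: only the rule whose name matches is selected.
    """
    if target_ref.get("kind") != "HTTPRoute" or target_ref.get("name") != route["name"]:
        return []
    section = target_ref.get("sectionName")
    if section is None:
        return list(route["rules"])
    return [rule for rule in route["rules"] if rule.get("name") == section]
```

With a route that has rules named `read` and `write`, a target reference with `sectionName: write` would select only the second rule, while omitting `sectionName` selects both.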

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

youngnick commented 1 year ago

/remove-lifecycle rotten

guicassolato commented 1 year ago

> I hadn't considered the idea of using Policy objects as a coordination point between a data plane and other plugins, but I suppose it could be done. The good part is that the Gateway API shouldn't really care about how the Policy is implemented, just that it is.
> https://github.com/kubernetes-sigs/gateway-api/issues/1494#issuecomment-1439456953

Good to know, @youngnick. I'm a member of Kuadrant with @maleck13 where we've been working on leveraging Policy Attachment for extending gateway functionalities. As we have no plans to provide a Gateway implementation ourselves, we're betting on our own PA metaresources/CRDs. We're concentrating on rate limiting, authn/authz and multi-cluster traffic control (DNS, LB, workload placement). Our control plane works with the underlying implementation (currently Istio, possibly with Envoy Gateway following up) to inject gateway/route configuration that connects to our functional components (external authz, global rate limiting).

> we've already got https://gateway-api.sigs.k8s.io/geps/gep-713/#apply-policies-to-sections-of-a-resource-future-extension, which talks about how we could attach a policy only to some subsection of an object. Obviously, that's still future work
> https://github.com/kubernetes-sigs/gateway-api/issues/1494#issuecomment-1439456953

Thanks for pointing this out, and for the link to https://github.com/kubernetes-sigs/gateway-api/discussions/1489! Having the targetRef's sectionName implemented is something that would really help us out. Until then, we came up with a (hopefully temporary) solution for selecting individual route rules. It's far from perfect, but it will allow us to make some quick progress.

Our next move involves (i) defaults and overrides and (ii) rolling out the same language for auth, where we hope to leverage Inherited/Hierarchical Policy Attachment even more, and which brings me to this issue. We'd love to contribute, @justinsb!

My colleague @alexsnaps is already gathering some thoughts around rate limiting; I'd be happy to do the same for auth. I.e. wrapping up a doc of our experience, if you guys find that can be useful.

I've checked previous references here, but perhaps there's something on auth already that is more up to date where I could also jump in, either to dump some thoughts or to link the doc?

youngnick commented 1 year ago

Thanks for this @guicassolato. There isn't really anything written down, aside from my very early strawman thoughts I wrote above.

The idea there is that you could use a default Policy to add an ExtensionFilter that specifies the auth, and specifies the filter's settings, and individual HTTPRoutes could then opt out by specifying an empty extension, or a null extension or something.

But very early on that one, I definitely encourage starting a doc.
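The default-plus-opt-out idea above can be sketched as a small resolution function. Everything here is an assumption for illustration: the `authFilter` key, the settings fields, and the rule that an empty or null value means "opt out" are hypothetical stand-ins for whatever the eventual GEP defines, and only the defaults case is modelled (not overrides).

```python
from typing import Any, Dict, Optional


def effective_auth_filter(
    gateway_default: Optional[Dict[str, Any]],
    route: Dict[str, Any],
) -> Optional[Dict[str, Any]]:
    """Resolve the auth filter settings that apply to one route.

    Hypothetical semantics:
      - route has no "authFilter" key  -> inherit the Gateway-level default
      - "authFilter" is {} or None     -> explicit opt-out, no auth filter
      - "authFilter" has settings      -> route settings merged over the default
    """
    if "authFilter" not in route:
        return dict(gateway_default) if gateway_default else None
    route_filter = route["authFilter"]
    if not route_filter:
        return None
    merged = dict(gateway_default or {})
    merged.update(route_filter)
    return merged
```

Distinguishing "key absent" from "key present but empty" is what lets a route owner opt out explicitly without the default silently reapplying.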

shaneutt commented 1 year ago

Since this has been active over the past few months and appears to have a couple of interested parties, we're definitely interested in having a community member jump in and iron out the details.

/help

That said this doesn't appear to be something we need to block the GA release for, so it may be something that needs to wait until after GA.

/priority backlog

k8s-ci-robot commented 1 year ago

@shaneutt: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

alrz commented 12 months ago

> Since this has been active over the past few months and appears to have a couple of interested parties

> That said this doesn't appear to be something we need to block the GA release

I found this while looking for a GW/Mesh/Auth solution. To me Mesh is optional but you'd definitely need Auth. I didn't understand why this is not part of the core feature set, considering Mesh was just announced.

In the meantime, what are our options to implement something like this on top of existing concepts?

youngnick commented 12 months ago

Hi @alrz, many dataplanes support doing some sort of auth, so it's not that dataplanes will need to build support, it's just that we need to take the time to figure out the right API design.

In the meantime, I'd recommend checking out the docs for your implementation of choice and seeing if they have a custom way to do this at the moment. Sorry to not be able to provide more here.

EternalDeiwos commented 11 months ago

It is a real shame that #532 was closed. With something like that we could have a baseline for arbitrarily complex authentication and authorization because it can all be handed off to some external service without needing to modify the gateway impl. or the downstream service.

It seems to be increasingly common for downstream services to leave authentication to the operators. Apache Iceberg comes to mind: they provide the functionality for their RESTCatalog (a Java Servlet), but authentication and access control are BYO.

youngnick commented 11 months ago

I think that #532 had some issues, particularly with how broadly implementable it would be. One thing that we've also found as we've worked on both Ingress and Gateway API is that having ways to express arbitrarily complex things can make portability (which is a primary goal of this project) difficult to impossible.

I totally agree that it would be amazing to have a standardized, conformance tested method for authentication included in Gateway API, but I think we have a fair amount of work to do to ensure that we get there in a way that is portable, extensible and expressive. (Cilium's issue - cilium/cilium#23797 - about this is pretty popular, and I would love to have a way to do this in upstream as well!)

I encourage anyone interested in building this support to check out our GEP process and start the discussion process that is a prelude to opening a new GEP.

EternalDeiwos commented 11 months ago

Would passing the responsibility to an external service not enable arbitrarily complex access controls without the need to express them directly within the Gateway API itself?

The API would still need some way to refer to an external service and map the input/output of the authentication request; however, the API already contains similar configuration for the latter from other filters. The former is also discussed in the PR, but apparently without reaching consensus.

I'm not deeply familiar with the inner workings of either project (this and Cilium) but use both. @youngnick could you comment on what else is missing from #532? Or what questions remain unanswered for that approach? Discussion on the PR itself is fragmented and seems to stop quite abruptly.

howardjohn commented 11 months ago

> Would passing the responsibility to an external service not enable arbitrarily complex access controls without the need to express them directly within the Gateway API itself?

Unless the thing you want to express is access controls directly in the gateway 🙂. External auth comes at a huge cost. And even then, the interface between the gateway and external service needs to be defined, as well as the user facing API to configure it.

EternalDeiwos commented 11 months ago

It is a lot harder to make a comprehensive API for expressing access controls in the gateway than it is to make the necessary configuration to delegate to an external service. From what was said before, my understanding is it is unlikely an API for access control would allow for arbitrary complexity and still maintain the necessary portability and extensibility.

That said, there are many valid situations where that kind of arbitrary complexity is necessary for access controls. Even if there were a portable, extensible, and expressive API for access control in the gateway, if that API were insufficient then an external solution would be necessary nonetheless.

What about having two solutions to this? Delegating to an external service is expensive, but it is likely that the necessary interfaces and a portable, extensible, and expressive API can be devised relatively easily. As a single solution I can understand this would be undesirable, but it could later be joined by the ability to handle access control within the gateway as a separate API. Both together could enable comprehensive access control capability.

It will take a lot of work to build an API for access control within the gateway, and then it will likely be something that satisfies the majority of common access control mechanisms but is still insufficient for more complex requirements. Delegating access control to an external service would be easier to build in the meantime and could serve as something good enough while we reach consensus on the other mechanism. Even afterward, it would still have an important place serving those use cases that cannot be handled by the other API.

guicassolato commented 11 months ago

Over the past couple of years, since @jmprusi first proposed #532, we at Kuadrant have also been gaining experience with making the gateway call an external service to enforce external auth and rate limiting, based on reading Gateway API resources.

One of the things we learnt is that triggering the request to an external service on matching specific route rules is not the trickiest part of the problem. There is definitely still room for improvement there, given the APIs that the implementations we've been focused on currently expose internally, and I believe a generic "call an external service and wait for it to flag YES or NO on the traffic flowing through this route/route rule" kind of filter could be useful.

However, a capability that would really make a difference when implementing external auth (at least for us) is setting arbitrary additional metadata, on a per-route-rule basis, to be included in the request that is sent to the external service. Apart from attributes of the specific route/route rule that triggered the request to the external service, we'd be looking for some kind of key-value pairs that could be defined for each of those rules and become part of the external request.

We've looked into header modifiers, but besides being too HTTP-specific, it never felt right to modify a request that may ultimately hit the backend service only for the purpose of passing metadata for an inner request to an external service. IOW, we should not mix context and metadata. It's also more spec for route owners to have to write and unravel.
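To illustrate the distinction between request context and per-rule metadata, here is a minimal sketch of how a gateway might assemble the check request it sends to an external auth service. The `RouteRule` type, the `auth_metadata` field, and the payload layout are all hypothetical; the point is only that the key-value pairs travel alongside the request attributes rather than being injected into the headers of the request that may reach the backend.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping


@dataclass(frozen=True)
class RouteRule:
    """A matched route rule plus its (hypothetical) per-rule auth metadata."""
    route_name: str
    rule_index: int
    auth_metadata: Dict[str, str] = field(default_factory=dict)


def build_check_request(
    method: str,
    path: str,
    headers: Mapping[str, str],
    rule: RouteRule,
) -> Dict[str, Any]:
    """Build the payload for the external auth call.

    The original request headers are copied, never mutated, so the request
    forwarded to the backend is unaffected by the metadata.
    """
    return {
        "request": {"method": method, "path": path, "headers": dict(headers)},
        "route": {"name": rule.route_name, "rule": rule.rule_index},
        "metadata": dict(rule.auth_metadata),
    }
```

A rule defined with `auth_metadata={"tier": "premium"}` would then surface `tier` under `metadata` in the check request, while the request headers stay exactly as the client sent them.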

@jmprusi developed more on the concept and implementation concerns of having a metadata field as part of the filter API at https://github.com/kubernetes-sigs/gateway-api/pull/532#issuecomment-776916334 and https://github.com/kubernetes-sigs/gateway-api/pull/532#discussion_r592530612.

Meanwhile, Kuadrant has been focusing more on Policy Attachment (GEP-713). Here's an example of Kuadrant external auth for Gateway API based on Policy Attachment: https://gist.github.com/guicassolato/7dc98df842a89657050514d31daadaa3

kflynn commented 5 months ago

At KubeCon in Paris, there was discussion of finally addressing this as a goal for Gateway API 1.2. I think this is a great idea, and I'd like to suggest that we do it by supporting the Envoy ext_authz protocol. (This came up with a couple of users at KubeCon, too, but I'm sad to report that I can't remember their names. 😱) And yes, I'll sign up to write the GEP for it. 🙂

Before anyone starts panicking about formalizing support for an Envoy protocol, ext_authz is not an XDS-like protocol. 🙂 It is much simpler to implement and use; any Envoy-based Gateway controller can already use it, of course, but non-Envoy-based implementations should not have trouble using it either.

There are two variants: HTTP and gRPC. The gRPC variant is simplest:

For HTTP variant:

In both cases:

That's basically it. There are years of prior art demonstrating its use: it's very simple while still allowing for real-world auth flows, and I think it's a useful approach here.
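As a rough sketch of the gateway's side of an HTTP-style external auth check, the function below turns the auth service's response into a forward-or-reject decision. It assumes the commonly described convention that a 200 from the auth service allows the request and any other status denies it, and that only an allow-listed set of headers is copied onto the upstream request. The names `Decision`, `apply_auth_response`, and the `x-auth-user` header are hypothetical and not taken from Envoy or Gateway API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Mapping


@dataclass
class Decision:
    allowed: bool
    status: int
    upstream_headers: Dict[str, str] = field(default_factory=dict)


def apply_auth_response(
    request_headers: Mapping[str, str],
    auth_status: int,
    auth_headers: Mapping[str, str],
    copy_to_upstream: Iterable[str] = ("x-auth-user",),
) -> Decision:
    """Decide what to do with a request given the auth service's answer."""
    if auth_status != 200:
        # Assumed convention: anything other than 200 is a denial, and the
        # auth service's status is what the client ends up seeing.
        return Decision(False, auth_status)
    wanted = {name.lower() for name in copy_to_upstream}
    upstream = {name.lower(): value for name, value in request_headers.items()}
    for name, value in auth_headers.items():
        # Only allow-listed headers from the auth service reach the backend.
        if name.lower() in wanted:
            upstream[name.lower()] = value
    return Decision(True, 200, upstream)
```

Keeping the copied headers on an allow-list means the auth service cannot accidentally overwrite unrelated request headers on the way to the backend.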

guicassolato commented 5 months ago

As much as I like policies, having dedicated the last couple years contributing to Authorino (a Kuadrant component), I could not be more in favour of this. Excellent initiative @kflynn!

I'd be glad to chime in should you need an extra pair of hands.

arkodg commented 5 months ago

Some more prior art

haproxy-ingress gloo traefik ambassador nginx envoy contour istio envoy-gateway

Andrei-Predoiu commented 4 months ago

Hi @kflynn, I was one of the people you talked to at KubeCon Paris. We've been using Emissary gateway in production for 5 years or so. I support this idea.

The gateway is the entry point to the cluster. In basically every situation, a gateway (or ingress) will have some auth connected to it. The above proposal allows for a simple and customizable solution that will support basically every centralized auth use case out there. Auth services can respond with redirects to perform OAuth flows. Downstream header modification allows for easy phantom token implementation. It can also replace many existing kinds of filters currently present in the various Gateway API implementations.
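The redirect and phantom token flows mentioned above can be sketched from the auth service's point of view. This is an illustration under stated assumptions: `check`, `AuthResponse`, the login URL, and the in-memory `token_store` (standing in for a real token introspection call) are all hypothetical. The service redirects to a login page when no bearer token is present, rejects unknown tokens, and otherwise swaps the opaque token for a JWT in the header sent on to the backend.

```python
from dataclasses import dataclass, field
from typing import Dict, Mapping


@dataclass
class AuthResponse:
    status: int
    headers: Dict[str, str] = field(default_factory=dict)


def check(
    headers: Mapping[str, str],
    token_store: Mapping[str, str],
    login_url: str = "https://login.example.com/start",
) -> AuthResponse:
    """Answer one auth check; header names are assumed to be lowercase."""
    scheme, _, opaque = headers.get("authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not opaque:
        # No usable credentials: send the client off to start a login flow.
        return AuthResponse(302, {"location": login_url})
    jwt = token_store.get(opaque)
    if jwt is None:
        return AuthResponse(401)
    # Phantom token: the client only ever holds the opaque token, while the
    # backend receives the corresponding JWT.
    return AuthResponse(200, {"authorization": f"Bearer {jwt}"})
```

A gateway using such a service would return the 302 or 401 to the client as-is, and on a 200 would replace the `authorization` header before forwarding the request.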

rshriram commented 4 months ago

I would recommend not pinning down a specific protocol. Rather focus on the broader semantics. Envoy has two actually. ext_proc is much more powerful and structured, while ext_authz has a mix of http and grpc implementations.

joebowbeer commented 3 months ago

Updating prior art links for envoy-gateway

https://gateway.envoyproxy.io/v1.0.1/tasks/security/ext-auth/
https://gateway.envoyproxy.io/v1.0.1/api/extension_types/#extauth

youngnick commented 2 weeks ago

In the interest of getting some movement on this GEP, I'm going to lay out what I believe the next steps to be, so that we can tag these next steps with good-first-issue.

This GEP needs a start; there's agreement on this issue that we should proceed with something, so we need to kick off the process. The best way to do that is with a Provisional PR that fills out the What, the Who, and the Why of the GEP, along with background information (which folks have very helpfully linked a lot of here).

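So, the next steps for this GEP (which can proceed at any time and is not affected by the Gateway API release cycle) are:

* Someone requests to have the work assigned to them on this issue (I've removed the current assignee because there have been no updates for some time).
* That person generates a new PR to the geps/ directory in the repo. This PR must do a few things:
  * copy the template in the gep-696 directory to a new gep-1494 directory
  * update all references to GEP-696 to GEP-1494, in both the Markdown and YAML files in there
  * Mark this new GEP as the Provisional state, both in the Markdown and the YAML files
  * Fill out the title (Auth in Gateway API), TLDR, Goals, Non-Goals and Introduction sections of the GEP only. Other sections can be left as they are in the template or filled out with "To be completed later" or similar. Note that I've specifically left the title as Auth here because I think that we'll need Authentication for sure, but we may need to also support some Authorization as part of this GEP. Without this initial step though, none of us know exactly what we do need.
  * The most important part of this update is the Introduction. This section should explain what authentication and authorization mean in the context of Gateway API configuration, who needs it, why it's useful to be able to configure it at the Gateway API level, and, even more importantly, explain what each relevant data plane (proxy) does to configure Authentication and/or Authorization today. See GEP-1742 (https://gateway-api.sigs.k8s.io/geps/gep-1742/) for a similar example, although this GEP should also include reviewing the background information supplied above about what various implementations support and how it's configured.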

The purpose of this initial Provisional update is to ensure that everyone talking about Auth in Gateway API has the same understanding of the current state of the art around configuring and using Auth in both Gateway API implementations and their underlying data plane proxies. This will mean that, at a future date, we can look at doing the further work to push this GEP to Implementable and Experimental (which will be subject to the usual planning cycle and freeze periods). Up until the Provisional state is finished, though, all updates to this GEP document will not be covered by Gateway API change freezes.

Lastly, whoever does take this on should not feel obligated to push this feature all the way to Experimental or beyond! It's totally fine to come in and do the initial background and Introduction for the GEP and then move on to something else. Of course, if you're passionate about the feature and want to push it forward, that's how features make it into Standard eventually!

Marking as good-first-issue with this todo list.

/good-first-issue

k8s-ci-robot commented 2 weeks ago

@youngnick: This request has been marked as suitable for new contributors.

Guidelines

Please ensure that the issue body includes answers to the following questions:

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.

jgao1025 commented 2 weeks ago

/assign @jgao1025