Closed fg91 closed 12 months ago
There is one important design choice I made:
There are two ways clients can authenticate with IAP. Let's say we have an ID token valid for IAP:
1) If the token is sent in the "authorization" header, IAP will, if the token is valid, strip it from the request and replace it with a validated "x-goog-iap-jwt-assertion" token (not the same token).
2) If the token is sent in the "proxy-authorization" header (and potentially another token in the "authorization" header), IAP will, if the ID token is valid, strip the "proxy-authorization" header but leave the "authorization" header alone (see docs).
While the first option would be a deeper integration of IAP into Flyte (admin), I chose, as a first step, to go for option 2.
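The two options above can be illustrated with a small toy model. This is only a sketch of the header handling as described in this issue, not IAP's actual implementation: `forwardThroughProxy`, the `isValid` callback, the token strings, and the placeholder assertion value are all hypothetical names introduced for illustration.

```go
package main

import "fmt"

// forwardThroughProxy is a toy model of the header handling described above,
// NOT IAP's real implementation. Header names are lowercase map keys and
// isValid stands in for the proxy's ID token check (hypothetical helper).
// It returns the headers the backend would see and whether the request passes.
func forwardThroughProxy(headers map[string]string, isValid func(string) bool) (map[string]string, bool) {
	out := make(map[string]string, len(headers))
	for k, v := range headers {
		out[k] = v
	}
	// Option 2: a valid "proxy-authorization" token is stripped and
	// "authorization" is forwarded untouched.
	if tok, ok := out["proxy-authorization"]; ok && isValid(tok) {
		delete(out, "proxy-authorization")
		return out, true
	}
	// Option 1: a valid "authorization" token is stripped and replaced
	// by a different, proxy-issued assertion.
	if tok, ok := out["authorization"]; ok && isValid(tok) {
		delete(out, "authorization")
		out["x-goog-iap-jwt-assertion"] = "assertion-issued-by-proxy" // placeholder value
		return out, true
	}
	// No token valid for the proxy: the request never reaches the backend.
	return nil, false
}

func main() {
	isValid := func(tok string) bool { return tok == "Bearer iap-id-token" }

	opt1, _ := forwardThroughProxy(map[string]string{
		"authorization": "Bearer iap-id-token",
	}, isValid)
	fmt.Println("option 1, backend sees:", opt1)

	opt2, _ := forwardThroughProxy(map[string]string{
		"proxy-authorization": "Bearer iap-id-token",
		"authorization":       "Bearer flyte-token",
	}, isValid)
	fmt.Println("option 2, backend sees:", opt2)
}
```

In the model, only option 2 lets a second, application-level token in "authorization" reach the backend unchanged, which is the property the choice above relies on.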
Reasoning:
Flyteadmin currently cannot interpret the "x-goog-iap-jwt-assertion" token, even if one somehow copied it back into the "authorization" header, as it is not the same kind of token used with the current Flyte Google Identity integration (issuer doesn't match etc.).
The interpretation of this token could be added to flyteadmin, but if we do so for every managed proxy service, we would potentially end up adding a lot of different cloud-provider specific logic and dependencies to flyteadmin. The "proxy-authorization" approach is general and can work with other proxies (only requires a different external command to generate a token).
The (full) validation of the "x-goog-iap-jwt-assertion" token contains an ugly chicken-and-egg problem. The token is validated as follows (see docs):
// Requires imports: context, fmt, io, google.golang.org/api/idtoken.
// validateJWTFromComputeEngine validates a JWT found in the
// "x-goog-iap-jwt-assertion" header.
func validateJWTFromComputeEngine(w io.Writer, iapJWT, projectNumber, backendServiceID string) error {
	// iapJWT := "YmFzZQ==.ZW5jb2RlZA==.and0" // req.Header.Get("X-Goog-IAP-JWT-Assertion")
	// projectNumber := "123456789"
	// backendServiceID := "backend-service-id"
	ctx := context.Background()
	aud := fmt.Sprintf("/projects/%s/global/backendServices/%s", projectNumber, backendServiceID)
	payload, err := idtoken.Validate(ctx, iapJWT, aud)
	if err != nil {
		return fmt.Errorf("idtoken.Validate: %w", err)
	}
	// payload contains the JWT claims for further inspection or validation.
	fmt.Fprintf(w, "payload: %v", payload)
	return nil
}
The backendServiceID is only obtained after the service has been deployed. It cannot be assigned before the deployment in IaC. Validating the audience can be omitted, but that is not great. So to fully validate the token one needs to e.g. 1) deploy the app without validating the audience, 2) obtain the backend service ID, 3) inject this ID, e.g. via an env var, and 4) manually restart the service so that it can then validate the audience.
Having to do this would give me quite a bit of grief.
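The chicken-and-egg problem can be made concrete with a minimal sketch, assuming the backend service ID is injected through an environment variable. The variable name `IAP_BACKEND_SERVICE_ID` and the helper `expectedAudience` are hypothetical and only serve the illustration; the audience format is the one from the snippet above.

```go
package main

import (
	"fmt"
	"os"
)

// expectedAudience builds the audience string IAP assertions are checked
// against. It reports false while the backend service ID is still unknown,
// in which case the audience check cannot be performed.
func expectedAudience(projectNumber, backendServiceID string) (string, bool) {
	if projectNumber == "" || backendServiceID == "" {
		return "", false
	}
	return fmt.Sprintf("/projects/%s/global/backendServices/%s", projectNumber, backendServiceID), true
}

func main() {
	// IAP_BACKEND_SERVICE_ID is a hypothetical env var name. On the first
	// deployment it is unset, because the ID only exists after deploying.
	aud, ok := expectedAudience("123456789", os.Getenv("IAP_BACKEND_SERVICE_ID"))
	if !ok {
		fmt.Println("backend service ID unknown: audience cannot be validated yet")
		return
	}
	fmt.Println("validate tokens against audience:", aud)
}
```

On the first deployment the program takes the first branch; only after the ID has been looked up, injected, and the service restarted does it reach the second one.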
This does not mean that Flyte will be unaware of the user's identity when using IAP: the current integration with Google identities remains!
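Flyte's CLIs will perform authentication twice:
1) with IAP, to obtain the token sent in the "proxy-authorization" header
2) with Flyte, to obtain the token sent in the "authorization" header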
The price we pay is that every once in a while, when the tokens cannot be refreshed anymore, the user sees one browser window opening to log in at IAP, followed by a second browser window opening to confirm the successful login with Flyte. The second one does not require choosing a Google account anymore as the user is already logged in. This is only the case for the CLIs. For Flyte console, the user never observes the second login; only a single Google login screen (the IAP one) is ever seen.
Typically the refresh tokens don't expire at the same time so it should be rare that the user notices two logins at the same time.
Given that we currently cannot register or run workflows from developer notebooks at all (only from within the VPC/cluster or via port-forwarding) I find this occasional double login a very small price to pay.
This approach does not modify existing auth logic in Flyte but simply adds "a second key for a second gate in front of Flyte".
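The "second key for a second gate" idea can be sketched on the client side as follows. This is an illustration under assumptions, not the code from the linked PRs: `tokenFromCommand` and `requestHeaders` are hypothetical helpers, `echo` stands in for the real token-generating external command, and the "Bearer " prefix on both headers is assumed.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// tokenFromCommand runs an external command and returns its trimmed stdout,
// which is assumed to be the token for the proxy.
func tokenFromCommand(name string, args ...string) (string, error) {
	out, err := exec.Command(name, args...).Output()
	if err != nil {
		return "", fmt.Errorf("running %s: %w", name, err)
	}
	tok := strings.TrimSpace(string(out))
	if tok == "" {
		return "", fmt.Errorf("%s printed no token", name)
	}
	return tok, nil
}

// requestHeaders combines both credentials: the proxy token for the first
// gate and the Flyte token for the second. An empty flyteToken models the
// not-yet-authenticated requests made during the login flow with flyteadmin.
func requestHeaders(proxyToken, flyteToken string) map[string]string {
	h := map[string]string{"proxy-authorization": "Bearer " + proxyToken}
	if flyteToken != "" {
		h["authorization"] = "Bearer " + flyteToken
	}
	return h
}

func main() {
	// "echo" is a stand-in for the real external command.
	proxyToken, err := tokenFromCommand("echo", "iap-id-token")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(requestHeaders(proxyToken, "flyte-access-token"))
}
```

Because the proxy token comes from an arbitrary command, supporting a different proxy in this sketch only means swapping the command, while the existing "authorization" handling stays as it is.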
Finally, I see this as the first step in integrating Flyte with IAP in order to fulfil a need that we have. I found the "proxy-authorization" way the better first step but this choice does not prevent a deeper integration of IAP into flyteadmin in the future.
TL;DR:
I chose to add logic that can be reused for other proxies on the client side instead of adding non-reusable logic pertaining to a specific managed cloud service to the server side.
Motivation: Why do you think this is important?
GCP Identity Aware Proxy (IAP) is a managed service that makes it easy to protect applications deployed on GCP by verifying user identity and using context to determine whether a user should be granted access.
Because requests to applications protected with IAP first have to pass IAP before they can reach the protected application, IAP provides a convenient way to implement a zero-trust access model.
(In contrast, if applications are protected using their own auth mechanism, unauthenticated requests typically first hit the application, which only then redirects to e.g. a Google login page. With IAP, no unauthenticated request can ever hit the application.)
Since IAP makes it very easy to implement a zero-trust model, many organizations using GCP have a security policy that any internal tool has to be protected with it.
Goal: What should the final outcome look like, ideally?
Flyte currently does not work with IAP but there is a need in the community to enable this integration:
Describe alternatives you've considered
In organizations where there is a security policy to use IAP, workarounds typically include 1) deploying flyte itself without authentication enabled and instead with IAP in front of flyteconsole and 2) port-forwarding flyteadmin's gRPC server to localhost or interacting with it only from within the cluster/the VPC (as pyflyte and flytectl cannot reach flyteadmin through IAP).
None of this is great.
Propose: Link/Inline OR Additional context
This issue tracks the integration of flyte with IAP, consisting of the following tickets:
https://github.com/flyteorg/flytekit/pull/1795
Adding a plugin, providing a CLI that can be used by flytekit (and flytectl) as an external command to generate access tokens for IAP (see here for "external command" authentication in flyte). To create this token, the plugin performs a standard OAuth 2.0 flow with https://accounts.google.com (not with flyteadmin).
https://github.com/flyteorg/flytekit/pull/1787
Giving flytekit's Remote (used by pyflyte) the ability to send "proxy-authorization" headers valid for IAP (generated with the new plugin) with every request, including the unauthenticated requests during the authentication flow with flyteadmin.
In flyte's client config this will look as follows:
If a request, even one that is not yet authenticated with flyteadmin (via "authorization" headers), includes a valid "proxy-authorization" header, IAP strips this "proxy-authorization" header and forwards the request to flyteadmin without touching the "authorization" header used by Flyte. If no valid "proxy-authorization" is included, the request is denied at the load balancer.
This means that the existing authentication flow flytekit's Remote performs with flyteadmin is not modified. flyteadmin itself is not aware that it is protected with IAP.
Implementing the same for flytectl. The external command can of course be reused. https://github.com/flyteorg/flyteidl/pull/437
(Fixing a bug in the flyte helm chart that breaks deployments with the GCE ingress controller (instead of nginx) as IAP only works with GCE ingresses. https://github.com/flyteorg/flyte/pull/3964) <- Not needed anymore, see reason
Adding documentation on how to deploy the flyte helm chart with a GCE ingress, GCP managed certificate, and IAP. The guide is currently documented in the README.md of the flytekit iap plugin added in https://github.com/flyteorg/flytekit/pull/1795 (It could be moved from there.)
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?