cryostatio / cryostat-legacy

OUTDATED - See the new repository below! -- Secure JDK Flight Recorder management for containerized JVMs
https://github.com/cryostatio/cryostat

OAuth Redirect Authentication flow #717

Closed andrewazores closed 2 years ago

andrewazores commented 3 years ago

https://vertx.io/docs/vertx-auth-oauth2/java/

Vert.x supports OAuth redirect authentication flow. Cryostat could support this when using the OpenShiftAuthManager, or any other AuthManager implementation that delegates to an OAuth server. This way the user would not supply their Bearer token directly to Cryostat. Instead, the user would visit the Cryostat web-client and be redirected to the platform OAuth server, allowing the user to log in with whatever credentials that server is configured to require (OpenShift cluster username/password, LDAP, or some other SSO). After successful authentication the user would be redirected back to the web-client with their access token as a query parameter, which the web-client should be able to capture and store as usual.

jan-law commented 3 years ago

After doing some research, I have a few questions about whether I’m approaching this issue correctly. The OpenShift OAuth server docs mention that it supports both the “authorization code grant and the implicit grant OAuth authorization flows”. I found another OAuth site explaining both grant types. It says Authorization Code Flow is used by "web apps executing on a server" and Implicit Flow is used for "SPAs executing on the user's browser".

Which grant type applies to our case?

The docs mention the “client” asks the OAuth server for access to the “protected resource”. I’m assuming the resource is the user’s Bearer token. Who is the “client” in this case? The web client or the cryostat backend?

Lastly, am I understanding this authorization flow correctly?
1) User visits Cryostat
2) Client redirects user to the OpenShift auth server login page <namespace_route>/oauth/authorize
3) User enters login credentials
4) Depending on grant type, the auth server sends an authorization code or a token as part of the URI fragment to the client
5) (Authorization code grant type only) Client posts authorization code to <namespace_route>/oauth/token and auth server sends an access token to the client
6) Client saves token
7) User is redirected to Cryostat dashboard

andrewazores commented 3 years ago

From that description the "Implicit Flow" seems to match best - from the OAuth server's perspective, the requesting client is the cryostat-web instance running in a user's browser, and not the cryostat backend. The backend is just acting as a proxy.

I don't think the "protected resource" would be the user's general access token. It is most likely that the resource referred to is a specific API resource, similar to what I am doing in #718, where a resource would be a recording .jfr file, or a report .html file. In the example given, the "resource" according to the Google OAuth server could be "the user's gmail account". It may be possible to consider the OpenShift API as a whole to be the "resource", but I think this is more likely to be something we would have to specify with the scope parameter. I need to do some more reading here to be sure that this will work as intended with our existing hooks into the OpenShift RBAC permissions system. For now, I would leave the scope empty.

Your authorization flow steps are basically correct, if you consider that the "client" is the user interacting via the web-client. The Cryostat backend should be essentially transparent in the auth flow since it is only acting as a proxy, and won't be storing the user's authorization code or access token at any point.

When the web-client connects to Cryostat and does the initial POST /api/v2.1/auth with no credentials or authorization, the response should cause the browser to redirect to the OpenShift auth server at /authorize. This could probably be accomplished by the AuthPostHandler setting an HTTP 302 status, just like the OAuth server will do. The redirect Location header value URL sent to the browser should include the response_mode=fragment Implicit Grant parameter, response_type=token, a client_id identifying Cryostat, and a redirect_uri back to the web-client instance (e.g. https://cryostat-sample-default.apps-crc.testing).
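As a rough illustration of the Location header value described above, here is a minimal Java sketch. The class name AuthorizeRedirect, the method buildLocation, and the example.test host are made up for illustration and are not part of Cryostat; only the parameter names (client_id, response_type, response_mode, redirect_uri) come from the discussion.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not an existing Cryostat class.
final class AuthorizeRedirect {
    private AuthorizeRedirect() {}

    // Builds the URL that a 302 response could carry in its Location header
    // to start the Implicit Grant flow.
    static String buildLocation(String authorizeEndpoint, String clientId, String redirectUri) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("client_id", clientId);
        params.put("response_type", "token");    // Implicit Grant: token returned directly
        params.put("response_mode", "fragment"); // token delivered in the URL fragment
        params.put("redirect_uri", redirectUri);
        String query = params.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
        return authorizeEndpoint + "?" + query;
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

The scope parameter is left out here on purpose, matching the "leave the scope empty for now" suggestion earlier in the thread.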

This will cause the browser to redirect to the OAuth server, perform their auth if needed (they may still have an active session with the OAuth server in which case the login is skipped and the existing session used), and then the browser will receive another 302 response from OAuth. This time, the Location header will be the $redirect_uri#access_token=abcd1234. This will cause the browser to navigate back and reload the web-client.

Now that the web-client has been reloaded, it should check for the access_token URL fragment (in fact, it did this the first time before issuing the POST /api/v2.1/auth, but the URL fragment was empty/null). Since it observes that it does have an access_token fragment at startup, it should assume that the authMethod is Bearer and take the access_token as the Bearer Token. It should then perform another POST /api/v2.1/auth including these parameters/headers. The Cryostat backend would then respond with a 200 and UserInfo response. The web-client flow resumes as usual from this point.
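The fragment check described above could look roughly like the following. This is only a sketch: the real check would live in the browser-side web-client, and Java is used here just to match the backend code discussed in this thread. FragmentToken and its methods are hypothetical names, and the exact header encoding Cryostat expects is not covered.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper illustrating the startup check for an access_token fragment.
final class FragmentToken {
    private FragmentToken() {}

    // Splits "key=value&key=value" pairs found after '#' into a map.
    static Map<String, String> parseFragment(String url) {
        Map<String, String> out = new LinkedHashMap<>();
        int hash = url.indexOf('#');
        if (hash < 0 || hash == url.length() - 1) {
            return out; // no fragment, or an empty one
        }
        for (String pair : url.substring(hash + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            out.put(decode(key), decode(value));
        }
        return out;
    }

    // Returns a Bearer header value if the fragment carries a token,
    // otherwise empty (meaning the initial unauthenticated POST is still needed).
    static Optional<String> authorizationHeader(String url) {
        String token = parseFragment(url).get("access_token");
        if (token == null || token.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of("Bearer " + token);
    }

    private static String decode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }
}
```

An empty result corresponds to the first page load, before the redirect to the OAuth server has happened.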

My only outstanding question here is whether the access_token we get back from this /oauth/authorize flow will actually be the token that we expect - ie will it be equivalent to the OpenShift account token you have in oc whoami -t, or will it be a more restricted token depending on the scope parameter we initially provide. If the scope parameter is significant here then I suppose we need a way to have the AuthManager implementation generate a string representing all of the possible platform-mapped permissions that the user might need. This would defeat some of the purpose of our fine-grained RBAC support however, where we support cases where a user may have a token that allows them to only perform certain actions and not others. If the user is required to pass an authorization check for all permissions when they first receive their OAuth token then it won't be possible for end user administrators to configure user accounts for reduced permissions, or else they will simply be unable to authorize and access Cryostat.

andrewazores commented 3 years ago

Now that I have thought it through some more and written all this down, I don't really think we need to use vertx's oauth client library that I linked. That would be useful if Cryostat were running as a server-side application and creating user sessions, holding state, and storing the user's authorization code. Since we are just treating Cryostat as an authorization server proxy and doing all of the session and state stuff on the user's browser, we don't have any need for the server-side OAuth client.

jan-law commented 3 years ago

That makes sense, especially since we already have RBAC support. Should we close this issue?

andrewazores commented 3 years ago

I think the issue can remain open - there is still some work to do on the AuthPostHandler so that it can send the proper redirect to the client. This should obviously only happen when the AuthManager is one where that makes sense (ie OpenShiftAuthManager), so there will also need to be some work done on the AuthManager and its implementations to sort out how to handle that redirect flow.

jan-law commented 3 years ago

I think the access_token referred to in the docs above is a Bearer token (source). If I use oc to manually discover and visit their oauth/token/request URL, I get the following:

$ oc get route oauth-openshift -n openshift-authentication -o json | jq .spec.host
"oauth-openshift.apps.ci-ln-i8f6izb-f76d1.origin-ci-int-gce.dev.openshift.com"

Manually entering https://oauth-openshift.apps.ci-ln-i8f6izb-f76d1.origin-ci-int-gce.dev.openshift.com/oauth/token/request in a browser tab brings you to the cluster login screen: [image]

After logging in, clicking “Display Token” shows a new Bearer token. This token has a different hash than the one I get from oc whoami -t. There’s also an option to request a new token: [image] [image]

To discover the token request URL programmatically, it looks like I might be able to write a GET request to https://openshift.default.svc/.well-known/oauth-authorization-server from within Cryostat. The namespace_route in the docs is the same as oauth-openshift.apps.ci-ln-i8f6izb-f76d1.origin-ci-int-gce.dev.openshift.com (source)
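For the discovery step, the well-known document is JSON, and under RFC 8414 it carries fields such as issuer, authorization_endpoint, and token_endpoint. Below is a deliberately naive Java sketch of pulling one string field out of such a document. OAuthMetadata and stringField are hypothetical names, the HTTP GET itself is omitted, and a real implementation would presumably use a proper JSON parser rather than a regex (this one does not handle escaped characters).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; assumes the metadata JSON has already been fetched.
final class OAuthMetadata {
    private OAuthMetadata() {}

    // Naive lookup of a top-level string field such as "authorization_endpoint".
    // Does not handle escaped quotes or escaped slashes inside the value.
    static Optional<String> stringField(String json, String field) {
        Pattern pattern = Pattern.compile(
                "\"" + Pattern.quote(field) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher matcher = pattern.matcher(json);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

With the authorization_endpoint value in hand, Cryostat would not need the route hostname to be configured separately.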

jan-law commented 3 years ago

After hardcoding a 302 response whose Location header is the URL https://<namespace_route>/oauth/authorize?client_id=demo-client&response_type=token&response_mode=fragment, I'm able to get a rough demo working. The redirect_uri needed to be specified by registering an additional OAuth client. The OAuth server sends back a URL fragment with the access_token as well as the permissions for that token, scope=user:full, which I think gives users full access to the OpenShift API (source). Example: https://oauth-openshift.apps.ci-ln-6cpcdrk-72292.origin-ci-int-gce.dev.rhcloud.com/oauth/token/display#access_token=sha256~im0NjlG8nyvKBWGsrVT52bw95rym1U8X0SbngwxrCDc&expires_in=86400&scope=user%3Afull&token_type=Bearer

I have two questions: 1) The OAuth client I created is almost identical to the one in the docs, except I left the client secret blank and changed the redirect URI appropriately. Why is this demo working even if I don't specify a client secret? 2) Given that the redirects might take more time for slower connections, how could we make the login process appear smoother for users?

https://user-images.githubusercontent.com/84587295/139341934-1d1568d5-4e14-4b33-9b01-e5ef11eff173.mp4

andrewazores commented 3 years ago

Reading https://www.oauth.com/oauth2-servers/client-registration/client-id-secret/ :

From what I understand, the purpose of the client_secret is to act as a secret/password for the client to use to authenticate itself to the OAuth server, so that other clients cannot impersonate this client and request new auth tokens on its behalf. If you did not configure a secret when adding the OpenShift OAuthClient resource then I guess the secret is simply left blank, so if you do make a request to <namespace>/oauth/token without an associated secret then the request gets accepted anyway?

The actual implementation on the Cryostat side might never need to make a direct request to <namespace>/oauth/token on its own, since it does the redirect login flow and receives the token that way, in which case it doesn't need to know the secret. I'm not exactly sure of the specifics here, but when Cryostat verifies a user's permissions using their token, it goes through OpenShift's TokenReview and SelfSubjectAccessReview APIs. These probably call through to <namespace>/oauth/token at some level, but presumably via a separate OAuthClient that is owned and managed by the cluster itself with its own client_secret.

The OAuthClient resource in OpenShift should be created by the Operator, which can generate and store the client secret and then use that to create the resource. If it's needed it can also supply it to Cryostat, either by mounting the secret as a volume to the Cryostat container or by populating an environment variable with it, etc.

The demo screencast looks awesome. A slow connection - in particular one with long latencies - will always suffer from this kind of SSO redirect login flow, and there isn't a whole lot we can do about it other than making sure we don't make any requests to load unnecessary resources etc. before redirecting the user away to the OAuth login. One thing we could do to smooth out the experience slightly would be to get rid of the visible Bearer auth login form. We currently need that so the user can enter their token manually, but if the token is received by an OAuth redirect then the login "form" implementation could be similar to the Noop one where it displays nothing and allows the user in as soon as it reads the token from the URL and stores that in the LoginService. (I'm not sure how we would want to differentiate between Bearer auth where the user manually supplies the token and Bearer auth where the web-client receives the token via OAuth redirect - maybe we need something other than the Authorization authMethod here)

jan-law commented 3 years ago

Here's what I've found about the scope parameter while using the service account as an OAuth client.

The OpenShift docs here mention that service accounts acting as OAuth clients have a reduced set of scopes, meaning they can't request user:full access to all API permissions. Attempting to request scope=user:full results in error = access_denied & error_description = scope denied user:full. If I leave the scope field empty, the OAuth server assumes I requested the scope user:full and denies access.

Instead of specifying user:full, we could also request a scope with any role in the namespace. Looking at the roles from oc get roles, I tried requesting a scope with the roles scope=role:cryostat-sample:<namespace> and scope=role:cryostat-operator-role:<namespace> which both give me the same 401 error in the web client.

Any ideas about how we could reuse the existing RBAC permissions to request a token with full access to the API?

HTTP Authorization Failure caused by KubernetesClientException: 
Failure executing: POST at: https://172.30.0.1/apis/authorization.k8s.io/v1/selfsubjectaccessreviews. 
Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. 
selfsubjectaccessreviews.authorization.k8s.io is forbidden: User "kube:admin" cannot create resource 
"selfsubjectaccessreviews" in API group "authorization.k8s.io" at the cluster scope: scopes
[role:cryostat-sample:cryostat-operator-system] prevent this action.

andrewazores commented 3 years ago

The operator's config/rbac/role.yaml and config/rbac/cryostat_role.yaml, which define the service account Roles (cryostat and the operator), both do include that create selfsubjectaccessreviews capability. The scope you supply does seem to map correctly to these Roles. But, you are not authenticating to OAuth as the machine service account, rather as the user account kube:admin - that makes sense, but that user isn't bound to the same Roles and in this case doesn't have any Role that gives it the selfsubjectaccessreview capability.

I don't know if the scope we provide needs to map to one of the Roles we have defined for service accounts to use. The logged in end user doesn't need to have an identical set of permissions as to what Cryostat's own service account does.

Attempting to request scope=user:full results in error = access_denied & error_description = scope denied user:full. If I leave the scope field empty, the OAuth server assumes I requested the scope user:full and denies access.

When/where does this occur? Is this in the browser after you try to log in graphically, or is this happening at some time when Cryostat is trying to make an OAuth API request using its own service account token?

What is the interaction between OAuthClient and serviceaccount? The model I have in mind is that the OAuthClient resource just represents the web-client, its redirect URI, and the user interacting through it, so when the user is going through the redirect login flow they are communicating directly with the OAuth server and not through a service account. They provide their credentials and the server sends them back their OAuth token for whatever scope is required. Separately, Cryostat has its own service account and token, which it can use to communicate with the same OAuth server as its own form of limited OAuthClient, and at times it does these communications while masquerading as the user after the user has supplied their token along with some attempt to perform an authenticated action. But at the initial login stage where the user is acquiring their token from the OAuth server the Cryostat service account hasn't been involved yet, right?

jan-law commented 3 years ago

When/where does this occur?

After I login from the OpenShift Container Platform login page, the OAuth server returns the redirect URI with an error description instead of a token: https://cryostat-sample-oauth.apps.jalaw0.lab.upshift.rdu2.redhat.com/?error=access_denied&error_description=scope+denied+user%3Afull

What is the interaction between OAuthClient and serviceaccount?

Based on the description from the Implicit Grant Flow RFC, I think the “client” is our service account, “user-agent” is the web browser, “authorization server” is the OpenShift OAuth server, “resource owner” is a human user that knows their kube:admin credentials, and “web-hosted client resource” is the cryostat-sample app. In part (A), it says “the client initiates the flow by directing the resource owner’s user-agent to the authorization endpoint. The client includes its client identifier”. I think this means that OAuthClients are the only entities that can make requests to the OAuth server. As long as the service account contains a redirect URI, the service account becomes the entity making requests to the OAuth server.

When the oauth/authorize URL contains client_id=<service account name>, the OAuth server treats the service account as a valid OAuthClient and returns the HTML for the login page. If the OAuth server can’t find an OAuth client from the client_id parameter, it returns a blank HTML page and various 400 errors.

Visiting oauth/authorize with client_id=<service account name>, with a service account that does NOT contain a redirect URI, returns {"error":"unauthorized_client","error_description":"The client is not authorized to request a token using this method."}

Visiting oauth/authorize and omitting the client_id parameter returns {"error":"server_error","error_description":"The authorization server encountered an unexpected condition that prevented it from fulfilling the request."}

jan-law commented 3 years ago

The logged in end user doesn't need to have an identical set of permissions as to what Cryostat's own service account does.

I realized that the role specified in a role scope is a clusterRole, not one of the roles from oc get roles (see the Role Scope docs). I'll look into the clusterRoles that the operator already creates and see if there's one we can apply as a role scope.
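To make the role scope format concrete, here is a small Java sketch that assembles a scope value of the shape seen in this thread (role:<clusterRole>:<namespace>, optionally combined with other scopes such as user:check-access). OAuthScopes and its methods are hypothetical names; the space-separated joining follows the OAuth 2.0 convention for the scope parameter, and the value still has to be URL-encoded before it goes into the authorize URL.

```java
// Hypothetical helper for composing the OAuth scope parameter value.
final class OAuthScopes {
    private OAuthScopes() {}

    // Builds a role scope; clusterRole must name a ClusterRole, not a namespaced Role.
    static String roleScope(String clusterRole, String namespace) {
        if (clusterRole == null || clusterRole.isEmpty()
                || namespace == null || namespace.isEmpty()) {
            throw new IllegalArgumentException("clusterRole and namespace are required");
        }
        return "role:" + clusterRole + ":" + namespace;
    }

    // Joins individual scopes with single spaces (not yet URL-encoded).
    static String scopeParam(String... scopes) {
        return String.join(" ", scopes);
    }
}
```

For example, scopeParam("user:check-access", roleScope("cryostat-operator-cryostat", "default")) yields the same pair of scopes that appears in the error messages quoted in this thread.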

andrewazores commented 3 years ago

Thanks, this all makes sense and your interpretation of client, user-agent, etc. sounds dead-on.

I realized that the role specified in a role scope is a clusterRole, not one of the roles from oc get roles (see the Role Scope docs). I'll look into the clusterRoles that the operator already creates and see if there's one we can apply as a role scope.

Sounds good - that would explain why setting the scope to a Role that should be able to do the selfsubjectaccessreviews wasn't working, I suppose.

If there are no existing cluster roles that meet our needs then I think we can discuss adding a new CryostatOAuth ClusterRole with @ebaron. That probably makes sense to do regardless before we finish fleshing out and merging this feature, because we want to ensure that the ClusterRole has only the exact subset of permissions it really needs. Repurposing some other ClusterRole is just asking for trouble, even if one currently happens to have the exact set of permissions.

ebaron commented 2 years ago

I'm a bit confused about the role scope. This is so we get a token that has the permissions needed by the permissions API, right?

If I remember correctly, the token the user enters manually is only used to do a TokenReview and SelfSubjectAccessReview. In that case, would we be able to get away with just using the user:check-access scope? The TokenReview could be removed since the OAuth server should authenticate the user for us, and the check-access scope will allow us to do the SSAR.

jan-law commented 2 years ago

Here's what I know: When we request a token from the OAuth server with a clusterRole as the role scope, the OAuth server will return a token that has the same permissions as the clusterRole. Then whenever the web-UI makes an API request to the backend with that same token, that token needs to have enough permissions for the backend to fulfill any API requests, including creating recordings and this performTokenReview() function I found below.

https://github.com/cryostatio/cryostat/blob/d486871f35479b18c1a44360fe4e18697bee58ff/src/main/java/io/cryostat/net/OpenShiftAuthManager.java#L270

As of now, I've requested TokenReview, SelfSubjectAccessReview, and user:check-access in the scope. If I remove the TokenReview create permission, the Cryostat logs output the error below. Omitting either the SelfSubjectAccessReview or user:check-access scope also results in a similar exception.

INFO: Exception thrown
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://172.30.0.1/apis/authentication.k8s.io/v1/tokenreviews. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. tokenreviews.authentication.k8s.io is forbidden: User "system:serviceaccount:default:cryostat-sample" cannot create resource "tokenreviews" in API group "authentication.k8s.io" at the cluster scope.
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:639)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:576)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:543)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:504)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:292)
    at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperationsImpl.create(CreateOnlyResourceOperationsImpl.java:67)
    at io.cryostat.net.OpenShiftAuthManager.performTokenReview(OpenShiftAuthManager.java:317)
    at io.cryostat.net.OpenShiftAuthManager.reviewToken(OpenShiftAuthManager.java:195)
    at io.cryostat.net.OpenShiftAuthManager.validateToken(OpenShiftAuthManager.java:167)
    at io.cryostat.net.OpenShiftAuthManager.validateHttpHeader(OpenShiftAuthManager.java:268)
    at io.cryostat.net.OpenShiftAuthManager.sendLoginRedirectIfRequired(OpenShiftAuthManager.java:142)
    at io.cryostat.net.web.http.api.v2.AuthPostHandler.handle(AuthPostHandler.java:104)
    at io.cryostat.net.web.http.api.v2.AbstractV2RequestHandler.handle(AbstractV2RequestHandler.java:117)
    at io.cryostat.net.web.http.api.v2.AbstractV2RequestHandler.handle(AbstractV2RequestHandler.java:69)
    at io.vertx.ext.web.impl.BlockingHandlerDecorator.lambda$handle$0(BlockingHandlerDecorator.java:48)
    at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:313)
    at io.vertx.core.impl.TaskQueue.run(TaskQueue.java:76)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:829)

I'm not sure which exact permissions the backend needs to complete all of the web-UI actions. I essentially started with an empty clusterRole and added one permission at a time whenever the web-UI gave me a similar permissionDenied exception to the one below:

Request failed (401 Unauthorized)
HTTP Authorization Failure caused by OpenShiftAuthManager.PermissionDeniedException: Requesting client in namespace
 "default" cannot patch flightrecorders.operator.cryostat.io: scopes [user:check-access
 role:cryostat-operator-cryostat:default] prevent this action

andrewazores commented 2 years ago

I'm not sure which exact permissions the backend needs to complete all of the web-UI actions.

This is something we should probably make easier to determine. The permissions required for each action are defined by the API handlers themselves:

https://github.com/cryostatio/cryostat/blob/d486871f35479b18c1a44360fe4e18697bee58ff/src/main/java/io/cryostat/net/web/http/RequestHandler.java#L74

That set of ResourceActions gets mapped to platform-specific actions and resources by the AuthManager implementation before making the actual authz query to the backing auth server (OAuth in this case):

https://github.com/cryostatio/cryostat/blob/d486871f35479b18c1a44360fe4e18697bee58ff/src/main/java/io/cryostat/net/OpenShiftAuthManager.java#L166 (the two map calls for getResource() and getVerb())

We could generate a manifest of the required Cryostat application-level permissions easily enough by ex. implementing a new RequestHandler that has injected a Lazy<Set<RequestHandler>> so that it can get a reference to all of the handlers. Then, call resourceActions() on each of these handlers and perform set union on the results. This gives you the entire set of required permissions that are actually requestable through the API.
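The set-union step could be sketched like this. The Handler interface and the ResourceAction values below are simplified, hypothetical stand-ins for Cryostat's RequestHandler and ResourceAction types, defined here only so the example is self-contained; the injection of the handler set via Lazy<Set<RequestHandler>> is not shown.

```java
import java.util.Collection;
import java.util.EnumSet;
import java.util.Set;

// Illustrative stand-ins, not the real Cryostat types.
final class PermissionManifest {
    private PermissionManifest() {}

    // Hypothetical subset of application-level permissions.
    enum ResourceAction { READ_TARGET, CREATE_RECORDING, READ_RECORDING, DELETE_RECORDING }

    // Simplified view of a request handler: it only reports what it requires.
    interface Handler {
        Set<ResourceAction> resourceActions();
    }

    // Union of the permissions declared by every handler.
    static Set<ResourceAction> requiredActions(Collection<? extends Handler> handlers) {
        Set<ResourceAction> all = EnumSet.noneOf(ResourceAction.class);
        for (Handler handler : handlers) {
            all.addAll(handler.resourceActions());
        }
        return all;
    }
}
```

Because the result is a set, a permission required by many handlers appears only once in the manifest.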

From there, if we hoist those map() calls from the OpenShiftAuthManager into the AuthManager interface so that all implementations have something similar, then we can map the set of Cryostat application permissions into actual platform-specific permissions. Apply another pass of filtering out empty permissions, since some Cryostat application permissions may translate into nothing (implicitly granted) and you have a nice platform-specific manifest of all permissions required for a user or service account to have access to all features.
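The mapping-and-filtering pass might look something like the sketch below. Everything here is an assumption for illustration: the ResourceType values, the shape of the mapping, and the platform resource names used in the usage note are placeholders, not the actual mapping in OpenShiftAuthManager.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only; the real mapping lives in the AuthManager implementation.
final class PlatformManifest {
    private PlatformManifest() {}

    // Hypothetical application-level resource types.
    enum ResourceType { TARGET, RECORDING, CREDENTIALS }

    // Translates application-level resources into platform resource names.
    // Types with no mapping, or an empty mapping, contribute nothing, which
    // covers permissions that are implicitly granted on the platform.
    static Set<String> platformPermissions(
            Set<ResourceType> required, Map<ResourceType, Set<String>> mapping) {
        Set<String> out = new TreeSet<>();
        for (ResourceType type : required) {
            for (String resource : mapping.getOrDefault(type, Set.of())) {
                if (!resource.trim().isEmpty()) {
                    out.add(resource);
                }
            }
        }
        return out;
    }
}
```

For instance, with a placeholder mapping of TARGET to pods and deployments, RECORDING to a recordings custom resource, and CREDENTIALS to an empty set, the output contains only the first three names.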

ebaron commented 2 years ago

Thanks for the explanation Janelle! The OpenShift documentation about the scopes could be a bit more precise. FWIW, I seem to have found the code where the scopes are translated into RBAC objects, so we can see exactly what permissions they give: https://github.com/openshift/apiserver-library-go/blob/5cdb70a1e65b6bcabb3b897e38287ed2c8ed77d1/pkg/authorization/scope/converter.go

It looks like the user:check-access scope doesn't let you find out what the user can access, but what the scoped token can access. If neither user:full nor role:<whatever> is provided, then it's pretty much useless. Seems a bit odd to me, but now I understand why we need the cluster role.
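To make that concrete, here is a toy model of how I read the scope behaviour. It is only a sketch, not the real converter.go logic: the ScopeModel class, the "verb resource" rule strings, and the contents of the cluster role are all made up for illustration, and the per-scope rules are heavily simplified.

```java
import java.util.List;
import java.util.Set;

/** Toy model of OpenShift token scopes; not the real evaluator. */
final class ScopeModel {
    // Hypothetical contents of the cluster role named by the role scope.
    static final Set<String> CRYOSTAT_ROLE =
            Set.of("get flightrecorders", "patch flightrecorders");

    /** Rules a single scope contributes (simplified assumption). */
    static Set<String> rulesFor(String scope) {
        if (scope.equals("user:check-access")) {
            // Only lets the token ask "can I do X?"; grants nothing else.
            return Set.of("create selfsubjectaccessreviews");
        }
        if (scope.equals("user:info")) {
            return Set.of("get users/~");
        }
        if (scope.startsWith("role:cryostat-operator-cryostat:")) {
            return CRYOSTAT_ROLE;
        }
        return Set.of();
    }

    /** Allowed only if the user holds the rule AND some scope covers it. */
    static boolean allowed(List<String> scopes, Set<String> userRules, String wanted) {
        if (!userRules.contains(wanted)) {
            return false;
        }
        if (scopes.contains("user:full")) {
            return true;
        }
        return scopes.stream().anyMatch(s -> rulesFor(s).contains(wanted));
    }
}
```

With only user:check-access, a "patch flightrecorders" check fails in this model even for a user who holds that permission. Adding a role scope whose cluster role contains the rule makes it pass.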

Interestingly, user:info grants access to user.openshift.io/v1/users/~, which returns information on the user: https://docs.openshift.com/container-platform/4.9/rest_api/user_and_group_apis/user-user-openshift-io-v1.html. We could likely replace the TokenReview with this. This would cover all the permissions in https://github.com/cryostatio/cryostat-operator/blob/e524857960ffdfe44eaa360e9fd215d14b968953/config/rbac/cryostat_role.yaml, which would mean the operator doesn't need to create a ClusterRoleBinding for each Cryostat deployment. The ClusterRole could then just be used for the purpose of the role scope.
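As a rough sketch of what that users/~ lookup could look like, here is the request built with the JDK's java.net.http API. This is an assumption for illustration, not how Cryostat talks to the cluster today: UserInfoRequest and its parameters are hypothetical names, and the request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper: builds the "who am I" request for a bearer token. */
final class UserInfoRequest {
    /** Builds (but does not send) GET {apiServer}/apis/user.openshift.io/v1/users/~ */
    static HttpRequest build(String apiServer, String token) {
        URI uri = URI.create(apiServer + "/apis/user.openshift.io/v1/users/~");
        return HttpRequest.newBuilder(uri)
                // The token under validation is the only credential sent.
                .header("Authorization", "Bearer " + token)
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

The idea would be that a successful response identifies the token's user, while an unauthorized response means the token is invalid, which is roughly what the TokenReview tells us now.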

jan-law commented 2 years ago

Then, call resourceActions() on each of these handlers and perform set union on the results. This gives you the entire set of required permissions that are actually requestable through the API.

I made a handler on jan-law:list-permissions that outputs this list of permissions: https://gist.github.com/jan-law/8bedb15a7027d2697191ea8b01dfd856

Some of the ResourceTypes are easy to match with their corresponding apiGroup, e.g. ResourceType RECORDINGS and operator.cryostat.io.recordings. TARGET is most likely for pods or deployments.

How do the rest of the ResourceTypes correspond to the permissions in the clusterRole? And why don't these ResourceActions refer to TokenReviews or SelfSubjectAccessReviews?

andrewazores commented 2 years ago

How do the rest of the ResourceTypes correspond to the permissions in the clusterRole?

They don't necessarily - not everything that I defined as a resource from Cryostat's POV has been mapped to something in OpenShift RBAC. Not yet, anyway, and for some of them maybe not ever. There is no CRD for managing Credentials, for example, and so there's no RBAC mapping there.

And why don't these ResourceActions refer to TokenReviews or SelfSubjectAccessReviews?

TokenReviews and SelfSubjectAccessReviews are OpenShift-specific auth implementation details, so they don't show up in Cryostat's application-level model of resources/actions. That model needs to be more generic: it also has to apply to the BasicAuthManager, for example, or to other OpenShift-RBAC-like systems that might get an integration in the future. Maybe we should have a ResourceType for something like USER though - AuthPostHandler currently has ResourceActions.NONE.

Some of the ResourceTypes are easy to match with their corresponding apiGroup, eg ResourceType RECORDINGS and operator.cryostat.io.recordings

That's what I was talking about in my previous blurb here:

From there, if we hoist those map() calls from the OpenShiftAuthManager into the AuthManager interface so that all implementations have something similar, then we can map the set of Cryostat application permissions into actual platform-specific permissions. Apply another pass of filtering out empty permissions, since some Cryostat application permissions may translate into nothing (implicitly granted) and you have a nice platform-specific manifest of all permissions required for a user or service account to have access to all features.

Pulling out that resource type/action mapping functionality from an OpenShiftAuthManager internal detail into something common to all AuthManagers might be worthwhile, but just for your purposes in your feature branch list-permissions you could hack it a bit and just force a typecast to OpenShiftAuthManager in your ApiPermissionsGetHandler and call those existing map methods directly. This will let you convert the Cryostat application-level resource/actions enum modelling into the actual RBAC permissions as the OpenShiftAuthManager understands them.
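The whole pipeline described above (union, map, filter out empties) could be sketched roughly like this. It is a simplified illustration with plain strings standing in for the real enums: PermissionManifest, the action names, and the "verb resource" permission strings are hypothetical, and platformMapping stands in for the existing map methods on OpenShiftAuthManager.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

/** Hypothetical sketch: application-level actions -> platform permission manifest. */
final class PermissionManifest {
    static Set<String> build(
            Collection<Set<String>> handlerActions,
            Function<String, Set<String>> platformMapping) {
        Set<String> manifest = new TreeSet<>();
        handlerActions.stream()
                .flatMap(Set::stream)               // set union across all handlers
                .distinct()
                .map(platformMapping)               // application action -> platform permissions
                .filter(perms -> !perms.isEmpty())  // implicitly granted, nothing to request
                .forEach(manifest::addAll);
        return manifest;
    }
}
```

Each element of handlerActions would be one handler's resourceActions() result. An action with no RBAC equivalent (Credentials, say) maps to an empty set and simply drops out of the manifest.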

ebaron commented 2 years ago

Then, call resourceActions() on each of these handlers and perform set union on the results. This gives you the entire set of required permissions that are actually requestable through the API.

I made a handler on jan-law:list-permissions that outputs this list of permissions: https://gist.github.com/jan-law/8bedb15a7027d2697191ea8b01dfd856

Some of the ResourceTypes are easy to match with their corresponding apiGroup, e.g. ResourceType RECORDINGS and operator.cryostat.io.recordings. TARGET is most likely for pods or deployments.

Looks good! If you cross-reference that output with the mapping here, you'll have a complete list: https://github.com/cryostatio/cryostat/blob/05955511d9b5147ad93a5f844d6e408db68dbe98/src/main/java/io/cryostat/net/OpenShiftAuthManager.java#L312-L345

It should be some subset of CRUD on:

jan-law commented 2 years ago

Thanks! Here's the output:

        "[CRYOSTATS, PODS, DEPLOYMENTS]: create",
        "[CRYOSTATS]: create",
        "[CRYOSTATS]: delete",
        "[FLIGHTRECORDERS]: create",
        "[FLIGHTRECORDERS]: delete",
        "[FLIGHTRECORDERS]: get",
        "[FLIGHTRECORDERS]: patch",
        "[PERMISSION_NOT_REQUIRED]: create",
        "[PERMISSION_NOT_REQUIRED]: delete",
        "[PERMISSION_NOT_REQUIRED]: get",
        "[RECORDINGS]: create",
        "[RECORDINGS]: delete",
        "[RECORDINGS]: get",
        "[RECORDINGS]: patch"

When I ran Cryostat on OpenShift with the same clusterRole permissions as above, with the role scope set to user:info role:cryostat-operator-cryostat, I got PermissionDeniedExceptions for the following permissions, so I added these back into cryostat_role.yaml:

ebaron commented 2 years ago

Ah, the first two are probably from the discovery/tree API. It's capable of making the following get requests [1]:

I'm not sure where the get cryostats comes from though.

[1] https://github.com/cryostatio/cryostat/blob/05955511d9b5147ad93a5f844d6e408db68dbe98/src/main/java/io/cryostat/platform/internal/KubeApiPlatformClient.java#L366-L386

ebaron commented 2 years ago

The get cryostats comes from the Messaging Server. It checks that you have get permissions for all of the ResourceTypes.

https://github.com/cryostatio/cryostat/blob/05955511d9b5147ad93a5f844d6e408db68dbe98/src/main/java/io/cryostat/messaging/MessagingServer.java#L148-L156
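In other words, the check behaves roughly like the sketch below. This is a simplified illustration, not the MessagingServer code: GetAllCheck, the Resource enum, and the canGet predicate are hypothetical stand-ins, and the enum lists only a few of the resources named in this thread.

```java
import java.util.Arrays;
import java.util.function.Predicate;

/** Hypothetical sketch of an "all resources must be gettable" gate. */
final class GetAllCheck {
    // Illustrative subset only; the real set of types lives in Cryostat.
    enum Resource { CRYOSTATS, FLIGHTRECORDERS, RECORDINGS }

    /** True only if the caller has get permission on every resource. */
    static boolean mayConnect(Predicate<Resource> canGet) {
        return Arrays.stream(Resource.values()).allMatch(canGet);
    }
}
```

So a token that is missing get on cryostats alone is rejected, even though no individual API handler asks for that permission.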

jan-law commented 2 years ago

As of now, access tokens expire after 24 hours, which means that if you click "Logout", any backend query to the OAuth server will return the existing token instead of redirecting to the OpenShift Container Platform login page. Would you prefer that I make a separate PR to add a logout capability, or add it to #748?

andrewazores commented 2 years ago

It might be easier to review as a separate follow-up PR.

ebaron commented 2 years ago

I think once this is finished, we should verify the complete workflow with both kubeadmin and regular users. There seems to be at least some difference in how authentication works between them: https://github.com/openshift/console/blob/22c6951efe7c4bca87f3f934063b9f4dcb0a4058/frontend/public/module/auth.js#L70-L80