zooniverse / front-end-monorepo

A rebuild of the front-end for zooniverse.org
https://www.zooniverse.org
Apache License 2.0

Investigation: do we still need Panoptes JavaScript Client (PJC)? #5995

Open shaunanoordin opened 6 months ago

shaunanoordin commented 6 months ago

Investigation

Question: do we still need the old panoptes-javascript-client (aka PJC) in FEM? If so, why, and where?

Context:

Additional reading (for nerds):

Current PJC Use

The following indicates where PJC is still used in the FEM code, as of 21 Mar 2024.

app-content-pages

app-project

(You know, I think I'm starting to detect a pattern here.)

app-root

lib-async-states: no PJC here

lib-classifier

lib-grommet-theme: no PJC here

lib-panoptes-js: no PJC here, obviously. This is Panoptes.js!

lib-react-components

lib-user

tools-standard: no PJC here

Analysis

The following auth functions exist in PJC, but don't (yet?) exist in Panoptes.js. See PJC's auth.js for the full code.

Compare with Panoptes.js's auth.js code, which only has...

oauth is only directly used as "authClient"s passed down to packages in dev servers. Mark's work with lib-user seems to indicate we may be able to replace PJC's oauth with PJC's auth.

I have no idea what to make of the sugarClient. 🤷‍♂️

We now present Shaun arguing with himself about oAuth:

Status

Main investigation complete. Awaiting discussion, to plan further steps for Panoptes.js.

eatyourgreens commented 6 months ago

Roger wrote some docs for oauth.js, and I extended them to include auth.js (or, at least, the parts of auth.js that handle bearer tokens and refresh tokens.) https://zooniverse.github.io/panoptes-javascript-client/

The important thing to remember about those clients is that they are stateful. When you import auth from 'panoptes-client/lib/auth.js', you aren't just importing the client, you're also importing the current user and their OAuth tokens (both the refresh token and access/bearer token), which are stored in the client. This gets messy if you already have a user state store, as you now have user state stored in the client and in your own store, and the two can fall out-of-step. That's the source of most monorepo auth bugs.
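To make the out-of-step problem concrete, here is a toy sketch. Nothing in it comes from PJC itself: `createStatefulClient`, `appStore` and the user object are hypothetical stand-ins for a client that keeps the user in internal state and an app store that copies it.

```javascript
// Hypothetical stand-in for a stateful auth client (not PJC's real API).
function createStatefulClient() {
  let currentUser = null // internal state, like the user held inside the client
  return {
    signIn(user) { currentUser = user },
    signOut() { currentUser = null },
    getUser() { return currentUser }
  }
}

const client = createStatefulClient()
client.signIn({ login: 'example-user' })

// The app copies the user into its own store: now there are two copies.
const appStore = { user: client.getUser() }

// Something signs the user out via the client, bypassing the store.
client.signOut()

// The two copies now disagree about who is signed in.
const outOfStep = appStore.user !== client.getUser()
console.log(outOfStep) // true
```

Reading the user from one place only (either the client or the store, not both) avoids this class of bug.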

eatyourgreens commented 6 months ago

Also, if it's useful at all: Panoptes access tokens are good for two hours, so make sure you have a refreshed token before making a credentialled request. AFAIK Panoptes refresh tokens never expire (maybe double-check with the backend team about this) but Panoptes session cookies expire after two weeks of inactivity. You can check the session cookie lifetime in browser dev tools.
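As a rough illustration of the two-hour lifetime, the sketch below decides whether an access token is due for a refresh. The function name, the `issuedAt` timestamp and the 60-second safety margin are assumptions made for the example, not values taken from Panoptes or PJC.

```javascript
const ACCESS_TOKEN_LIFETIME_MS = 2 * 60 * 60 * 1000 // two hours, per the comment above

// Hypothetical helper: true if a token issued at `issuedAt` (ms since epoch)
// should be refreshed before use. `marginMs` is an assumed safety margin.
function needsRefresh(issuedAt, now = Date.now(), marginMs = 60 * 1000) {
  return now >= issuedAt + ACCESS_TOKEN_LIFETIME_MS - marginMs
}

const issuedAt = Date.parse('2024-03-21T10:00:00Z')
console.log(needsRefresh(issuedAt, Date.parse('2024-03-21T11:00:00Z'))) // false
console.log(needsRefresh(issuedAt, Date.parse('2024-03-21T11:59:30Z'))) // true
```

In practice the auth client does this bookkeeping for you; the sketch only shows the arithmetic.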

Frontend auth for a typical session looks something like this:

  1. const user = await auth.checkCurrent(): exchange your Panoptes session cookie for an access token and refresh token. Run this once on page load to set up a Panoptes session in the current tab. The client will store the Panoptes user object, the access token and the refresh token in its internal state. The session cookie is a Secure, HttpOnly cookie, so this is probably the most secure way to authenticate a Zooniverse user from a browser. The client's internal state isn't exposed via any public interfaces, so should be secure from third-party scripts also running on the page (maybe? JS has no real concept of private variables.)
  2. auth.signIn(): If you don't have a session cookie, you can sign in with auth.signIn(), using the OAuth password grant flow. However, the password flow is disallowed by OAuth 2.1, as it makes it quite easy to steal passwords.
  3. const token = await auth.checkBearerToken(): use the OAuth refresh token flow to get an access token in order to perform a credentialled request eg. making a classification or reading your inbox. This will always return a refreshed access token, so there shouldn't be any need to worry about having expired credentials, unless you hang on to this token and try to reuse it after it's expired. Try to avoid saving these tokens in local component state. The auth client will manage them for you.
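Steps 1 and 3 can be sketched as a small wrapper. It assumes an `auth` object exposing the `checkCurrent()` and `checkBearerToken()` methods named above; `startSession` and `authHeaders` are hypothetical names, the null result for a signed-out user is an assumption, and the stub client exists only so the sketch runs without a network.

```javascript
// Step 1: run once on page load. Resolves to the user
// (assumed here to be null when nobody is signed in).
async function startSession(auth) {
  return auth.checkCurrent()
}

// Step 3: ask the client for a token immediately before each request,
// rather than caching it in component state.
async function authHeaders(auth) {
  const token = await auth.checkBearerToken()
  return token ? { Authorization: `Bearer ${token}` } : {}
}

// Stub client for demonstration only; the real client talks to Panoptes.
const stubAuth = {
  checkCurrent: async () => ({ login: 'example-user' }),
  checkBearerToken: async () => 'example-token'
}

const demo = (async () => {
  const user = await startSession(stubAuth)
  const headers = await authHeaders(stubAuth)
  return { user, headers }
})()
```

Because `authHeaders` calls the client every time, the caller never holds a token long enough for it to expire.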

    Panoptes.js doesn't have an auth client at all, just some helper functions that can decode an access token, which is a JWT, and return the user and permission scopes that are encoded inside. You still need to use an OAuth client, of some sort, in order to get that token from Panoptes OAuth.
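For illustration, decoding a JWT payload without verifying it looks roughly like this. This is a generic sketch, not Panoptes.js's own helper: `decodeJwtPayload` is a hypothetical name, the token is built inline for the demo, and the `login` and `scope` claim names are assumptions about what a payload might contain. It uses Node's `Buffer`; a browser would need a different base64 decoder.

```javascript
// Decode the middle (payload) segment of a JWT. No signature verification!
function decodeJwtPayload(token) {
  const [, payload] = token.split('.')
  if (!payload) throw new Error('Not a JWT')
  return JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'))
}

// Build a fake, unsigned token for the demo (claim names are assumed).
const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url')
const fakeToken = [
  encode({ alg: 'none', typ: 'JWT' }),
  encode({ login: 'example-user', scope: ['public', 'classification'], exp: 1711022400 }),
  'signature'
].join('.')

const claims = decodeJwtPayload(fakeToken)
console.log(claims.login) // example-user
```

Decoding only reads the claims. It says nothing about whether the token is genuine or still valid, which is why an OAuth client is still needed to obtain it.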

eatyourgreens commented 6 months ago

If you're new to OAuth, I highly recommend watching Kim Maida's The Art of Authentication before jumping into any Panoptes auth code. That talk really clarified Panoptes OAuth for me.

Refresh Tokens: What are they and when to use them is also a useful read.

eatyourgreens commented 6 months ago

Point is, we don't need to build oAuth in Panoptes.JS, because nobody uses it.

Zooniverse Classrooms use oauth.js. However, that library uses the implicit grant flow, which is also deprecated and not recommended for use any more.

eatyourgreens commented 6 months ago

Panoptes auth uses access tokens to encode both user info (identity) and scopes (access permissions.) That's not really recommended any more. A modern approach, still using OAuth, would be to put user identities into ID tokens and have the access token only responsible for limiting access to protected resources. Maybe something to bear in mind for things like Zooniverse user groups and Zooniverse user certificates? The whole Panoptes auth implementation in the frontend is something like ten years old now.

What's an ID token?

As the name may suggest, an ID token is an artifact that client applications can use to consume the identity of a user. For example, the ID token can contain information about the name, email, and profile picture of a user. As such, client applications can use the ID token to build a user profile to personalize the user experience.

An authentication server that conforms to the OpenID Connect (OIDC) protocol to implement the authentication process issues its clients an ID token whenever a user logs in. The consumers of ID tokens are mainly client applications such as Single-Page Applications (SPAs) and mobile applications. They are the intended audience.

What's an access token?

When a user logs in, the authorization server issues an access token, which is an artifact that client applications can use to make secure calls to an API server. When a client application needs to access protected resources on a server on behalf of a user, the access token lets the client signal to the server that it has received authorization by the user to perform certain tasks or access certain resources.

https://auth0.com/blog/refresh-tokens-what-are-they-and-when-to-use-them/

eatyourgreens commented 6 months ago

@mcbouslog this issue, from 2018, raises the same issues as you are interested in, I think, so could be worth reading.

Specifically, there's a bunch of repeated code in the JS API client for token storage and management (both refresh tokens and access tokens):

This library needs an overhaul in how it manages and re-uses tokens and this code can be shared between the supported authentication flows we currently use:

  • Auth.js uses the password credentials flow and gets a refresh token
  • Oauth.js uses the implicit flow and does not get a refresh token

We should split out the token retrieval into different strategy patterns and consolidate the token storage and management code where we can. The different token flows above will require different token renewal strategies as well:

  • flows with a refresh token can get a new access token directly
  • flows without a refresh token will have to re-authenticate or rely on an existing session to gain a new token.

The updated code will also provide management hooks that the including app can use to configure and customize the authentication flows as well. This will allow events like failing to refresh a token to bubble up to the calling app to better manage the experience of our users on sites that use implicit flows.
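One way to read that proposal is as a strategy pattern: shared token management, a per-flow renewal strategy, and a hook for failures. The sketch below is hypothetical throughout (`createTokenManager`, the strategy objects and `onRefreshFailed` are not part of any existing client) and uses stubbed renewal functions instead of real token requests.

```javascript
// Shared token management, parameterised by a renewal strategy.
function createTokenManager({ strategy, onRefreshFailed = () => {} }) {
  let accessToken = null
  return {
    async getToken() {
      try {
        accessToken = await strategy.renew()
      } catch (error) {
        accessToken = null
        onRefreshFailed(error) // let the calling app decide what to show the user
      }
      return accessToken
    }
  }
}

// Flow with a refresh token: can get a new access token directly.
const refreshTokenStrategy = (refreshToken) => ({
  renew: async () => `access-from-${refreshToken}` // stub for a token request
})

// Flow without a refresh token: must rely on an existing session.
const sessionStrategy = (hasSession) => ({
  renew: async () => {
    if (!hasSession()) throw new Error('No session: re-authentication needed')
    return 'access-from-session'
  }
})

const failures = []
const withRefresh = createTokenManager({ strategy: refreshTokenStrategy('r1') })
const withoutSession = createTokenManager({
  strategy: sessionStrategy(() => false),
  onRefreshFailed: (error) => failures.push(error.message)
})

const results = Promise.all([withRefresh.getToken(), withoutSession.getToken()])
```

Here the app using the implicit-style flow learns about the failed renewal through `onRefreshFailed` and can prompt the user to sign in again.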

There's also an open request to support the authorization code flow (for an old PRN server app), which would also be useful now that password grant is disallowed and implicit grant is discouraged.

The Python API client already supports authorization codes for server-side authentication eg. I use an authorization code here to build new subjects for the SLSN project. As a user of both the JS and Python API clients, it would be useful to have parity between them.

mcbouslog commented 6 months ago

Thank you @eatyourgreens! This comment and links, as well as the related Slack posts, are very helpful. It's looking like the panoptes-javascript-client auths, at minimum, and likely auth in general, could use an overhaul.

eatyourgreens commented 6 months ago

Also worth noting that none of the code discussed here uses fetch, which has been the standard in browsers for years and is supported in Node since Node 18.

See https://github.com/zooniverse/front-end-monorepo/issues/317
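As a rough idea of what a fetch-based request could look like: the helper name `getJSON`, the example URL and the injectable `fetchImpl` parameter are assumptions for illustration, not existing monorepo code. The stub at the bottom stands in for the network so the sketch runs offline.

```javascript
// Hypothetical helper: GET a JSON resource, optionally with a bearer token.
async function getJSON(url, { token, fetchImpl = fetch } = {}) {
  const headers = { Accept: 'application/json' }
  if (token) headers.Authorization = `Bearer ${token}`
  const response = await fetchImpl(url, { headers })
  if (!response.ok) throw new Error(`Request failed: ${response.status}`)
  return response.json()
}

// Stub fetch for demonstration only: echoes back what it was asked for.
const stubFetch = async (url, init) => ({
  ok: true,
  status: 200,
  json: async () => ({ url, authorization: init.headers.Authorization ?? null })
})

const demoRequest = getJSON('https://example.org/api/projects', {
  token: 'example-token',
  fetchImpl: stubFetch
})
```

Passing `fetchImpl` as a parameter keeps the helper testable, and the default uses the global `fetch` available in browsers and in Node 18 and later.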

eatyourgreens commented 6 months ago

Also noting that the Next.js app router expects data-fetching to use fetch. So data caching might not work with superagent.

goplayoutside3 commented 2 months ago

Linking a couple of relevant convos for future consideration of PJC in our frontend:

Superagent and SWR: https://github.com/zooniverse/front-end-monorepo/pull/6117#discussion_r1632112511

Talk board visibility to admins: https://zooniverse.slack.com/archives/C06DCM0V9/p1720463383130629