webrecorder / browsertrix

Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!
https://browsertrix.com
GNU Affero General Public License v3.0

[Feature]: SSO Support (User Creation and Login) #1490

Open fservida opened 5 months ago

fservida commented 5 months ago

Context

In large organizations it is difficult to manually manage every user who should have access to an instance, and it is generally good practice to assign users to groups that grant SSO access to the required applications. I found a few related issues that have since been closed. One of them, https://github.com/webrecorder/browsertrix-cloud/issues/244, was closed because there is API support for adding users. However, my understanding is that this still does not allow SSO: while it may give admins an endpoint for creating users more easily, the created users are still independent accounts with login credentials different from the ones users normally expect, which is not ideal in many enterprise deployments.

I haven't dug much into the current auth structure of Btrix, but I have worked on projects where I implemented SSO both directly and indirectly, and I can take a look if somebody can give me a starting point.

What change would you like to see?

As a user, I'd like to be able to simply log in with my institutional credentials.

As an admin, I'd like to be able to add users to groups depending on their role and have them log in to Btrix with SSO through SAML/OIDC, either with direct support or through header authentication with a front proxy handling SAML/OIDC. Users should be created automatically if needed and assigned to orgs automatically based on group membership.
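
To make the group-based assignment concrete, here is a minimal Python sketch of how identity-provider groups might be resolved to org memberships. Everything in it is hypothetical and chosen only for illustration: the group names, the org slugs, the role names and their ranking, and the `orgs_for_groups` helper are not taken from the Btrix codebase.

```python
# Hypothetical mapping: identity-provider group -> (org slug, role in that org).
GROUP_TO_ORG = {
    "btrix-admins": ("main-archive", "owner"),
    "btrix-crawlers": ("main-archive", "crawler"),
    "library-staff": ("library", "viewer"),
}

# Assumed ordering of roles; a higher number means more privileges.
ROLE_RANK = {"viewer": 10, "crawler": 20, "owner": 40}


def orgs_for_groups(groups):
    """Return {org slug: role}, keeping the highest role per org.

    Groups that are not present in GROUP_TO_ORG are ignored.
    """
    memberships = {}
    for group in groups:
        mapping = GROUP_TO_ORG.get(group)
        if mapping is None:
            continue
        org, role = mapping
        current = memberships.get(org)
        if current is None or ROLE_RANK[role] > ROLE_RANK[current]:
            memberships[org] = role
    return memberships


if __name__ == "__main__":
    print(orgs_for_groups(["btrix-crawlers", "btrix-admins", "unrelated"]))
```

When a user belongs to several groups that map to the same org, this sketch keeps the most privileged role. That is one possible policy, not a requirement.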

Requirements

No response

Todo

No response

fservida commented 5 months ago

My understanding is that most of the login logic is here: https://github.com/webrecorder/browsertrix-cloud/blob/b252931c71a35f8cd2a1159935528ecd69115fe5/backend/btrixcloud/auth.py#L169C5-L229C41

I think I can quite easily work out a new endpoint (/login_sso?) with different logic that would authenticate based on the headers passed in the request, and would also create the user and assign them to organizations based on the groups specified in the headers, as long as the request comes from a trusted proxy (otherwise anyone could forge the headers). The login page would then need a button for SSO login.

Implementing direct SSO is another story, but if you agree I could test and see if the above approach can be easily done.
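
A rough sketch of the header-based check, using only the Python standard library. The header names, the trusted network range, the comma-separated group format, and the helpers `is_trusted_proxy` and `identity_from_headers` are all assumptions made for illustration. None of this is existing Btrix code, and a real endpoint would still need to create the user and issue a session token afterwards.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed configuration: networks the front proxy connects from,
# and the (hypothetical) header names the proxy sets after SAML/OIDC login.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]
USER_HEADER = "x-remote-user"
EMAIL_HEADER = "x-remote-email"
GROUPS_HEADER = "x-remote-groups"


@dataclass
class SSOIdentity:
    username: str
    email: str
    groups: List[str] = field(default_factory=list)


def is_trusted_proxy(client_ip: str) -> bool:
    """True only if the direct peer address is inside a trusted network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False
    return any(addr in net for net in TRUSTED_PROXIES)


def identity_from_headers(client_ip: str, headers: dict) -> Optional[SSOIdentity]:
    """Build an identity from proxy headers, or return None if not acceptable.

    Headers are ignored entirely unless the request comes from a trusted
    proxy, since any other client could forge them.
    """
    if not is_trusted_proxy(client_ip):
        return None
    # HTTP header names are case-insensitive, so normalize the keys.
    lowered = {key.lower(): value for key, value in headers.items()}
    username = lowered.get(USER_HEADER, "").strip()
    email = lowered.get(EMAIL_HEADER, "").strip()
    if not username or not email:
        return None
    # Assumes the proxy sends groups as a comma-separated list.
    raw_groups = lowered.get(GROUPS_HEADER, "")
    groups = [g.strip() for g in raw_groups.split(",") if g.strip()]
    return SSOIdentity(username=username, email=email, groups=groups)
```

Note that the check has to use the address of the direct connection, not a value taken from a forwarded header, because a client could forge that as well.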

ikreymer commented 5 months ago

Thank you for working on this. Yes, we would be happy to accept the initial implementation, and will leave comments in the PR.