argilla-io / argilla

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
https://docs.argilla.io
Apache License 2.0

Feature - SSO keycloak #5691

Open paulbauriegel opened 6 days ago

paulbauriegel commented 6 days ago

Introduces a new SSO option using Keycloak

Enables a different SSO provider next to HuggingFace SSO

Type of change

How Has This Been Tested

Local build of front-end & backend. Keycloak deployment as described in the docs.

Checklist

How to test & use it

frascuchon commented 4 days ago

@paulbauriegel Thanks for this contribution. Last week I started a code refactoring to simplify the OAuth provider configuration and integrate better with the social auth package. My changes alter the design a bit. It would be nice if you could adapt yours based on this PR. If not, we can combine them later.

paulbauriegel commented 4 days ago

@frascuchon Ok, let me have a look. I will try to understand the changes.

Since you are working on the oauth, it would be nice to be able to use the roles from the oauth audience to have oauth users access specific workspaces based on those roles. I wanted to contribute this in a later part.

frascuchon commented 4 days ago

Great @paulbauriegel! I have some doubts about how to match the OAuth audience with the Argilla roles. I would love to hear your thoughts on that.

frascuchon commented 3 days ago

Hi @paulbauriegel ,

Here is the docs section related to the refactor PR. It would be nice if you could take a look and give some feedback. It may also be useful for understanding the refactoring approach.

paulbauriegel commented 3 days ago

Thank you, yes I will have a look. Just to set expectations, I will only have some time later this week :-)

paulbauriegel commented 18 hours ago

@frascuchon I looked through your code. It's fairly clear how to integrate a new SSO provider. Integrating self-hosted SSO providers such as Keycloak, where e.g. the authorization_url depends on the configuration, creates a small problem: if you forget to set the correct environment variables or misspell them, self._authorization_endpoint = self._backend.authorization_url() resolves to None. It's hard to debug such issues without knowing social core well, so it might be helpful to have a check in place there.
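A minimal sketch of what such a check could look like, assuming the provider resolves its endpoints once at construction time. The names `OAuthConfigurationError`, `require_endpoint`, and `FakeBackend` are hypothetical and not part of Argilla or social core; `FakeBackend` only imitates a backend whose `authorization_url()` returns `None` when the setting is missing.

```python
class OAuthConfigurationError(RuntimeError):
    """Raised when a provider endpoint cannot be resolved from the configuration."""


def require_endpoint(name, value, hint=""):
    """Return `value` if it is a non-empty string, otherwise fail with a readable error.

    Hypothetical helper: it turns a silent `None` into an error that names
    the endpoint, so a missing or misspelled setting is easy to spot.
    """
    if value is None or not str(value).strip():
        message = f"OAuth endpoint '{name}' resolved to {value!r}."
        if hint:
            message += f" {hint}"
        raise OAuthConfigurationError(message)
    return value


class FakeBackend:
    """Stand-in for a backend object (assumption, used for illustration only)."""

    def __init__(self, settings):
        self._settings = settings

    def authorization_url(self):
        # Returns None when the setting was never provided.
        return self._settings.get("AUTHORIZATION_URL")


# Usage: wrap the lookup so a misconfiguration fails early.
backend = FakeBackend({})  # nothing configured
try:
    require_endpoint(
        "authorization_url",
        backend.authorization_url(),
        hint="Check the provider's environment variables for typos.",
    )
except OAuthConfigurationError as error:
    print(error)
```

Failing at provider construction rather than at the first login attempt keeps the error close to the configuration mistake that caused it.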

Generally speaking, I would rather configure such settings in the oauth.yaml than via env variables, but that might be a bit more complicated now: since there is a common Provider class, there are no optional extra settings for an SSO that requires more of them.
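One way the optional extra settings could be modelled is sketched below, purely as an illustration. The `ProviderConfig` class, its field names, and the example keys are assumptions, not the actual oauth.yaml schema or Argilla's Provider class. The input is a plain dict, as a YAML loader would return for one provider entry.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Settings every provider is assumed to need (hypothetical field names).
REQUIRED_KEYS = ("name", "client_id", "client_secret")


@dataclass
class ProviderConfig:
    """Hypothetical provider entry: common fields plus free-form extras."""

    name: str
    client_id: str
    client_secret: str
    # Anything provider-specific (e.g. self-hosted endpoint URLs) lands here.
    extra: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "ProviderConfig":
        missing = [key for key in REQUIRED_KEYS if key not in raw]
        if missing:
            raise ValueError(f"Missing provider settings: {missing}")
        extra = {k: v for k, v in raw.items() if k not in REQUIRED_KEYS}
        return cls(
            name=raw["name"],
            client_id=raw["client_id"],
            client_secret=raw["client_secret"],
            extra=extra,
        )


# Example entry; the keys and URL are made up for illustration.
entry = {
    "name": "keycloak",
    "client_id": "argilla",
    "client_secret": "change-me",
    "authorization_url": "https://sso.example.test/auth",
}
config = ProviderConfig.from_dict(entry)
print(config.extra)
```

Keeping the extras in a separate mapping lets a common provider class stay unchanged while individual providers read the additional keys they need.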

I will open a new MR based on the new code tomorrow.