CLARIAH / clariah-plus

This is the project planning repository for the CLARIAH-PLUS project. It groups all technical documents and discussions pertaining to CLARIAH-PLUS in a central place and should facilitate findability, transparency and project planning, for the project as a whole.

CLARIAH Federated Authentication and 'homeless' user registration #9

Open proycon opened 2 years ago

proycon commented 2 years ago

We already discussed this in #5, but I think we had better continue in a dedicated issue, also to keep it clearly on the radar as an important issue to solve for CLARIAH.

The problem is how to provide access to 'homeless' users in our federated authentication infrastructure, that is to say, users who do not have an account with any participating institute. Since we currently piggy-back on CLARIN's infrastructure, we inherit their solution, which is that such users can register an account at https://user.clarin.eu/user/register.

My criticism of that solution is that it requires a human-in-the-loop for account activation, so there is a delay that is often undesirable from a user perspective. For various non-sensitive services (such as the ones running in Nijmegen), immediate access after registration is desired (the user immediately gets a confirmation mail and can test the service). Of course, any sensitive service can simply not provide authorization to such 'homeless' users and only allow academic/verified ones.

@menzowindhouwer suggested we could propose this solution to CLARIN.eu. @roelandordelman expressed some concerns about 'turning this around' (but I think we may have a bit of miscommunication there?).

pautri commented 2 years ago

If there is a need for immediate access by non-academics or users outside the federation for some services, why not allow Google/Microsoft/Yahoo as OpenID identity providers? I think there's still a benefit to having a "homeless" IdP with human-verified accounts, if only to prevent misuse. For our archive we also still have our own user registration (next to the CLARIN federation), which we verify manually. Probably about a quarter of the user registrations are bogus accounts created by bots that manage to bypass the reCAPTCHA.

proycon commented 2 years ago

Yes, indeed. Third-party OpenID providers are a possible solution, but I have two concerns with that:

1) I don't think we should rely on corporate big-tech players for identity management, from a privacy point of view.
2) It would require users to make an explicit choice for an identity provider (CLARIAH or some 3rd party provider), not a huge deal but it's an extra click.

Point 1 could be solved by running our own in-house non-federated identity provider; it's just that this adds some extra complexity, which I'd like to avoid if we can make the CLARIAH/CLARIN infrastructure cater for all use cases.

proycon commented 2 years ago

Relevant remark by @menzowindhouwer:

In addition to the CLARIAH IdP, we can also check whether https://eduid.nl/ works ... and it (perhaps) requires no manual intervention ... but we do need to check whether it can then get access everywhere ...

mmisworking commented 2 years ago

Would it be an idea, when a homeless user applies for an account, to offer options for what the user is interested in? The system can then inform the user what the next step is, and in the background we send the maintainer of the application a message to arrange the right authorization and to report back to the user.

menzowindhouwer commented 2 years ago

clarin.eu/user/register already has some fields for this: "Which language resource or service are you interested in:" and "Other language resource:" ... perhaps we can tie in with that ...

mmisworking commented 2 years ago

We have decided to create our own IdP service for homeless users. Our assumption is that the number of registrations is low, so the priority is low. Other issues related to this one will be added, and references to those issues will follow later.

proycon commented 2 years ago

Creating our own IdP service sounds like a good solution.

Would it be an idea, when a homeless user applies for an account, to offer options for what the user is interested in? The system can then inform the user what the next step is, and in the background we send the maintainer of the application a message to arrange the right authorization and to report back to the user.

Sure, letting users indicate which services they're interested in is a good idea. Letting the maintainer of a service know is also a good idea, though only needed if there's a need for extra authorization. Let me also reiterate, just to be sure, that the most important aspect of this whole issue is that the registration should be immediate, without any human in the loop. If homeless users have to wait for some approval, they most likely won't bother to try out a tool. We want to be able to offer a whole range of non-sensitive services directly to all users (including homeless ones). Of course the maintainers of the services themselves decide which users to authorize (there may be good reasons for more restricted access and for a human approval stage). I think the simplest approach is to mark all 'homeless' users as being 'homeless' and clearly communicate this to all parties using CLARIAH authentication, so that they can handle further authorization if necessary.
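To make the 'mark them as homeless and let each service decide' idea concrete, here is a minimal sketch in Python. It is only an illustration: the claim name `homeless`, the `ServicePolicy` class and the `is_authorized` function are hypothetical, and nothing here reflects how CLARIAH or CLARIN actually convey such a marker.

```python
from dataclasses import dataclass

# Hypothetical claim name; the thread does not specify how the marker is conveyed.
HOMELESS_CLAIM = "homeless"


@dataclass(frozen=True)
class ServicePolicy:
    """Per-service setting chosen by the service maintainer (hypothetical)."""
    name: str
    allow_homeless: bool  # non-sensitive services can set this to True


def is_authorized(claims: dict, policy: ServicePolicy) -> bool:
    """Return True if a user with these claims may use the service."""
    # A missing marker is treated as homeless/unverified, so mistakes fail closed.
    homeless = bool(claims.get(HOMELESS_CLAIM, True))
    return policy.allow_homeless or not homeless


if __name__ == "__main__":
    demo = ServicePolicy("demo-service", allow_homeless=True)
    restricted = ServicePolicy("restricted-corpus", allow_homeless=False)
    print(is_authorized({HOMELESS_CLAIM: True}, demo))
    print(is_authorized({HOMELESS_CLAIM: True}, restricted))
```

The point of the sketch is that the identity provider only has to label the user; whether a homeless user gets in stays a local decision of each service.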

Our assumption is that number of registrations is low, so the priority is low.

I agree that the number of registrations will be fairly low, but I wouldn't consider the priority of this issue as a whole 'low'. On the contrary, it's rather high: Nijmegen has been waiting for this (for about 11 years already ;) ), as we want to deprecate our old authentication system and move towards proper CLARIAH authentication. Everything is in place to transition, but CLARIAH having no ability to accommodate homeless users (without humans in the loop!) is the show-stopper that keeps us on our old system. I also understood that CLARIAH partner projects like Pure3D require functionality like this.

proycon commented 2 years ago

We had a discussion on this at the Technical Advisory Board just now and things are going another way than I had initially expected: @roelandordelman argued that a 'homeless' user's ability to register without human-in-the-loop is out of scope for CLARIAH. @tvermaut argued that it's not up to CLARIAH to provide an IdP. @menzowindhouwer argued that allowing quick homeless registrations has an impact on the whole authorization infrastructure, as we hitchhike on CLARIN's infrastructure here and many tools already operate on an assumption that only trusted/verified users have access (which we'd undermine if allowing just anyone to register unchecked).

As @menzowindhouwer mentioned earlier, eduid.nl (by SURF) provides a possible solution for homeless users; I verified that this indeed works nicely (without human-in-the-loop). However, they are not part of the CLARIN federation (which makes sense, as that would run into the same arguments as stated above). So it seems that, for services relying on homeless users, CLARIAH as such doesn't provide a solution, but the services may give users a choice of logging in with CLARIAH or with something like eduID (or even something like Google/FB, but I strongly object to using their services because of privacy concerns).

proycon commented 2 years ago

For parties that need their own Identity Provider (e.g. for homeless users), whilst also allowing connectivity with other identity providers (like CLARIAH's or one of the big tech auth providers), @mmisworking just mentioned https://dexidp.io/ as a possible solution. This looks promising as it still allows all services to communicate with a single OpenID Connect endpoint (dex); I was worried about shifting the burden of having to support multiple OpenID Connect providers to the applications themselves.
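A rough sketch of what the single-endpoint setup could mean for an application, in Python. The issuer URL and the connector ids are made up for illustration. As far as I understand the Dex documentation, requesting the `federated:id` scope adds a `federated_claims` object with a `connector_id` to the ID token; treat that as an assumption to verify against your Dex version. The sketch also assumes the token signature and expiry were already verified by a proper OpenID Connect library.

```python
# Hypothetical issuer URL for a Dex instance; not taken from this thread.
DEX_ISSUER = "https://dex.example.org"


def upstream_provider(verified_claims: dict) -> str:
    """Return the id of the upstream connector that authenticated the user.

    Assumes signature and expiry checks were already done elsewhere;
    this function only inspects the claims.
    """
    # The application trusts exactly one issuer, however many
    # identity providers sit behind it.
    if verified_claims.get("iss") != DEX_ISSUER:
        raise ValueError("token was not issued by the expected Dex endpoint")
    federated = verified_claims.get("federated_claims") or {}
    return federated.get("connector_id", "unknown")


if __name__ == "__main__":
    claims = {
        "iss": DEX_ISSUER,
        "sub": "example-subject",
        # "clariah" is a hypothetical connector id.
        "federated_claims": {"connector_id": "clariah", "user_id": "1234"},
    }
    print(upstream_provider(claims))
```

A service could combine the connector id with its own policy, for instance treating users from a local homeless-user connector differently from users who came in through the federation.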

proycon commented 1 year ago

@mmisworking If I understand correctly you were working on a solution for this after all, what is the current state?