geneontology / noctua

Graph-based modeling environment for biology, including prototype editor and services
http://noctua.geneontology.org/
BSD 3-Clause "New" or "Revised" License

Migrate from persona.org to a new authentication system #270

Closed kltm closed 7 years ago

kltm commented 8 years ago

As of right now, Persona.org is going to go away in November of this year. Migrate from persona.org to a new authentication system.

This is the Barista implementation of the migration described in geneontology/go-site#148.

It's far enough out that I've put it as a class A wishlist item.

cmungall commented 8 years ago

Note for now in lieu of full ticket: at some point we will need to allow finer-grained authorization. Not sure if it makes sense to tackle this together.

kltm commented 8 years ago

Well, depending on where we go, that may or may not be part of the same ticket. If you know exactly what type of granularity is needed, it's probably worth starting a new ticket. If it's more along the lines that we'll probably be fine with having something that's either part of or bolted into whatever we choose, it may be best to capture it here.

kltm commented 8 years ago

I talked to somebody who would know, and they said that apparently Mozilla will be unveiling a new login/auth system in the fall. This is awesome and I hope that we can use it too (at some point). However, given the track record of Persona and the fact that nothing official has been announced six months before the plug is to be pulled on Persona (although, I suspect that the direction they're going can be gleaned from here: https://wiki.mozilla.org/Identity/Persona_AAR ), I cannot advocate waiting and seeing, especially so close to the final wire. (Note: https://wiki.mozilla.org/Identity/Persona_Shutdown_Guidelines_for_Reliers , https://groups.google.com/forum/#!forum/mozilla.dev.identity .)

kltm commented 8 years ago

At BOSC, I talked a little with the very patient and helpful @acabunoc, who had a possible solution for us with using the ORCID OAuth2 service, which I was unaware of (even after searching for it specifically a while back). While I am not an OAuth2 fan, every realistic solution proposed so far uses it, so I'm going to roll with it. Moreover, ORCID is a more sensible/neutral provider than GitHub/Facebook etc., and requiring our users to have ORCIDs makes sense anyway.

The only competing login proposal that I find interesting was the (discussed offline at biohackathon 2016) possibility of using wikidata for users. While it would also be OAuth2 (so no win there), the wiki foundation is about as bullet-proof as one can get for longevity. However, that possibility was a bit shaky and could possibly run into hurdles. It's worth keeping an eye on though.

At this point, I think any resources spent on this should be towards the ORCID solution.

Pointers to get started from @acabunoc:

https://github.com/mozillascience/PaperBadger/issues/13
https://github.com/mozillascience/PaperBadger/issues/56
https://github.com/mozillascience/PaperBadger/issues/45

kltm commented 8 years ago

Also, a question/discussion for @nathandunn for later. IIRC, webapollo has a nice flexible auth plugin system to allow various backends/providers. It might be good to model ours a bit more like that moving forward, so future changes aren't quite as traumatic. Maybe after meeting next week? We might even be able to steal/share some code...

kltm commented 8 years ago

Rereading, it seems like @DoctorBud already had this figured out: https://github.com/geneontology/go-site/issues/148#issuecomment-179591554

nathandunn commented 8 years ago

Orcid or what iplant uses (is it orcid?) would work well. We are going to try and add remote_user as well.

nathandunn commented 8 years ago

Right now we are just using web services and username / email password. Actively working on remote_user, which is just Apache / NGINX doing the authentication and then sending a header property: https://wiki.cac.washington.edu/display/infra/About+REMOTE_USER+and+HTTP_REMOTEUSER
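A minimal sketch of what the application side of remote_user could look like in Node, assuming the fronting Apache/NGINX has already authenticated the request and passes the username in a header. The header name `x-remote-user` and the helper name are assumptions for illustration, not anything taken from webapollo or Barista.

```javascript
// Hypothetical helper: read the username that a fronting proxy attached
// to the request after it performed the authentication itself.
// The header name is an assumption; use whatever the proxy is configured to set.
function remoteUserFromHeaders(headers, headerName = 'x-remote-user') {
  // Node's http module lower-cases incoming header names.
  const value = headers[headerName.toLowerCase()];
  if (typeof value !== 'string') {
    return null; // header absent: treat the request as anonymous
  }
  const user = value.trim();
  return user.length > 0 ? user : null;
}

// Usage with a plain headers object, shaped like req.headers:
console.log(remoteUserFromHeaders({ 'x-remote-user': ' alice ' }));
console.log(remoteUserFromHeaders({}));
```

This is only safe when the application is reachable exclusively through the proxy; otherwise any client could set the header itself.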

kltm commented 7 years ago

Well, I've done a bunch of poking around and reading for the transition to OAuth2, and I'm remembering all the reasons that we picked Persona (then BrowserID) in the first place. Very good reasons.

First, about using the ORCID OAuth2 provider specifically (general OAuth2 kvetching below): there is the apparent fact (https://orcid.org/about/membership/comparison) that ORCID only allows a single client credential for a single member with the "Public API". That is actually a huge deal, as it means that a single user cannot secure multiple instances of Barista from their account without paying for an expensive membership, and then only 5. The workaround would be to have all Baristas share the same client credentials, which is pretty icky. And that's not even considering that I might want to write another app that uses ORCID. This seems like a pretty serious problem, but I may be misreading something there.

For OAuth2 in general, the core of the issue is what we want, which is to merely identify whether or not a user is who they say they are, and to not have to worry about the process of doing that or handle their personal information; everything else is taken care of at our end--we need no access to remote APIs or information, as we store everything that we're interested in locally for authorization. It's important to note that there are no security or ToS implications for a service providing this: you are just providing a thumbs up or a thumbs down to the calling client. What OAuth2 was designed for, however, was to monitor and control the use of an API in a very granular way, and this shows a lot in the additional hoops that have to be jumped through to make it work.

The first big difference is that in order for an OAuth2 provider to keep track of the clients, they need to be registered within the system, and the client then has to keep track of secrets acquired during the registration. The overhead with this varies with the provider (I looked at GitHub, Google, and ORCID), but this is pretty well baked in with the providers we were considering. Naturally, this is unnecessary with Persona.

As well, another big necessity for OAuth2 is to have a known return url (or urls) defined more-or-less at registration time. This is to keep malicious users from hijacking credentials to gain access to a controlled API. Again, as "yes" or "no" is not a big deal with an email address, this was avoided by Persona.
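To make the return URL constraint concrete, here is a rough sketch of the kind of check a provider performs: the redirect target sent by the client has to match one that was registered ahead of time. The function name and the example hosts are hypothetical, and real providers differ in how strictly they compare (exact match versus prefix match).

```javascript
// Hypothetical provider-side check: only redirect back to a URL that was
// registered for this client. Exact string matching is assumed here.
function isRegisteredRedirect(registeredUrls, requestedUrl) {
  return registeredUrls.includes(requestedUrl);
}

// One registration, made for a single (made-up) Barista location.
const registered = ['http://barista.example.org/auth/orcid/callback'];

// The registered instance passes the check.
console.log(isRegisteredRedirect(registered, 'http://barista.example.org/auth/orcid/callback'));

// A second instance at another location fails until the registration is updated.
console.log(isRegisteredRedirect(registered, 'http://barista-dev.example.org/auth/orcid/callback'));
```

The second call is the deployment problem in miniature: a new installation at a new location is rejected until somebody edits the registration on the provider's side.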

The summary of this is that after the move away from Persona, all Noctua/Barista rollouts will need to be coordinated with updates to the client registration with the OAuth2 provider, which is a huge irritant from where we're coming from, where Noctua is completely self-contained as far as a new installation is concerned and anybody wanting to run Noctua never needs to contact or register with a third-party.

(There do possibly seem to be cuts of OAuth2 that may not require these, but I couldn't find any from the providers we're interested in.)

There is likely no way that we can avoid having to deal with the heavy hand of OAuth2 for Noctua at this point, even if it is rather disruptive to deployment (keeping track of secrets, registration, etc.), but it may be worthwhile to consider other backup and transition systems.

As a fallback that I think might be worth using (think users who have no ORCID or have forgotten their password), I was looking at passwordless.net, which has several backends (email, sms, etc.) to check to see if somebody is who they say they are, and feels rather like two-factor auth in practice.

kltm commented 7 years ago

Last musing on this. What we really want here is an open authentication system, which is really what OpenID is. Unfortunately, it seems to be in a weird transition period and never had quite the critical uptake that OAuth 2 did (prolly cuz there was no good way to structure monetization). The New OpenID (OpenID Connect) is being built over OAuth2, so there is some hope there. However, as things stand now, it is not offered by ORCID (although often requested: http://support.orcid.org/forums/175591-orcid-ideas-forum/suggestions/6478669-provide-authentication-with-openid-connect ) and Google is in transition. (I did not have much luck with GitHub, as searching for "openid github" just takes you around in circles and the obvious URLs didn't go anywhere useful.) For background, I found this to be a nice read: OpenID vs. pseudo-authentication using OAuth.

nathandunn commented 7 years ago

https://wiki.cyverse.org/wiki/display/start/Using+CyVerse+API+Clients

nathandunn commented 7 years ago

used to be iplantcollaborative . . . lots of API's for lots of things

kltm commented 7 years ago

From the very productive call with @nathandunn, @cmungall, and eric, we'll probably loosen the restriction on ORCID and explore using an abstraction layer to use OAuth2 generally (including Google and GitHub) and other types of providers via Express middleware like passport js (heavy) and express-authentication (light). Note that neither of those mentioned "does" ORCID out of the box, but it could be added as a generic OAuth2 provider. Also note that neither of these seems to offer a passwordless plugin out of the box (grump).
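As a rough illustration of the "generic OAuth2 provider" idea, the sketch below keeps one small table of providers and builds the authorization redirect the same way for each, without any middleware. The endpoint URLs and scopes are my reading of each provider's public documentation and should be checked before use; the field names `clientID` and `callbackURL`, the port, and the function name are assumptions chosen for illustration.

```javascript
// Per-provider settings; everything else about the flow is shared.
// Endpoints and scopes are assumptions to verify against provider docs.
const PROVIDERS = {
  github: { authorizeURL: 'https://github.com/login/oauth/authorize', scope: 'user:email' },
  google: { authorizeURL: 'https://accounts.google.com/o/oauth2/v2/auth', scope: 'openid email' },
  orcid: { authorizeURL: 'https://orcid.org/oauth/authorize', scope: '/authenticate' },
};

// Build the URL the user is redirected to in order to start an
// authorization-code flow with the named provider.
function buildAuthorizeURL(providerName, secrets, state) {
  const provider = PROVIDERS[providerName];
  if (!provider) {
    throw new Error('Unknown provider: ' + providerName);
  }
  const url = new URL(provider.authorizeURL);
  url.searchParams.set('response_type', 'code');
  url.searchParams.set('client_id', secrets.clientID);
  url.searchParams.set('redirect_uri', secrets.callbackURL);
  url.searchParams.set('scope', provider.scope);
  url.searchParams.set('state', state); // opaque value checked again on callback
  return url.toString();
}

// Usage with made-up credentials and a made-up local Barista location:
console.log(buildAuthorizeURL(
  'orcid',
  { clientID: 'APP-EXAMPLE', callbackURL: 'http://localhost:3400/auth/orcid/callback' },
  'random-state-value'
));
```

Adding another provider is then a one-line change to the table, which is roughly what a passport-style strategy layer buys us, minus the token exchange and session handling that it also does.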

kltm commented 7 years ago

Just want to note that the plug pulling date currently seems to be November 30th, 2016 (2016-11-30).

tberardini commented 7 years ago

Joining the discussion as a watcher - just logged into TermGenie using persona and noted the deadline (Nov. 30, 2016) that @kltm cited above. (Do we need another GH ticket that's TG specific for that application? There is this ticket in the Noctua tracker and another one on the go-site tracker.)

kltm commented 7 years ago

I don't want to clutter the Noctua item with a TG discussion if possible (we can use https://github.com/geneontology/go-site/issues/148 for a more general discussion), but TG as it is now will be pretty much toast--it will function "fine", but no longer have a method for logging in. I've talked to @cmungall about this; he can fill you in on the details in either a new ticket or on the main ticket (or elsewhere).

kltm commented 7 years ago

I'm cutting it unfortunately close here. From endless manual testing, I believe that I have now gotten the functionality to be at least as robust as the previous persona system, with many more options. That said, there are many more edge cases now, so something may have slipped through.

The code should be in its final form, but there are still a few more items before closing this out.

kltm commented 7 years ago

As well, I'd like to do a dead code cleaning: remove the old templates for persona and "session", BaristaLogin.js, BaristaLogout.js, *_session.tmpl, etc.

kltm commented 7 years ago

@DoctorBud I'm going to pull this into master now. There was a fair bit of churn in Barista, but less so elsewhere. I think I transitioned the WebPhenote-specific code correctly in noctua.js, but I may have missed something in my testing of the WebPhenote side (as well, it seems to be behind on groups). The main difference is that a "secrets" directory is now needed to populate the possible OAuth2 and local credentials (see BARISTA_LOGIN_SECRETS in the startup.yaml examples).

The files that it will look for are:

As long as at least one is there, it will work, but you may have multiple providers. The first three define the OAuth2 for barista and look like:

---
  clientID: '123'
  clientSecret: '1234'
  callbackURL: 'http://<YOUR BARISTA LOCATION>/auth/github/callback'

The other callback endpoints are "/auth/google/callback" and "/auth/orcid/callback".
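Since a wrong callback path is an easy mistake to make when copying one provider file to another, a small sanity check along these lines might help. It is a sketch only: it works on the already-parsed contents of a provider file, and the function name, the messages, and the example location are hypothetical rather than anything in Barista.

```javascript
// Keys shown in the provider example above.
const REQUIRED_KEYS = ['clientID', 'clientSecret', 'callbackURL'];

// Return a list of problems found in one provider's parsed secrets;
// an empty list means nothing obviously wrong was spotted.
function checkProviderSecrets(provider, secrets) {
  const problems = [];
  for (const key of REQUIRED_KEYS) {
    if (typeof secrets[key] !== 'string' || secrets[key].length === 0) {
      problems.push('missing ' + key);
    }
  }
  const expectedPath = '/auth/' + provider + '/callback';
  if (typeof secrets.callbackURL === 'string' && secrets.callbackURL.length > 0) {
    let pathname = null;
    try {
      pathname = new URL(secrets.callbackURL).pathname;
    } catch (err) {
      problems.push('callbackURL is not a valid URL');
    }
    // endsWith tolerates a deployment under a path prefix.
    if (pathname !== null && !pathname.endsWith(expectedPath)) {
      problems.push('callbackURL path should end with ' + expectedPath);
    }
  }
  return problems;
}

// Usage: a GitHub callback pasted into the ORCID settings gets flagged.
console.log(checkProviderSecrets('orcid', {
  clientID: '123',
  clientSecret: '1234',
  callbackURL: 'http://localhost:3400/auth/github/callback',
}));
```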

local.yaml is special, defining a username/password combo, cross-referenced to the contents of users.yaml:

-
  uri: 'http://orcid.org/0000-0001-8244-1536'
  username: foo
  password: bar

This can be used to give users access without them having to register elsewhere (meetings, demos, quick, etc.).
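The cross-reference between the local credentials and users.yaml can be sketched roughly as below, working on already-parsed data. The shape of a users.yaml entry (just a `uri` plus a made-up `nickname`) and the function name are assumptions for illustration; the actual lookup in Barista may well differ.

```javascript
// Hypothetical lookup: check a username/password pair against the local
// credential entries, then resolve the matching user record by its URI.
function authenticateLocal(localEntries, users, username, password) {
  const entry = localEntries.find(
    (e) => e.username === username && e.password === password
  );
  if (!entry) {
    return null; // unknown username or wrong password
  }
  // A credential whose URI is absent from the user list grants nothing.
  return users.find((u) => u.uri === entry.uri) || null;
}

// Data mirroring the local.yaml example above, plus an assumed user entry.
const localEntries = [
  { uri: 'http://orcid.org/0000-0001-8244-1536', username: 'foo', password: 'bar' },
];
const users = [
  { uri: 'http://orcid.org/0000-0001-8244-1536', nickname: 'example-user' },
];

console.log(authenticateLocal(localEntries, users, 'foo', 'bar'));
console.log(authenticateLocal(localEntries, users, 'foo', 'wrong'));
```

Because authorization stays keyed on the URI, a local login ends up with the same identity that the same person would get through an OAuth2 provider.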

The rest of the changes are just trying to get that all to work right, and get a more generalized token behavior that we can modify later.

kltm commented 7 years ago

Looks like people are logging in and nothing has caught fire--all good here!

kltm commented 7 years ago

@DoctorBud For example, for GitHub, go under: Developer settings -> OAuth applications https://github.com/settings/developers

kltm commented 7 years ago

I'm leaving the email md5 mechanism in barista.js in place for the moment--it may be informative if we implement email authentication using auth0 passwordless authentication. Otherwise, we'd likely have authorize_by_email and uinf_by_md5, and anything depended on by them, on the block.

Note that there is also an implication here for the users.yaml schema for go-site.
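For anybody unfamiliar with the email md5 idea, it amounts to something like the following sketch using Node's built-in crypto module. The trimming and lower-casing step is an assumption on my part, and `emailToMD5` is a hypothetical name; the exact normalization used in barista.js is not shown here and may differ.

```javascript
const nodeCrypto = require('crypto');

// Hypothetical helper: derive a hex md5 digest from an email address so
// that user records can be matched without storing the address itself.
// Normalizing (trim + lower-case) before hashing is an assumption.
function emailToMD5(email) {
  const normalized = email.trim().toLowerCase();
  return nodeCrypto.createHash('md5').update(normalized, 'utf8').digest('hex');
}

// Usage: differently typed forms of one address map to the same digest.
console.log(emailToMD5('Foo@Example.org ') === emailToMD5('foo@example.org'));
```

Note that md5 here only obscures the address for lookup purposes; it is not a protection against somebody who can guess likely addresses.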