DSpace / RestContract

REST Contract for DSpace 7-8
https://wiki.lyrasis.org/display/DSDOC8x/

How to manage the authentication? #10

Closed abollini closed 6 years ago

abollini commented 7 years ago

It is time to start the discussion about client authentication in the new REST API. I see two options:

  1. we can introduce our own login / logout endpoints and rely on a session cookie to manage subsequent calls (we could eventually use Spring Session instead of the servlet container's native sessions)
  2. we can opt for OAuth2. In this case the Angular UI will be just one (the default / first) client app of our REST API. If we go for OAuth2 we can also consider relying on an external auth server such as CAS (https://apereo.github.io/cas/5.1.x/index.html), retiring all our custom implementations of authentication methods (LDAP, Shibboleth, ORCID, etc.). Otherwise we can just implement our own embedded OAuth2 server using the Spring OAuth support; this tutorial is relevant for our use case: https://spring.io/guides/tutorials/spring-boot-oauth2/#_social_login_authserver

I would like to hear other opinions on this; personally I'm in favour of OAuth2, with or without switching to an external auth server (CAS).

hardyoyo commented 7 years ago

There are a few AuthN/Z frameworks we can use, Spring Security and Apache Shiro are two options.

tomdesair commented 7 years ago

OAuth2 is a single sign-on (SSO) framework like SAML (Shibboleth). While I think that adding support for OAuth2 (and enabling social login options) could be a huge added value for DSpace, it should not be enabled by default (just like Shibboleth), as it requires an external API/resource server (https://aaronparecki.com/oauth-2-simplified/), which could make it more difficult to install DSpace.

I think we need to make decisions on the following topics:

mwoodiupui commented 7 years ago

We are not in a position to dictate sites' choice of enterprise authN services. We need to remain adaptable.

At least one implicit mechanism (IP address) really isn't authentication at all; it provides supplemental information based on network topology but does not identify individuals. I doubt it is used the way the other mechanisms are. We need to keep that in mind.

If we focus solely on (say) OAUTH2 then DSpace will need to become an OAUTH2 provider, because we continue to need a way for unaffiliated users to identify themselves.

DSpace 6 does not use Spring Security; it continues to use the home-grown framework that it has had for years. We should look seriously at delegating this work to a more widely used framework, but we need to bear in mind that DSpace does some unusual things, such as letting every mechanism contribute to the session's identities regardless of which mechanism the user selected. (We may want to consider whether this capability is actually used.)

I want to see how annotation-based authorization works in a product which allows end users to invent new roles at any time. Much of the authorization that happens in DSpace has nothing to do with which view you request, and everything to do with the content of that view as determined by the current state of the session. [time passes] Spring Security does look promising, but I'd like to see real examples of doing what DSpace needs -- I've never seen a textbook example that didn't demand a fixed set of roles.

mwoodiupui commented 7 years ago

There is a middleman concept that tends to be left out of AAA discussions: identity. Authorization doesn't depend directly on authentication, but on established identities. Proof of possession of some token (password, certificate, approval of an external service) -- that is: authentication -- establishes an identity between the session and an individual; IP address establishes an identity between the session and a location. DSpace gives us quite a bit of flexibility in using a session's bundle of identities to authorize requested actions. But, anyway, we need to consider whether we are using the right set of concepts to model the problem. (Once again, consider the IP "authentication" mechanism.)
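To make the identity idea concrete, here is a minimal plain-Java sketch. Everything in it is hypothetical (the class name, the "kind:value" identity strings, the hard-coded password and IP prefix) and none of it is DSpace's actual API; it only illustrates that every mechanism contributes identities to the session, and that authorization looks at the resulting bundle rather than at how the user logged in.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not DSpace code: mechanisms contribute identities to a
// session, and authorization is decided from the bundle of identities.
final class IdentityBundleSketch {

    // A mechanism inspects the request and returns the identities it can establish.
    // Identities are plain "kind:value" strings to keep the sketch small.
    interface Mechanism {
        Set<String> contribute(String username, String password, String remoteIp);
    }

    // Proof of possession (a password) ties the session to an individual.
    static final Mechanism PASSWORD = (user, pass, ip) -> {
        if (user != null && "secret".equals(pass)) { // toy check, assumption only
            return Set.of("person:" + user);
        }
        return Set.of();
    };

    // The IP address proves nothing about who the user is; it ties the session to a location.
    static final Mechanism IP_RANGE = (user, pass, ip) -> {
        if (ip != null && ip.startsWith("10.1.")) { // toy "campus" range, assumption only
            return Set.of("location:campus");
        }
        return Set.of();
    };

    // Every mechanism gets to contribute, regardless of which one the user selected.
    static Set<String> establish(List<Mechanism> mechanisms, String user, String pass, String ip) {
        Set<String> bundle = new HashSet<>();
        for (Mechanism m : mechanisms) {
            bundle.addAll(m.contribute(user, pass, ip));
        }
        return bundle;
    }

    // An action is authorized if any identity in the bundle is on the allowed list.
    static boolean authorized(Set<String> bundle, Set<String> allowed) {
        for (String identity : bundle) {
            if (allowed.contains(identity)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Mechanism> mechanisms = List.of(PASSWORD, IP_RANGE);
        Set<String> anonymousOnCampus = establish(mechanisms, null, null, "10.1.2.3");
        System.out.println(anonymousOnCampus);
        System.out.println(authorized(anonymousOnCampus, Set.of("location:campus")));
    }
}
```

In this model an anonymous request from the campus range still carries a location identity, which is why the IP mechanism fits awkwardly under the label "authentication".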

tomdesair commented 7 years ago

@mwoodiupui I do think DSpace 6 REST API uses Spring Security as configured here https://github.com/DSpace/DSpace/blob/dspace-6.1/dspace-rest/src/main/webapp/WEB-INF/security-applicationContext.xml But maybe not in a fully compliant way and not to its full potential (as I'm no expert on Spring Security).

It does indeed use a custom authentication provider (https://github.com/DSpace/DSpace/blob/dspace-6.1/dspace-rest/src/main/java/org/dspace/rest/authentication/DSpaceAuthenticationProvider.java#L48).

How would you like to see this changed?

I mentioned the annotation based authZ for completeness. We tried to use it on an internal project but it's indeed very difficult when dealing with "record-level" authorisations and lists. So I'm not saying we should do this, but that we should consider it. It's not impossible as you could write something like @PreAuthorize("hasPermission(#item, Constants.WRITE)") and implement a PermissionEvaluator which delegates to org.dspace.authorize.service.AuthorizeService#authorizeAction()
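A rough sketch of that delegation idea, written in plain Java so that it compiles without Spring or DSpace on the classpath. The two interfaces below are simplified stand-ins (assumptions, not the real signatures of Spring's PermissionEvaluator or DSpace's AuthorizeService), and the toy policy map is purely illustrative. The point is only the shape: the evaluator turns a "throws when denied" check into the boolean that a hasPermission(...) expression needs, so permissions stay record-level and data-driven instead of depending on a fixed set of roles.

```java
import java.util.Map;
import java.util.Set;

// Plain-Java stand-ins; every type here is a hypothetical simplification.
final class PermissionEvaluatorSketch {

    static final int READ = 0;
    static final int WRITE = 1;

    static class AuthorizeException extends Exception {
        AuthorizeException(String message) {
            super(message);
        }
    }

    // Stand-in for the backend authorization call: throws when the action is denied.
    interface AuthorizeService {
        void authorizeAction(String user, String itemId, int action) throws AuthorizeException;
    }

    // Stand-in for the evaluator that a hasPermission(...) expression would invoke.
    interface PermissionEvaluator {
        boolean hasPermission(String user, Object target, Object permission);
    }

    // Delegating evaluator: converts "throws on denial" into a boolean answer.
    static PermissionEvaluator delegatingTo(AuthorizeService service) {
        return (user, target, permission) -> {
            if (!(target instanceof String) || !(permission instanceof Integer)) {
                return false; // unknown target or permission type: deny
            }
            try {
                service.authorizeAction(user, (String) target, (Integer) permission);
                return true;
            } catch (AuthorizeException e) {
                return false;
            }
        };
    }

    // Toy record-level policy: item id -> users allowed to WRITE; everyone may READ.
    static AuthorizeService policyBacked(Map<String, Set<String>> writers) {
        return (user, itemId, action) -> {
            boolean ok = action == READ
                    || (user != null && writers.getOrDefault(itemId, Set.of()).contains(user));
            if (!ok) {
                throw new AuthorizeException(user + " may not perform action " + action + " on " + itemId);
            }
        };
    }

    public static void main(String[] args) {
        PermissionEvaluator evaluator =
                delegatingTo(policyBacked(Map.of("item-1", Set.of("alice"))));
        System.out.println(evaluator.hasPermission("alice", "item-1", WRITE));
        System.out.println(evaluator.hasPermission("bob", "item-1", WRITE));
    }
}
```

Because the policy is looked up per item at call time, new groups or policies created by end users need no change to the annotated code, which is the concern raised earlier in the thread.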

tdonohue commented 7 years ago

I agree with @mwoodiupui here. We cannot dictate sites' choice of enterprise authN services. Our LDAP and Shibboleth integrations are heavily used. Any authN solution we go with must continue to minimally support password auth, LDAP and Shibboleth (and on the authZ side, the IP-based integration is also quite heavily used). Adding OAuth2 as an option is a good idea, but it shouldn't be the only option (and I'm not yet convinced it's necessary to add into DSpace 7).

I think we need to do a near term cost/benefit analysis of replacing our existing (home grown, custom) AuthN/Z system with a third party solution (like Spring Security) in the DSpace 7 release. While I completely agree with that approach in the long term, I worry that is a massive change to our codebase and could severely impact our DSpace 7 timelines (if we don't have consistent, dedicated development resources to move it forward rapidly).

DSpace 7 is obviously not the end of the roadmap, so we should be concentrating on simply implementing what we need to remain backwards compatible (in terms of AuthN/Z) within the new Angular UI. But that doesn't necessarily mean we need to rebuild the entire AuthN/Z system yet.

[UPDATED TO ADD: ] However, as @tomdesair just noted above (at the same time I wrote this), it seems the DSpace 6 REST API layer uses Spring Security to communicate with our home grown, custom AuthN/Z system in the Java API. This might be an approach we could mimic in the DSpace 7 REST API (until we do a rewrite/replace of the custom AuthN/Z system).

abollini commented 7 years ago

It is no coincidence that the first line of one of the most upvoted Stack Overflow questions, "RESTful Authentication", says "How to handle authentication in a RESTful Client-Server architecture is a matter of debate." https://stackoverflow.com/questions/319530/restful-authentication

That said, I ask you to focus on the authentication aspect; we can discuss AuthZ in another thread, and probably not in the REST contract, as those details will be (mostly) hidden from the client. I have opened this question in the REST contract project because, whatever we decide to use for the AuthN implementation, we need to provide a way for the REST client(s) to supply credentials or connect to an authenticated session. The linked Stack Overflow question is a bit old, but the options listed there are still valid and, as far as I know, the only alternatives we have.

I don't think we need to discuss which realms we should support, as we want to be as backward compatible as possible (LDAP, Shibboleth, X509, etc.), but please note that this doesn't limit our options in any way: we can have multiple login methods/realms regardless of whether we use custom endpoints or OAuth2. If we opt for OAuth2 with an "external" authN server, we can still support multiple methods. BTW, CAS supports many more auth methods than we do out of the box, and it can be embedded in the webapp or supplied in the same way that we include Solr. In the long run I believe it will be easier and more effective for us to rely on an external, widely used, enterprise-grade authN system like CAS, but as this will require some extra effort and testing I would prefer to keep it out of scope for DSpace 7 unless someone volunteers to take it on.

As I said, Spring makes it really easy to implement an OAuth2 server. It is more or less a Maven dependency and some configuration (see https://spring.io/guides/tutorials/spring-boot-oauth2/#_social_login_authserver). This example shows how to implement OAuth2 in the same Spring MVC app as the REST API, delegating user authN to external OAuth2 providers, but the delegate could equally be an LDAP lookup, a DB lookup, Shibboleth or whatever. This means that we can expose our existing AuthN infrastructure over a standard protocol (OAuth2). Moreover, in future we can benefit from the additional features of OAuth2, such as the ability to manage scopes to limit the authZ granted to the application by the user. For instance, it could be useful to allow a third-party plugin to interact with the platform without the ability to invoke administrative features on behalf of the (admin) user. As the majority of RESTful APIs (Twitter, Facebook, Google, LinkedIn, ORCID, etc.) implement OAuth2, my understanding is that it will also be easier for clients other than the default Angular UI, written in different languages, to implement such a mechanism. Another example of an app built with AngularJS, Spring MVC REST and OAuth2 can be found here: http://www.baeldung.com/rest-api-spring-oauth2-angularjs
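The scope idea can be illustrated with a tiny plain-Java sketch. The scope names and the class are hypothetical, and this is not how Spring Security OAuth implements scope checks; it only shows the rule that a client acting for a user can do at most what the user may do and what the user granted to that client.

```java
import java.util.Set;

// Hypothetical illustration of OAuth2-style scopes limiting a client's access.
final class ScopeCheckSketch {

    // Made-up scope names, assumptions for this sketch only.
    static final String ITEMS_READ = "items:read";
    static final String ADMIN = "admin";

    // A request is allowed only if the user holds the privilege AND the user
    // granted that same scope to the client application.
    static boolean allowed(Set<String> userPrivileges, Set<String> grantedScopes, String required) {
        return userPrivileges.contains(required) && grantedScopes.contains(required);
    }

    public static void main(String[] args) {
        Set<String> adminUser = Set.of(ITEMS_READ, ADMIN);
        Set<String> pluginGrant = Set.of(ITEMS_READ); // the admin approved read access only
        System.out.println(allowed(adminUser, pluginGrant, ITEMS_READ));
        System.out.println(allowed(adminUser, pluginGrant, ADMIN));
    }
}
```

Here the third-party plugin can read items on behalf of the administrator but cannot invoke administrative features, even though the user behind it is an administrator.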

About IPAuthentication: it is not really an authentication method, just a workaround in our current model to introduce some automation in the authZ part. It can (and must) be maintained, but it should not drive any decision about the authN part, since it should be refactored in a less coupled way.

Finally, in reply to @tomdesair: the only way to implement a fully stateless secure application is to send the credentials with every request, so that the user is logged in each time. All the other mechanisms rely on some secure token that is more or less equivalent to a session cookie. Indeed, session cookies also scale horizontally if they are linked to information stored in a shareable environment such as a DB, memcache, Redis, etc. For a "simple HTTP session" this can be achieved with a framework like Spring Session (http://projects.spring.io/spring-session/), which allows the underlying storage to be switched easily: http://docs.spring.io/spring-session/docs/1.3.1.RELEASE/reference/html5/guides/rest.html

abollini commented 7 years ago

Just to keep the two points separate: regarding the implementation, my preference is to move to Spring Security, starting from the DSpace 6 approach of wrapping the authN framework and then introducing the authorization annotation approach as much as possible. BTW, the OAuth2 server implementation is part of the Spring Security project: http://projects.spring.io/spring-security-oauth/docs/oauth2.html

tdonohue commented 7 years ago

Hi @abollini: Glad to hear we are in agreement overall. It seems I misunderstood your initial proposal. :)

I will readily admit, I'm not well versed in OAuth2 myself yet (and I need to find time to dig deeper here). But, assuming we can use OAuth2 as the authentication mechanism and still easily support (via configuration) the existing password authentication, LDAP, Shibboleth, etc, then I'm OK with this direction. That said, I'd like to better understand the extent of this change and how it affects the backend Java API (if at all), as I don't want this to negatively affect our DSpace 7 timelines. But, what you've described seems quite reasonable, and doesn't seem overly complex (at a very quick glance).

I also agree with the approach of supporting AuthN in the DSpace 7 REST API similar to how it is being supported in the current DSpace 6 REST API (by wrapping our custom AuthN framework with Spring Security).

tomdesair commented 7 years ago

I don't claim to be an OAuth 2 expert but from https://oauth.net/articles/authentication/:

"OAuth 2.0 is not an authentication protocol."

but

"The OAuth 2.0 authorization framework enables a third-party application to obtain limited access to an HTTP service."

So, similar to Shibboleth, it would enable DSpace to request access to the user details held by an OAuth2 identity provider (like CAS, Google or Facebook). If the user approves that access request ("grants authorization"), like in this screen, DSpace will receive an OAuth2 access token that will enable it to fetch the user details (the "Resource") from the OAuth2 identity provider (Facebook in the case of the screenshot). That would allow DSpace to create or look up the ePerson account and authenticate the user. A diagram of this can be found here (taken from here).

I think that OAuth 2.0 should not be used to maintain a "session" between the browser of a user and the backend and authenticate each and every REST API call a user makes (and I think that this is the point the author makes on https://oauth.net/articles/authentication/).

Stateless sessions can be implemented using JWT tokens (which in some OAuth2 implementations are also used to pass authoriZation access tokens between the app and the identity provider... confusing, I know). However, a secure JWT implementation is not trivial, which is why Atmire is doing some research on this to see if it can be used in DSpace.

The advantage of stateless sessions (using JWT tokens) is that you do not need any additional backend, like memcache or Redis, to maintain session details in order to scale horizontally. This would keep the installation of DSpace simple.
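As a rough illustration of why no shared session store is needed, here is a minimal sketch of a signed stateless token using only the Java standard library. It is deliberately not a real JWT: the "subject.expiry.signature" layout, the class name and the method names are assumptions made for this example, and a production implementation should use a vetted JWT library. Any node that holds the same secret key can verify the token on its own.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Optional;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical, simplified stateless token: base64url(subject).expiry.base64url(HMAC).
final class StatelessTokenSketch {
    private static final Base64.Encoder ENC = Base64.getUrlEncoder().withoutPadding();
    private static final Base64.Decoder DEC = Base64.getUrlDecoder();

    private final byte[] key; // shared secret, known to every backend node

    StatelessTokenSketch(byte[] key) {
        this.key = key.clone();
    }

    // Issue a token that names the subject and carries its own expiry time.
    String issue(String subject, long expiresAtEpochSeconds) {
        String payload = ENC.encodeToString(subject.getBytes(StandardCharsets.UTF_8))
                + "." + expiresAtEpochSeconds;
        return payload + "." + ENC.encodeToString(sign(payload));
    }

    // Verify signature and expiry; returns the subject only if both checks pass.
    Optional<String> verify(String token, long nowEpochSeconds) {
        String[] parts = token.split("\\.");
        if (parts.length != 3) {
            return Optional.empty();
        }
        try {
            byte[] expected = sign(parts[0] + "." + parts[1]);
            byte[] presented = DEC.decode(parts[2]);
            if (!MessageDigest.isEqual(expected, presented)) { // constant-time compare
                return Optional.empty();
            }
            if (Long.parseLong(parts[1]) <= nowEpochSeconds) {
                return Optional.empty(); // expired
            }
            return Optional.of(new String(DEC.decode(parts[0]), StandardCharsets.UTF_8));
        } catch (IllegalArgumentException e) { // malformed Base64 or expiry number
            return Optional.empty();
        }
    }

    private byte[] sign(String data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        StatelessTokenSketch tokens =
                new StatelessTokenSketch("demo-key-change-me-0123456789abcdef".getBytes(StandardCharsets.UTF_8));
        String token = tokens.issue("alice", 2000L);
        System.out.println(tokens.verify(token, 1000L).isPresent());
    }
}
```

The server keeps no per-session record here; the trade-off, raised elsewhere in this thread, is that a token issued this way cannot be revoked before it expires without adding some server-side state.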

AlexanderS commented 7 years ago

I think OAuth authentication has much in common with default HTTP session authentication; only client support may vary.

I will try to describe the possible authentication flows in my own words, to ensure that we are all talking about roughly the same thing.

The first part is always the same. The user sends a request to the authentication endpoint with a username/password in the simple case (or with a valid Shibboleth session or other data). The authentication endpoint validates the data and establishes the identity of the user.

The question is how to keep the identity of the user for subsequent requests, so that the user does not need to supply the login data with each request.

HTTP Session

HTTP Header Token (aka. OAuth)

Stateless Tokens

(Disclaimer: I do not know JWT, but I worked with a similar system with simple HMAC signatures.)

abollini commented 7 years ago

@tomdesair you have pointed to a great resource: https://oauth.net/articles/authentication/ OAuth2 is not an authentication protocol, but the goal of the article is:

"what we're here to talk about today is specifically authentication built on top of OAuth 2.0, what can go wrong, and how it can be made secure and delicious."

Next week I hope to be able to do some testing and maybe prototype an implementation, to check whether it works as easily as expected in our scenario. BTW, as you also note, the OpenID Connect protocol, which is built on top of OAuth2, uses JWT to sign the ID token (id_token), but the server also keeps a list of issued tokens, any of which it can revoke independently at any time to provide extra security. JWT alone needs to be used in the right (non-trivial) way (see https://auth0.com/blog/critical-vulnerabilities-in-json-web-token-libraries/), and without extra security (i.e. some state on the server) anyone who spoofs a JWT token can claim to be the user.
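A minimal sketch of the "some state on the server" idea, in plain Java. The class and method names are hypothetical and this is not taken from any OpenID Connect implementation: the server simply records the ids of the tokens it has issued and accepts a token only while its id is still listed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry of issued tokens, enabling revocation at any time.
final class TokenRegistrySketch {

    // token id -> expiry (epoch seconds) for tokens that are still accepted
    private final Map<String, Long> active = new ConcurrentHashMap<>();

    // Record a token id when the token is issued.
    void register(String tokenId, long expiresAtEpochSeconds) {
        active.put(tokenId, expiresAtEpochSeconds);
    }

    // Revoke a token before its natural expiry, e.g. on logout.
    void revoke(String tokenId) {
        active.remove(tokenId);
    }

    // A token is accepted only if it is registered, not revoked and not expired.
    boolean isAccepted(String tokenId, long nowEpochSeconds) {
        Long expiresAt = active.get(tokenId);
        if (expiresAt == null) {
            return false; // never issued, or revoked
        }
        if (expiresAt <= nowEpochSeconds) {
            active.remove(tokenId, expiresAt); // drop the stale entry
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        TokenRegistrySketch registry = new TokenRegistrySketch();
        registry.register("token-1", 2000L);
        System.out.println(registry.isAccepted("token-1", 1000L));
        registry.revoke("token-1");
        System.out.println(registry.isAccepted("token-1", 1000L));
    }
}
```

Note the cost: with several backend nodes this map has to live in a store they all share, which is exactly the extra state that a purely stateless token avoids.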

tomdesair commented 7 years ago

That's why we're still doing research on a JWT-managed session solution ;-) And I think it is possible to build a secure solution (based on JWT) that has security features like revocation and IP match checking while still only using stateless backend REST API nodes and NO additional configuration.

abollini commented 6 years ago

Since @artlowel reported that work on this is ongoing (thanks!), I have created a ticket in JIRA: https://jira.duraspace.org/browse/DS-3718

tomdesair commented 6 years ago

We have a proof-of-concept for the stateless authentication ready here: https://github.com/DSpace/DSpace/pull/1873

abollini commented 6 years ago

I'm closing this question, as in the latest DSpace 7 meetings it looked to me that a final consensus was reached around the concept of stateless session authentication based on JWT.