GeoNode / geonode

GeoNode is an open source platform that facilitates the creation, sharing, and collaborative use of geospatial data.
https://geonode.org/

GNIP: GeoServer A&A Improvements #2374

Closed afabiani closed 7 years ago

afabiani commented 8 years ago

Overview

This proposal aims to reuse, as much as possible, the GeoServer components, plugins and capabilities already in place in order to implement the A&A (authentication and authorization) layer for GeoNode resources.

It does so by refactoring the security integration between GeoServer and GeoNode, reusing available GeoServer capabilities where possible (either from the core or from existing plugins) and creating extensions that would live in the GeoServer codebase where needed. The goal is to improve the maintainability and compatibility of the integration between GeoServer and GeoNode by having GeoNode rely as much as possible on standard GeoServer plugins.

The basic idea is the following:

  1. Authentication

    The proposal is to enable GeoNode to become an OpenID Connect Provider and GeoServer to become an OpenID Connect Consumer. The OpenID Connect protocol uses tokens to assert the users’ identities. This would allow us to avoid the obsolete cookie-based mechanism.

  2. Authorization

    The Authorization rules for the resources created by GeoNode must already be configured in the GeoServer Catalog and must be associated with the users’ roles. GeoNode Administrators do not have to configure them manually: GeoNode can do this automatically through the GeoFence Embedded plugin, which overrides and enhances the GeoServer Authorization subsystem and exposes a REST API for remote control of the authorization rules on the catalog. Every time a GeoNode user changes the permissions of a GeoNode Resource published in GeoServer, GeoNode should automatically update the GeoServer access rules via REST calls.

It is worth noting that the GeoNode command-line APIs should also be updated so that they can synchronize and clean up the permissions on GeoServer whenever an issue occurs and/or the Authorization rules are out of sync.

Proposed by

Alessio Fabiani (GeoSolutions) Emanuele Tajariol (GeoSolutions)

Assigned to release

None yet.

Motivation

As briefly introduced in the “Overview”, the current implementation of the GeoNode/GeoServer security relies on a heavy customization of the GeoServer “Resource Access Manager”.

The “geoserver-geonode-ext” (https://github.com/GeoNode/geoserver-geonode-ext) GeoNode subproject contains the current authentication and authorization implementations of the custom GeoServer modules for GeoNode. Without going into very technical details, the currently implemented A&A protocol works as follows:

  1. A “GeoNodeCookieProcessingFilter” allows GeoServer to retrieve the credentials from the HTTP cookies generated on the GeoNode side. A “GeoNodeSessionAuthToken” is generated whenever the cookie is recognized as valid; an “AnonymousAuthToken” is returned otherwise.
  2. The “GeoNodeAuthenticationProvider” generates an in-memory “Authentication” object and injects it into the GeoServer “SecurityContextHolder”, triggering the login of a GeoNode user who is not defined or stored in the GeoServer catalog.
  3. The “GeoNodeDataAccessManager” takes care of the Authorization of the resources. Currently it may be configured using two different types of “GeoNodeSecurityClient”s:

    a. “DefaultSecurityClient”: this is the default implementation. It requests the resource authorization rules from a remote HTTP-REST service provided by GeoNode itself. The service replies with a JSON list of available Layers along with the “READ-WRITE” permissions for the authenticated user.

    b. “DatabaseSecurityClient”: this is an alternative implementation which reads the permissions directly from the GeoNode DB instead of calling the HTTP-REST service.

The two main issues with this approach are the following:

  1. Relying on the cookie is risky. It forces the GeoServer instance to live under the same domain as GeoNode, which can be a severe constraint for deployments with a different configuration. The coupling is also subtle: consider a server published both under “www.mydomain.my” and under “mydomain.my”. This is perfectly possible and legitimate, but for the cookie to work with one server name you must permanently force an HTTP rewrite to that name.
  2. The “DefaultSecurityClient” requests the whole list of layers (along with their permissions) from GeoNode via HTTP-REST. Even though a cache is implemented on the GeoServer side, this approach may become quite heavy as the number of resources grows. The “DatabaseSecurityClient”, on the other hand, is a good alternative only if:

    a. GeoNode has been configured to use a DB as backend

    b. All the GeoServer instances can directly access the GeoNode DB

The solution proposed here aims at:

  1. Removing GeoNode custom implementations of the GeoServer security mechanism
  2. Avoiding usage of cookies as a mechanism to encapsulate the session
  3. Improving performance by not having GeoServer request the whole list of accessible resources from GeoNode too frequently

Proposal

The proposal is split into two main topics: “Authentication” and “Authorization”.

Authentication

The goal of the “Authentication” section is to allow GeoServer to recognize as valid a user who has already logged into GeoNode, providing a kind of SSO mechanism between the two applications. The sole assumption we make is that GeoServer knows about GeoNode and can reach it via HTTP/HTTPS. We also don’t want GeoServer to rely on cookies.

Moreover, as part of this proposal, we intend to improve the existing GeoNode authentication mechanism, allowing it to recognize users who are trusted by other systems such as Google or Facebook. In other words, this proposal puts forward the “OpenID Connect” protocol as the identity recognition system between GeoNode and GeoServer, replacing the cookie-based one used by the current implementation.

GeoNode as OpenID Connect Provider (OP) and Relying Party (RP)

OpenID Connect is an identity framework built on the OAuth 2.0 protocol; it extends the OAuth 2.0 authorization processes to implement its authentication mechanism. OpenID Connect adds a discovery mechanism allowing users to use an external trusted authority as an identity provider. From another point of view, this can be seen as a single sign-on (SSO) system.

OAuth 2.0 is an authorization framework that lets clients access a restricted resource on behalf of the resource owner. OpenID Connect allows clients to verify the identity of users based on the authentication performed by an authorization server.

Several Django extensions and apps currently exist that can turn a Django-based web app like GeoNode into either an OpenID Connect Consumer (or Relying Party, RP) or a Provider (OP). Within the scope of GeoNode, our interest is to allow it to become both an OP and an RP.

As an RP, GeoNode will be able to delegate authentication to external trusted identity providers (OPs). This would make it easier for new users to create an account on GeoNode, without having to enter their profile information again.

As an OP, GeoNode will be able to act as a trusted identity provider, thus allowing the system to work in an isolated environment and/or allowing GeoNode to authenticate private users managed by the local Django auth subsystem.

GeoServer as OpenID Connect Relying Party (RP)

The GeoServer auth subsystem is based on Spring Security. The proposal is to enable GeoServer to be an OpenID Connect Relying Party (RP) through ad hoc Spring Security AuthProviders plugged into the system.

This would allow GeoServer to retrieve an end user’s identity directly from the OpenID Connect Provider (OP). With GeoNode acting as an OP, the mechanism avoids the use of cookies, relying instead on the OpenID Connect secure protocol.

How the OpenID Connect Protocol works:

image

  1. The relying party sends the request to the OpenID provider to authenticate the end user
  2. The OpenID provider authenticates the user
  3. The OpenID provider sends the ID token and access token to the relying party
  4. The relying party sends a request to the user info endpoint with the access token received from OpenID provider
  5. The user info endpoint returns the claims.
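
As a rough illustration of steps 1 and 4 above, the sketch below builds the authorization request a relying party would redirect the end user to, and the header it would send to the user info endpoint. The endpoint URL, client identifier and function names are hypothetical placeholders, not part of GeoNode or GeoServer; only the request parameters (`response_type`, `scope=openid`, `client_id`, `redirect_uri`, `state`) and the `Bearer` header follow the OpenID Connect and OAuth 2.0 specifications.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical authorization endpoint exposed by GeoNode acting as the OP.
GEONODE_AUTHORIZE_URL = "https://geonode.example.org/o/authorize/"


def build_authorization_url(client_id, redirect_uri, state=None):
    """Step 1: URL the relying party redirects the end user to."""
    params = {
        "response_type": "code",   # authorization code flow
        "scope": "openid",         # marks the request as OpenID Connect
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        # Random value echoed back by the OP, used to tie the response
        # to this request.
        "state": state or secrets.token_urlsafe(16),
    }
    return GEONODE_AUTHORIZE_URL + "?" + urlencode(params)


def build_userinfo_headers(access_token):
    """Step 4: headers for the request to the user info endpoint."""
    return {"Authorization": "Bearer " + access_token}
```

Steps 2, 3 and 5 happen on the provider side or in the HTTP exchange itself and are left out of the sketch.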

GeoNode UserGroupService

Allowing GeoServer to use a Spring AuthProvider in order to act as an OpenID Connect RP is not sufficient, though, to map a user identity to its roles.

On the GeoServer side we will still need to implement a “UserRoleService” able to talk to GeoNode and transform the tokens into a User Principal to be used within the GeoServer Security subsystem itself.

We can envisage two possible implementations of the UserRoleService, to be chosen according to the GeoNode administrator’s needs:

  1. A REST-based one, which would talk to GeoNode via REST to get the current User along with the list of its Roles. There is currently no such plugin in GeoServer, but we could code it in a way that it would be accepted as a standard extension and hence live in the GeoServer codebase.
  2. A JDBC-based one, which would access the GeoNode DBMS directly and find a User, along with the list of its Roles, associated with a valid authkey. If the User cannot be found given an authkey, the session is not valid. We would reuse the existing JDBC UserRoleService and configure it properly.

Solution 1 is more generic and would work even if there is no shared database between GeoServer and GeoNode. Solution 2 should be faster to implement (it should only require configuration work) but it requires a shared database between GeoServer and GeoNode. Solution 1 could be provided in addition to Solution 2 to support more sophisticated setups.

On the GeoNode side:

  1. Each GeoNode user “userxxx” belongs by default to the GeoNode group “USERXXX”, which is translated to the GeoServer role “ROLE_USERXXX”. There is no need to physically create such groups in GeoNode; it is sufficient that the GeoNode Authentication service automatically attaches the role “ROLE_USERXXX” to its responses.
  2. The GeoNode Authentication service should return the list of groups associated with the user as roles, with a prefix, for instance “ROLE_GROUPNAME”. This is not a constraint, but it is good practice for Spring Security.
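
The naming convention in the two points above can be sketched as a small helper. The function name is hypothetical, and upper-casing the group names is an assumption taken from the “ROLE_USERXXX” and “ROLE_GROUPNAME” examples; it is not an existing GeoNode API.

```python
def roles_for_user(username, group_names):
    """Return the GeoServer role names for a GeoNode user.

    The implicit per-user role comes first, followed by one role per
    GeoNode group, all carrying the "ROLE_" prefix.
    """
    roles = ["ROLE_" + username.upper()]
    for name in group_names:
        role = "ROLE_" + name.upper()
        if role not in roles:  # avoid duplicates, keep order stable
            roles.append(role)
    return roles
```

For example, `roles_for_user("userxxx", ["editors"])` yields `["ROLE_USERXXX", "ROLE_EDITORS"]` without any “USERXXX” group having to exist in GeoNode.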

Authorization

This section focuses on the process of allowing GeoServer to associate the access permissions (READ/WRITE) for a specific Resource to the requesting end user, once the authentication has been performed as described before.

GeoServer Authorization is based on Roles only; therefore, for each authenticated user we also need to know:

  1. The Roles associated to a valid User session
  2. The access permissions associated to a GeoServer Resource

The Authentication mechanism above allows GeoServer to get information about the User and their Roles, which addresses point 1.

Regarding point 2, the proposal is to use the GeoFence Embedded plugin on GeoServer and have GeoNode manage access policies via REST calls through the GeoFence REST interface. GeoFence is a Java web application that provides an advanced authentication/authorization engine for GeoServer using the interface described in GSIP 57.

GeoFence has its own rules database for the management of Authorization rules, and overrides the standard GeoServer security management system by implementing a sophisticated Resource Access Manager.

Using this plugin has several advantages:

  1. The Authorization rules are fine-grained. GeoFence handles security rules in a way similar to iptables, and allows security constraints to be defined even on sub-regions and attributes of layers.
  2. GeoFence exposes a REST interface to its internal rule database, allowing external managers to update the security constraints programmatically.
  3. GeoFence implements an internal caching mechanism which considerably improves performance under load.
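
To make the iptables analogy in point 1 concrete, here is a deliberately simplified sketch of priority-ordered, first-match-wins rule evaluation. It is an illustration only and not GeoFence's actual algorithm or data model: real GeoFence rules carry more fields (workspace, service, request, area limits, attribute restrictions), and the `Rule` class and function names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    priority: int                # lower values are evaluated first
    access: str                  # "ALLOW" or "DENY"
    role: Optional[str] = None   # None matches any role
    layer: Optional[str] = None  # None matches any layer


def matches(rule, roles, layer):
    """True when the rule applies to a request made with these roles."""
    role_ok = rule.role is None or rule.role in roles
    layer_ok = rule.layer is None or rule.layer == layer
    return role_ok and layer_ok


def evaluate(rules, roles, layer):
    """Return the access of the first matching rule, in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if matches(rule, roles, layer):
            return rule.access
    return "DENY"  # assumption: deny when no rule matches
```

With a specific `ALLOW` rule at priority 0 and a catch-all `DENY` at priority 10, members of the allowed role reach the layer and everyone else is rejected.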

GeoFence can be run either as a standalone web application or embedded in GeoServer. As a standalone webapp, a single GeoFence instance may handle authorization and authentication for one or more GeoServer instances. Such GeoServer instances may be configured as a cluster or not; it makes no difference to GeoFence. The embedded architecture will only provide authorization services to the local GeoServer, and will use the users and roles information provided by the local GeoServer. More details on GeoFence installation and capabilities can be found at the following URL: http://docs.geoserver.org/latest/en/user/community/geofence-server/index.html

GeoNode interaction with GeoFence

Our proposal is to allow GeoNode to push and manage Authorization rules on GeoServer through the GeoFence REST API, acting as an administrator for GeoServer. GeoNode should be improved so that it can properly configure the GeoFence rules whenever needed, i.e. whenever the permissions of a Resource are updated.

The same logic must also be implemented in the GeoNode command line tools “updatelayers” and “importlayers” in order to allow an Administrator to re-sync/fix the Layers’ permissions on both the GeoNode and GeoServer sides.

A command for recreating the whole GeoFence rule set should also be implemented, in case it gets out of sync with GeoNode.
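
A minimal sketch of the GeoNode side of this interaction is given below, under stated assumptions: the rule field names (`priority`, `roleName`, `workspace`, `layer`, `access`) are modeled on GeoFence rules but should be checked against the GeoFence REST documentation, and both helper functions are hypothetical. The code only computes which rules a re-sync would have to create or delete; the actual REST calls are left out.

```python
def rules_for_layer(workspace, layer, read_roles, start_priority=0):
    """Build one ALLOW rule per role that may read the layer."""
    rules = []
    for offset, role in enumerate(read_roles):
        rules.append({
            "priority": start_priority + offset,
            "roleName": role,
            "workspace": workspace,
            "layer": layer,
            "access": "ALLOW",
        })
    return rules


def rule_key(rule):
    """Identity of a rule, ignoring its priority."""
    return (rule["roleName"], rule["workspace"], rule["layer"], rule["access"])


def diff_rules(desired, existing):
    """Return (to_add, to_remove) needed to turn `existing` into `desired`."""
    desired_keys = {rule_key(r) for r in desired}
    existing_keys = {rule_key(r) for r in existing}
    to_add = [r for r in desired if rule_key(r) not in existing_keys]
    to_remove = [r for r in existing if rule_key(r) not in desired_keys]
    return to_add, to_remove
```

A re-sync command could compute `desired` from the GeoNode permissions, fetch `existing` from GeoFence, and then issue one REST call per entry in `to_add` and `to_remove`.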

Clustering and caching

Having one or more instances of GeoNode control a cluster of GeoServer instances is possible with our approach, although there are a few minor nuances to describe and tackle.

The figure below represents the current implementation of GeoFence, in a configuration where the engine is embedded in a single GeoServer instance. Notice the Guava (in-memory) cache, which is used to avoid too many round trips to the rule DB; moreover, by default the Embedded GeoFence uses an embedded database for persisting the rules.

image

In an environment with multiple GeoServer instances configured as a cluster, the GeoFence databases won’t interact. This means that a call to the REST API will update only a single instance in the cluster. Even worse, if the REST calls are load-balanced, successive calls can be received by different instances, leaving the authorization rule sets inconsistent across all the instances.

Shared database and pluggable Hazelcast Cache

Our proposal to address the problems above is to improve the GeoFence plugin in order to:

  1. Allow all the GeoFence instances to share the Security Rules Database using an external relational DBMS (at least PostgreSQL should be supported)
  2. Allow GeoFence to be configured with pluggable caching systems like Hazelcast in order to share the Rules Cache properly

By following this approach any GeoServer instance in the cluster will be able to receive the REST updates from GeoNode and will update the shared DBMS. At the same time the cache will be shared among all the nodes automatically (see picture below).

image

Interaction with external clients and tools

The current authentication/authorization approach using cookies does not allow the use of desktop clients or other external clients, unless we provide them with the administrator account for GeoServer (this is a simplification, but it gives an acceptable picture of the situation). For instance, it is not possible to use QGIS desktop with the GeoServer WMS unless the administrator creates specific users and roles in the GeoServer catalog, which are then not linked to GeoNode users; the alternative is to give away the GeoServer admin credentials, which is of course not what we want.

The OpenID Connect approach presented in this proposal suffers from the same problem, as it does not easily allow GeoServer to expose the Layers to external GIS desktop clients and tools. OpenID Connect authentication works well only if the client implements the protocol, but currently, to our knowledge, the most popular desktop applications are not able to use such a mechanism to authenticate their requests.

It would be possible to work around this problem in different ways, all quite complex to implement. A possible solution is to allow the BASIC Authentication in GeoServer by duplicating all the GeoNode users and bridging the OpenID Connect protocol with a custom Spring AuthProvider; but, as one can imagine, this is not an optimal solution due to the intrinsic duplication.

A sounder proposal is to add the GeoServer “authKey” authentication mechanism to the mix, as explained below. This would work in parallel with the OpenID proposal described above. It is worth pointing out that this whole part of the proposal can be skipped if we don’t want to tackle the desktop client problem for the time being.

GeoNode/GeoServer Access Token Authentication

Our proposal is to allow GeoNode to make use of a mechanism similar to the standard GeoServer Key authentication module, but using OAuth2 Bearer Access Token as auth-keys. The module allows a minimal form of authentication by appending a unique key in the URL that is used as the sole authentication token. A sample authenticated request looks like:

http://localhost:8080/geoserver/topp/wms?service=WMS&version=1.3.0&request=GetCapabilities&access_token=ef18d7e7-963b-470f-9230-c7f9de166888

Where access_token=ef18d7e7-963b-470f-9230-c7f9de166888 is associated to a specific user.

Every time a user successfully accesses GeoNode, an “ACCESS_TOKEN” is generated and associated with that user by GeoNode itself through a specific service to be created. This key will uniquely identify the user’s session, and GeoNode will take care of its validity. Every “OWS” request for a resource sent to GeoServer will be enriched with the “ACCESS_TOKEN” in the query string.
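
Enriching an OWS request with the token, as described above, amounts to adding one query parameter. The sketch below uses only the Python standard library; the function name is hypothetical, and the `access_token` parameter name is taken from the sample request shown earlier.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def with_access_token(url, access_token):
    """Return `url` with its access_token query parameter set."""
    parts = urlparse(url)
    # Keep every existing parameter except a stale access_token.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "access_token"
    ]
    query.append(("access_token", access_token))
    return urlunparse(parts._replace(query=urlencode(query)))
```

Replacing rather than appending the parameter means a URL that already carries an expired token can simply be passed through the same helper again.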

Whenever GeoServer receives a request with an “ACCESS_TOKEN” attached, it will validate the token through the OAuth2 Protocol and Plugin.

The OAuth2 Plugin will rely on a “UserGroupService” (which can be based either on JDBC or on HTTP-REST, as there can be more than one implementation according to the system needs) in order to validate the “ACCESS_TOKEN-USER” pair and get back the user’s roles.

GeoNode should be enhanced to generate and manage expiring access tokens associated with the logged-in users, and to expose a protected REST service (hereby called Authentication) which returns the user and roles for a valid access token.

The proposed approach allows the same credentials to be used everywhere. Once the user has obtained an authkey from GeoNode, it can be used as a WMS query parameter on every client without having to deal with cookies or other session headers. The session is automatically checked by the server under the hood; therefore, whenever a client provides a valid access token, the user is automatically Authenticated and Authorized in GeoServer.

It is worth noting that the access tokens have a limited life and must be renewed whenever the session expires. It is the responsibility of GeoNode to check the session duration and validity.
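
The token lifecycle described above (issue on login, resolve to a user while valid, reject once expired) can be sketched as follows. This is an in-memory illustration with hypothetical names; an actual GeoNode service would presumably persist tokens in its database and expose the lookup through the protected REST service.

```python
import secrets
import time


class TokenStore:
    """In-memory store of expiring access tokens (illustrative only)."""

    def __init__(self, lifetime_seconds=3600, clock=time.time):
        self._lifetime = lifetime_seconds
        self._clock = clock   # injectable so expiry can be tested
        self._tokens = {}     # token -> (username, expires_at)

    def issue(self, username):
        """Create a new random token for a user who just logged in."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (username, self._clock() + self._lifetime)
        return token

    def resolve(self, token):
        """Return the username for a valid token, or None otherwise."""
        entry = self._tokens.get(token)
        if entry is None:
            return None
        username, expires_at = entry
        if self._clock() >= expires_at:
            del self._tokens[token]  # expired: forget the token
            return None
        return username
```

An expired or unknown token resolves to `None`, which corresponds to the case where the session is not valid and the client has to obtain a new token from GeoNode.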

As a future improvement we could also envisage permalinks, or the possibility for the GeoNode Administrator to assign static tokens which are unique and always valid (unless manually deleted), in order to avoid changing the client configuration every time a new access token is generated via the GeoNode login procedure.

Summary

The components which must be developed and/or improved for this proposal are summarized on the following spreadsheet:

jj0hns0n commented 7 years ago

Closing this as merged. Tested with paver setup ... docker, ansible and packaging needs to catch up. @afabiani when will we migrate to GS 2.10?

afabiani commented 7 years ago

@jj0hns0n we can envisage to easily move to GeoServer 2.10 now. With the occasion we can also better cleanup the data directory. I'm going to open an issue for that and try to work on this in the next days.