whole-tale / wt-design-docs

Authentication #4

Open matthewturk opened 8 years ago

matthewturk commented 8 years ago

How are authentication tokens to be passed around? (@kylechard and @mbjones)

mbjones commented 8 years ago

@kylechard and @MatthewTurk I've put together some notes on compatibility between Globus Auth and the DataONE Identity services to kick start this discussion.

Token Format. Globus Auth supports using JWT Bearer tokens on REST calls, as does DataONE authentication. The definition of the Globus JWT tokens follows the OpenID Connect specification (https://openid.net/specs/openid-connect-core-1_0.html#IDToken). DataONE follows RFC 7519, and therefore uses an extremely similar set of fields, including sub, iat, and exp. OpenID Connect seems to "require" a couple more fields, including iss, aud -- these could probably be added to the DataONE auth tokens easily. Once a token has been generated, it is passed to REST services using the standard HTTP header, e.g., Authorization: Bearer <token_value>.
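To make the field comparison concrete, here is a minimal Python sketch that decodes a JWT payload and reports which of the claims OpenID Connect expects in an ID token are absent, then builds the bearer header. The payload values are hypothetical (the timestamps are made up and the signature segment is a placeholder), and the sketch deliberately does not verify the signature, so it is only useful for inspecting token contents, not for trusting them.

```python
import base64
import json

# Claims OpenID Connect expects in an ID token. sub, iat and exp are the ones
# both systems already share; iss and aud are the extra ones noted above.
OIDC_REQUIRED_CLAIMS = ("iss", "sub", "aud", "exp", "iat")


def b64url_encode(raw: bytes) -> str:
    """Base64url-encode without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode a base64url segment, restoring the padding JWTs strip off."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def unverified_claims(token: str) -> dict:
    """Return the payload of a compact JWT WITHOUT checking its signature."""
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(payload_b64))


def missing_oidc_claims(claims: dict) -> list:
    """List the OpenID Connect ID-token claims that are not present."""
    return [name for name in OIDC_REQUIRED_CLAIMS if name not in claims]


def bearer_header(token: str) -> dict:
    """Build the standard HTTP header used to pass the token to a REST service."""
    return {"Authorization": f"Bearer {token}"}


# Hypothetical token carrying only sub, iat and exp.
header_seg = b64url_encode(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
payload_seg = b64url_encode(json.dumps({
    "sub": "http://orcid.org/0000-0003-0077-4738",
    "iat": 1500000000,   # made-up issue time
    "exp": 1500064800,   # made-up expiry
}).encode())
example_token = f"{header_seg}.{payload_seg}.placeholder-signature"

missing = missing_oidc_claims(unverified_claims(example_token))
headers = bearer_header(example_token)
```

A real service would verify the signature against the issuer's key before reading any claim.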

Token Signing. The bigger issue is probably the trust network. Currently, the DataONE Authentication service signs its JWT tokens using an X.509 certificate that is trusted by all members of the DataONE network. It delegates authentication requests to identity providers that it trusts, namely ORCID and CILogon. I'm not sure who the signing authority is for the Globus Auth tokens, but overall the infrastructure seems highly compatible, and could be completely compatible with a few surgical changes.

Representing Subjects. DataONE uses the subjects returned from ORCID and CILogon directly within strings to represent the principals in various places, including in metadata describing object ownership and access control rules that are used for authorization decisions. DataONE has decided to represent ORCID values in their URI format (e.g., http://orcid.org/0000-0003-0077-4738), and Distinguished Names (DN) from CILogon are always serialized following RFC 4514 (e.g., CN=Matthew Jones B36802,O=University of California-Santa Barbara,C=US,DC=cilogon,DC=org). I am unclear how these serializations relate to how Globus Auth represents subjects. The CILogon DNs are problematic because they add a numeric qualifier onto the cn attribute, and this is not always reliable for all identity providers. This is an edge case problem, but can cause issues in practice. Within DataONE, we also use and define three symbolic principals that can be used to represent classes of users in access control rules, including the public user for unauthenticated access, the authenticatedUser, and the verifiedUser.
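As a rough illustration of the three kinds of subject string described above, the following Python sketch sorts a subject into "symbolic", "orcid", or "dn". The function name and the category labels are hypothetical, and the DN check is a crude heuristic rather than an RFC 4514 parser.

```python
import re

# The three symbolic principals named above.
SYMBOLIC_PRINCIPALS = {"public", "authenticatedUser", "verifiedUser"}

# ORCID iDs in URI form: four groups of four characters, the last of which
# may end in the checksum character X.
ORCID_URI = re.compile(r"^https?://orcid\.org/\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def classify_subject(subject: str) -> str:
    """Return a coarse category for a subject string (hypothetical labels)."""
    if subject in SYMBOLIC_PRINCIPALS:
        return "symbolic"
    if ORCID_URI.match(subject):
        return "orcid"
    # Crude heuristic only: a serialized DN is a comma-separated list of
    # attribute=value pairs. This is NOT a full RFC 4514 parser.
    if "=" in subject and "," in subject:
        return "dn"
    return "unknown"


orcid_kind = classify_subject("http://orcid.org/0000-0003-0077-4738")
dn_kind = classify_subject(
    "CN=Matthew Jones B36802,O=University of California-Santa Barbara,"
    "C=US,DC=cilogon,DC=org"
)
public_kind = classify_subject("public")
```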

Groups. DataONE maintains a list of groups, and users can create their own groups via the DataONE API using the createGroup() and similar REST methods (which is also exposed via each user's profile page on the web). Each group has a DN that can be used anywhere that a user identity can be used.

Identity mapping. DataONE has services that allow two identities to be mapped as equivalent. Within DataONE, this requires first authenticating as one identity, then requesting a mapping to a second identity, then authenticating as the second identity, and confirming the mapping. Once confirmed, the two identities are considered equivalent, and all rights of one identity are accessible to the other. This is used extensively to map historical user identities to our preferred approach of using ORCID identities. Globus Auth seems to have a similar service, and so we would need to share these identity equivalence mappings for authorization to work seamlessly across systems. This will require discussion.
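The request-then-confirm flow can be sketched in Python as follows. This is an illustrative model only, not the DataONE service API: the class and method names are hypothetical, and it uses a union-find structure so that confirmed mappings are transitive (an assumption that matches "all rights of one identity are accessible to the other").

```python
class IdentityMap:
    """Toy model of two-step identity mapping (hypothetical, in-memory)."""

    def __init__(self):
        self._parent = {}      # union-find parent pointers, keyed by subject
        self._pending = set()  # (requester, target) pairs awaiting confirmation

    def _find(self, subject):
        """Return the representative of the subject's equivalence class."""
        self._parent.setdefault(subject, subject)
        while self._parent[subject] != subject:
            # Path halving keeps later lookups short.
            self._parent[subject] = self._parent[self._parent[subject]]
            subject = self._parent[subject]
        return subject

    def request_mapping(self, authenticated_as, other):
        """Step 1: while logged in as one identity, ask to map to another."""
        self._pending.add((authenticated_as, other))

    def confirm_mapping(self, authenticated_as, requester):
        """Step 2: while logged in as the second identity, confirm the request."""
        if (requester, authenticated_as) not in self._pending:
            raise PermissionError("no pending request from that identity")
        self._pending.remove((requester, authenticated_as))
        self._parent[self._find(requester)] = self._find(authenticated_as)

    def equivalent(self, a, b):
        """True once the two identities have been mapped, directly or not."""
        return self._find(a) == self._find(b)


legacy_dn = "CN=Matthew Jones B36802,O=University of California-Santa Barbara,C=US,DC=cilogon,DC=org"
orcid = "http://orcid.org/0000-0003-0077-4738"

ids = IdentityMap()
ids.request_mapping(authenticated_as=legacy_dn, other=orcid)
before = ids.equivalent(legacy_dn, orcid)   # request alone grants nothing
ids.confirm_mapping(authenticated_as=orcid, requester=legacy_dn)
after = ids.equivalent(legacy_dn, orcid)
```

Sharing mappings across systems would amount to exchanging the confirmed pairs, which is the part that needs discussion.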

DataONE Identity Management service. See identity service docs.

So, where do we go from here?

kylechard commented 8 years ago

@mbjones This is a really great write up. I think you're right that the two approaches are very much aligned. Here are a few more notes from the Globus Auth perspective.

Token Format. Globus Auth follows the OpenID Connect specification. The result of authentication is a collection of scoped access tokens, one for each resource server requested. Individual access tokens are passed using the standard HTTP Authorization: Bearer header.

Token signing: Globus Auth tokens are signed by a key held by Globus. That key is trusted by all Globus services.

Representing subjects: Globus Auth identifies principals using different formats depending on the IDP. For example, the CILogon ePPN (identity name) and ePTID (a "provider specific id" that's guaranteed never to be reused) are treated as the unique identity. All Globus Auth identities are uniquely identified and include unique subject identifiers from the IDP and an asserted text username of the form @IDP.

Groups: Globus runs a separate Groups service that provides self-management of arbitrary groups. The Groups service lets users create groups, invite other users to a group, manage memberships, implement invitation and approval workflows, and set other policies (e.g., visibility, membership information requirements). All groups are uniquely identified by UUIDs and can be used for authorization.

Identity mapping: Globus supports a similar identity linking model, in which any number of identities can be linked together via authentication with each. These identities can then be presented as a set after authenticating with any one of the set.

OAuth workflow: Globus supports a typical authorization code grant in which users are redirected from a third-party application to Globus Auth (and on to registered IDPs). This flow also supports PKCE for unauthenticated clients (e.g., CLI/tools). It supports an implicit grant workflow that can be used by client-side JavaScript applications that cannot provide client authentication. It also supports a resource owner password credentials grant flow, which allows client applications (e.g., CLI/tools) to be used without following a web-based workflow.
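For readers unfamiliar with PKCE, here is a minimal Python sketch of the client-side part as defined in RFC 7636 (the S256 method). It only shows how a CLI tool would derive the verifier and challenge; the function names are illustrative, and nothing here is specific to the Globus Auth endpoints or SDK.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Create a random code verifier.

    32 random bytes encode to 43 base64url characters, the minimum length
    RFC 7636 allows (the permitted range is 43 to 128 characters).
    """
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def s256_challenge(verifier: str) -> str:
    """Derive the S256 code challenge: base64url(SHA-256(verifier)), unpadded."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()
challenge = s256_challenge(verifier)
# The client sends `challenge` (with code_challenge_method=S256) on the
# authorization request and later sends `verifier` on the token request,
# so the server can check that both came from the same client.
```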

mbjones commented 8 years ago

@kylechard Thanks. Can you add an example of a subject representation in a text string as it might be used in an access rule or other policy framework?

kylechard commented 8 years ago

When using identities in access rules we use a unique UUID that is assigned to the identity. The reason is that the usernames asserted by an IDP are sometimes reused and are therefore only guaranteed to be unique at any point in time.
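A small Python sketch of why the rule is keyed on the UUID rather than the username. All values are hypothetical: the UUIDs are generated on the spot, the username is invented, and `is_allowed` is an illustrative helper, not a Globus API. The check accepts a set of presented identities to reflect the linked-identity model described earlier.

```python
import uuid

# Two distinct identities that were, at different times, given the same
# username by the IDP (hypothetical values).
first_holder = uuid.uuid4()
later_holder = uuid.uuid4()
username_of = {
    first_holder: "alice@example.edu",
    later_holder: "alice@example.edu",  # username reused by the IDP
}

# The access rule stores identity UUIDs, never usernames.
rule_allowed_ids = {first_holder}


def is_allowed(presented_ids: set, allowed_ids: set) -> bool:
    """Grant access if any presented (linked) identity appears in the rule."""
    return not allowed_ids.isdisjoint(presented_ids)


original_ok = is_allowed({first_holder}, rule_allowed_ids)
reused_name_ok = is_allowed({later_holder}, rule_allowed_ids)
```

A rule keyed on the username would have granted access to both holders.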