[RFC] Schema Diffing for Authorization

rijulg commented 5 years ago

This RFC attempts to address the problem of implementing an ACL or authorization layer that runs before a request reaches an API endpoint, as in an API gateway, while also enabling schema-driven clients that can switch functionality on or off based on the (authorized) data and actions exposed in the schema.

Problem

Implement ACL in GraphQL
- Although it is generally good to implement ACL in the business logic layer below the API layer, doing so makes it inconvenient for clients to determine what is and is not allowed for a particular user.
- The only way for a client to determine whether they have access to something is to hit the endpoint and then process the response.
Implementing ACL in situations where the business layer is not under control of the API
- This covers cases where we relay information from a service that isn't directly under our control but we would still like to impose restrictions on user access.
- Flexibility of applying ACL across services that have an internal ACL and those that don't. This is similar to the previous point, but is more specific to cases where we have access to the business logic layer, but we might need to add restrictions on the fly where the infrastructure to do so does not exist.
Disable endpoints from being visible to certain users.
- This doesn't necessarily have to be a security feature, but can be utilized to inform clients about accessible nodes through introspection
Simplifying developer experience by separating business logic and authorization logic
- Rather than having every business logic call preceded by authorization operations, it would be better to separate the two concerns and give control of authorization to a different layer that can work independently.

Proposal

In the API/GraphQL layer of execution we perform the following operations:

Obtain an authorization mask from an auth provider for the user's context.
Modify the schema by applying the authorization mask.
1. Start with a blank schema.
2. For each node and parameter in the authorization mask, copy over the node from the original schema if it exists.
Continue with Request Execution.
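
The masking step above can be sketched as follows. This is only an illustration under stated assumptions: the schema is modeled as plain dictionaries (type name to fields, field name to return type name), and `apply_mask`, `full_schema`, and `viewer_mask` are hypothetical names, not part of any GraphQL library.

```python
from typing import Dict, Set

# Assumed toy model: type name -> {field name -> return type name}
Schema = Dict[str, Dict[str, str]]
# Assumed mask shape: type name -> set of authorized field names
Mask = Dict[str, Set[str]]


def apply_mask(schema: Schema, mask: Mask) -> Schema:
    """Build a masked schema by copying only the authorized nodes."""
    masked: Schema = {}  # step 1: start with a blank schema
    for type_name, allowed_fields in mask.items():
        original_fields = schema.get(type_name)
        if original_fields is None:
            continue  # the mask names a type the original schema lacks
        # step 2: copy each authorized field that exists in the original
        masked[type_name] = {
            name: original_fields[name]
            for name in allowed_fields
            if name in original_fields
        }
    return masked


full_schema: Schema = {
    "Query": {"me": "User", "invoices": "Invoice"},
    "User": {"id": "ID", "email": "String"},
    "Invoice": {"id": "ID", "total": "Float"},
}
# "ssn" is deliberately absent from the schema, so it is not copied.
viewer_mask: Mask = {"Query": {"me"}, "User": {"id", "email", "ssn"}}

viewer_schema = apply_mask(full_schema, viewer_mask)
```

Here `viewer_schema` contains only `Query.me` and the two existing `User` fields; request execution would then proceed against that reduced schema.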

Advantages

Business logic and other services may or may not implement their own authorization, since the main authorization can be controlled by the teams responsible for security.
1. This empowers the security team to implement best practices in a domain separate from the developers', and allows the work to be done in parallel.
2. Forensics can be performed at the API layer, instead of having to drill down to potentially 3rd party APIs.
UIs can be implemented to follow the presented schema: modules that a particular user does not have access to can be removed from the UI by introspecting the schema.
Execution time can be minimized by eliminating requests made to data points that will end up resulting in errors about unauthorized access.
Different clients can be implemented with access to only certain parts of the business without exposing the entire business graph to them.
Does not degrade the developer experience by forcing schema introspection to be turned off in order to hide unauthorized nodes.
Eliminates the possibility of graph traversal from unauthorized nodes.
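
As a rough illustration of the schema-driven UI advantage listed above, a client could derive its enabled modules from the root fields it sees through introspection. The module names, the field names, and `enabled_modules` are all hypothetical, and the introspection response is assumed to have already been reduced to a list of field names.

```python
from typing import Dict, List

# Hypothetical mapping: UI module -> root query field it depends on
MODULE_REQUIREMENTS: Dict[str, str] = {
    "profile": "me",
    "billing": "invoices",
}


def enabled_modules(query_fields: List[str]) -> List[str]:
    """Return the UI modules whose required field is present in the schema."""
    available = set(query_fields)
    return sorted(
        module
        for module, field in MODULE_REQUIREMENTS.items()
        if field in available
    )


# Field names as a client might collect them from the masked schema
introspected_fields = ["me"]
modules = enabled_modules(introspected_fields)
```

With only `me` exposed, the billing module is simply never rendered, without the client first issuing a request that fails.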

Considerations

The proposal has at least the following considerations.

Performance

This mostly depends on the GraphQL implementation as well as the logic that does the masking. In a good implementation we would apply a mask, cache the schema associated with that mask, and on subsequent requests simply fetch the cached schema, essentially removing any performance degradation.
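
A minimal sketch of that caching idea, assuming masks and schemas are plain dictionaries. `MaskedSchemaCache`, `mask_key`, and `build_masked_schema` are hypothetical names; a real server would also need to invalidate entries when the schema or a user's permissions change.

```python
from typing import Callable, Dict, FrozenSet, Set, Tuple

Schema = Dict[str, Dict[str, str]]
Mask = Dict[str, Set[str]]
MaskKey = FrozenSet[Tuple[str, FrozenSet[str]]]

FULL_SCHEMA: Schema = {
    "Query": {"me": "User", "invoices": "Invoice"},
    "User": {"id": "ID"},
}


def mask_key(mask: Mask) -> MaskKey:
    """Turn a mask into a hashable key, independent of insertion order."""
    return frozenset((t, frozenset(fields)) for t, fields in mask.items())


def build_masked_schema(mask: Mask) -> Schema:
    """Copy only the masked types and fields out of FULL_SCHEMA."""
    return {
        t: {f: FULL_SCHEMA[t][f] for f in fields if f in FULL_SCHEMA[t]}
        for t, fields in mask.items()
        if t in FULL_SCHEMA
    }


class MaskedSchemaCache:
    def __init__(self, build: Callable[[Mask], Schema]) -> None:
        self._build = build
        self._cache: Dict[MaskKey, Schema] = {}
        self.builds = 0  # counts how often masking actually ran

    def get(self, mask: Mask) -> Schema:
        key = mask_key(mask)
        if key not in self._cache:
            self._cache[key] = self._build(mask)
            self.builds += 1
        return self._cache[key]


cache = MaskedSchemaCache(build_masked_schema)
first = cache.get({"Query": {"me"}})
second = cache.get({"Query": {"me"}})  # served from the cache
```

Two users who share the same mask share one cached schema, so the masking cost is paid once per distinct mask rather than once per request.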

As a benefit, nodes that are not authorized will never be executed, because they do not appear in the schema and the request executor therefore will not invoke them. This improves performance by minimizing execution, compared to implementations in which requests must run until the business logic layer responds with an unauthorized access error.
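
To illustrate, here is a toy executor that checks the requested root fields against the masked schema before calling any resolver. It is only a sketch: `execute`, the resolver functions, and the error wording are assumptions for this example, not the behaviour of a particular GraphQL library.

```python
from typing import Any, Callable, Dict, List

Schema = Dict[str, Dict[str, str]]

calls: List[str] = []  # records which resolvers actually ran


def resolve_me() -> Dict[str, str]:
    calls.append("me")
    return {"id": "1"}


def resolve_invoices() -> List[Any]:
    calls.append("invoices")
    return []


def execute(
    masked_schema: Schema,
    resolvers: Dict[str, Callable[[], Any]],
    selections: List[str],
) -> Dict[str, Any]:
    """Reject the whole request if any field is absent from the masked schema."""
    allowed = masked_schema.get("Query", {})
    missing = [name for name in selections if name not in allowed]
    if missing:
        # No resolver runs: the request fails before execution starts.
        return {"errors": [f"Cannot query field '{n}' on type 'Query'." for n in missing]}
    return {"data": {name: resolvers[name]() for name in selections}}


resolvers = {"me": resolve_me, "invoices": resolve_invoices}
masked = {"Query": {"me": "User"}}

rejected = execute(masked, resolvers, ["me", "invoices"])
accepted = execute(masked, resolvers, ["me"])
```

The rejected request never reaches `resolve_invoices`, or even `resolve_me`, so no downstream service is contacted for a request that was going to fail anyway.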

Generation of Authorization Mask

When the authorization mask is generated, we might disable certain nodes in a way that breaks the schema: other nodes may be left with orphan edges pointing to the node that has been masked out.

This can be addressed by providing mask generators with a schema validation check and subsequently masking out the orphan edges.
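
One way such a check could look, again on a toy dictionary model of the schema: repeatedly drop fields whose return type no longer exists, and drop object types that end up with no fields (GraphQL object types must define at least one field). `prune_orphan_edges` and the sample types are hypothetical.

```python
from typing import Dict

Schema = Dict[str, Dict[str, str]]

# Built-in scalar names are always valid edge targets.
SCALARS = {"ID", "String", "Int", "Float", "Boolean"}


def prune_orphan_edges(schema: Schema) -> Schema:
    """Remove fields that point at missing types until the schema is stable."""
    pruned: Schema = {t: dict(fields) for t, fields in schema.items()}
    changed = True
    while changed:
        changed = False
        for type_name in list(pruned):
            fields = pruned[type_name]
            for field_name, target in list(fields.items()):
                if target not in SCALARS and target not in pruned:
                    del fields[field_name]  # orphan edge
                    changed = True
            if not fields:
                del pruned[type_name]  # a type with no fields is invalid
                changed = True
    return pruned


# "Invoice" and "Billing" were masked out, leaving dangling references.
masked: Schema = {
    "Query": {"me": "User", "invoices": "Invoice"},
    "User": {"id": "ID", "account": "Account"},
    "Account": {"owner": "Billing"},
}

valid = prune_orphan_edges(masked)
```

Removing one edge can empty a type and orphan further edges (here `Account`, then `User.account`), which is why the loop runs until nothing changes.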

Unexpected responses

A client developer might build an application on the assumption that the entire schema is available, while its users may have limited access and receive incomplete responses.

This can be addressed by checking for null responses on the client, as is standard, and by having the server respond with an unauthorized access error message if needed.

Auditing malicious users vs poor clients

We will now see unauthorized requests in two cases:

A malicious user tries to hit the nodes which have not been exposed in the presented schema
A client has hard coded requests that will try to obtain all nodes regardless of whether the accessing user is presented with all the relevant nodes in the schema or not

More generally, the problem is the same as distinguishing a maliciously crafted request that accesses deprecated nodes from an old client that accesses deprecated nodes. It needs to be addressed on a case-by-case basis in the forensics stage, where contextual data not necessarily related to the GraphQL request itself will be key in differentiating between the two. As it stands, we do not have a solution to this problem.

mjmahone commented 5 years ago

The basic question is why does this need to be part of the spec itself? It seems completely reasonable to provide different schemas to different clients based on their permissions, but it's unclear to me why a GraphQL server needs a spec change to do so. I'd be really interested to see an open-source server that allowed for this schema authorization process, though. That implementation will need to consider how to handle "breaking changes" (i.e. revoking authorization), but those are all interesting questions that a specific server implementation could provide answers for.

But it seems like you can do all of that without changing the language specification at all. The specification in general does not care about authorization: you're free to implement that however you want. You probably would not want to expose a public Schema document, but the spec already has a way of getting a schema via an introspection query: you could include the permissions of a specific client in figuring out which types and fields to provide via that introspection query. While that's not how the spec implementation (graphql-js) does things, I don't think there's anything in the spec, today, preventing you from building your server this way.

rijulg commented 5 years ago

@mjmahone thanks for the feedback. I have implemented this in a project, since it's not restricted by the spec (just as you said) and the library I was using did not prevent me from doing so.

However, this meant that the whole implementation of the diffing needed to present different schemas ended up in my own code instead of in the GraphQL framework/library, which would have been the proper place for it. As a result, with each iteration of the framework, the binding code used for modifying the schema has to be updated as well.

The frameworks are restricted from doing anything of this sort at the moment, specifically because the specification does not provide a way to achieve this, and libraries that implement it would be moving away from the fundamentals. Additionally, graphql.org recommends pushing authorization logic to the service/business layer, which has its merits but also leaves open some gaps that are solved by this approach.

As of now, it is a rather open question where these fundamental solutions should reside. One option would obviously be the server spec, while another might be a utility. Placing it in the server spec would mean that we inherently force all consumers to follow this pattern. However, if we decide to implement this as a utility, then I feel that at the very least graphql.org should be updated to reflect that this is an equally valid way of implementing authorization.

nodkz commented 5 years ago

@rijulg I would be glad to see your material on https://graphql-rules.com (https://github.com/graphql-rules/graphql-rules/tree/master/docs/rules); we may add an additional section for authorization.

I try to gather material that is out of the scope of the basic spec but required for best practices. Some people invent a programming language, others provide patterns ;)

graphql / graphql-spec