@auth directive can allow information leak to unauthorized users

dennisjlee commented 2 years ago

Describe the bug This was described indirectly in #636 and is also related to #195.

I wanted to file this to discuss this further, because I think there are some potential security concerns here.

We are building a B2B app. Our end users will each be employees of a Company, and we are using the Neo4j graphql authz directives to restrict users from querying data that belongs to a different company.

The authz rules correctly prevent a user from retrieving data that they are not authorized to see. However, a savvy user could still extract information that they shouldn't have access to, by distinguishing empty result sets from forbidden errors. More details below.

Type definitions

type Company @exclude(operations: [CREATE, UPDATE, DELETE]) {
    companyId: String
    name: String
}

extend type Company @auth(
  rules: [
    {
      operations: [READ],
      allow: { companyId: "$context.user.companyId" }
    }
  ]
)

To Reproduce Steps to reproduce the behavior:

Given the type definitions above, we can launch a graphql server with a snippet like this:

  const server = new ApolloServer({
    schema,
    context: _params => ({
      user: {companyId: "abc"}
    })
  });

Database can be populated with this query:

CREATE (c1:Company {companyId: "abc", name: "ABC co"}), (c1:Company {companyId: "def", name: "DEF co"})

Now consider these cases:

If the abc user issues query { companies(where: { name: "ABC co" }) { name } }, that will return data from ABC co. (ok)
If the abc user issues query { companies { name } }, that will throw a forbidden error. (ok)
If the abc user issues query { companies(where: { name: "DEF co" }) { name } }, that will throw a forbidden error. Now the abc user knows that there is a company named "DEF co" in our database, even though they can't retrieve any details.
If the abc user issues query { companies{where: { name: "EFG co" }) { name } }, that will return an empty result. Now the abc user knows that there is no company matching "EFG co" in our database.

I consider both these last two cases to be information leaks.

Expected behavior The ideal behavior from my perspective would be that a malicious user can't extract information from the system about the existence of nodes that they are not authorized to see.

I don't have a good suggestion on how to implement this though. It might not be possible to do this while still allowing users to make arbitrary where filters in their queries. For instance, if I'm querying something less unique about my Company node - say we had a country field. If I query for query { companies(where: {country: "usa"}) { name } }, I'll get a little bit of info out of that. If I get a forbidden error, that means there are other companies with country="usa" which I'm not authorized to see, but if I get a single company row back, then that means there are no other matching companies.

There are some tricky implications here. Sorry I don't have a clear suggestion of where this should go! Perhaps this should be a discussion instead of a bug report..

neo4j-team-graphql commented 2 years ago

Many thanks for raising this bug report @dennisjlee. :bug: We will now attempt to reproduce the bug based on the steps you have provided.

Please ensure that you've provided the necessary information for a minimal reproduction, including but not limited to:

Type definitions
Resolvers
Query and/or Mutation (or multiple) needed to reproduce

If you have a support agreement with Neo4j, please link this GitHub issue to a new or existing Zendesk ticket.

Thanks again! :pray:

dennisjlee commented 2 years ago

I found docs on how Neo4j handles authn / authz at the database level, and I think this describes the behavior that I expect from @neo4j/graphql: https://neo4j.com/docs/operations-manual/current/authentication-authorization/access-control/#_privileges_of_itadmin

In particular:

The results make it seem as if these nodes do not even have an ssn field. This is a key feature of the security model, that users cannot tell the difference between data that is not there, and data that is hidden using fine-grained read privileges. ... While restrictions on reading data do not result in errors and only make it appear as if the data is not there, restrictions on updating, i.e. writing to the graph will produce an appropriate error when the user attempts to perform an update they are not permitted to do.

darrellwarde commented 2 years ago

I found docs on how Neo4j handles authn / authz at the database level, and I think this describes the behavior that I expect from @neo4j/graphql: https://neo4j.com/docs/operations-manual/current/authentication-authorization/access-control/#_privileges_of_itadmin

In particular:

The results make it seem as if these nodes do not even have an ssn field. This is a key feature of the security model, that users cannot tell the difference between data that is not there, and data that is hidden using fine-grained read privileges. ... While restrictions on reading data do not result in errors and only make it appear as if the data is not there, restrictions on updating, i.e. writing to the graph will produce an appropriate error when the user attempts to perform an update they are not permitted to do.

They are, however, completely different products in that the Neo4j database is schemaless - it's very easy to hide the fact that a property even exists, compared to a GraphQL API where this is very obvious.

This is an interesting issue, and there may be some work that we can do, but don't expect that it is going to be on the same level as security in the core database, they're really targeting completely different user groups.

dennisjlee commented 2 years ago

Thanks for the response @darrellwarde. I hadn't thought of it from the perspective of hiding individual properties. I was thinking of it more like this - in GraphQL we could omit result objects that the user is not authorized to see, rather than throw a forbidden error. I think that would hopefully be easy to reason about - not sure how hard it is to implement.

So in my example above, we could potentially have this behavior instead which does not leak information to users:

If the abc user issues query { companies(where: { name: "ABC co" }) { name } }, that will return [{ name: "ABC co" }]
If the abc user issues query { companies { name } }, that will return [{ name: "ABC co" }]
If the abc user issues query { companies(where: { name: "DEF co" }) { name } }, that will return []
If the abc user issues query { companies{where: { name: "EFG co" }) { name } }, that will return []

Karthikeyan-Vijayan commented 2 years ago

This is already mentioned in the Neo4jGraphQL library docs (https://neo4j.com/docs/graphql-manual/current/troubleshooting/security/). But it is not under Auth topic

darrellwarde commented 1 year ago

Reviewing this issue, it sounds like really what you are after is the where clause in the @auth directive, which evaluates the predicates provided as filters instead of validations. If a user provides further filters in their query, they are combined with these filters as an AND, so do not risk further leakage. Given the use cases you wrote up above, I believe this satisfies all of those cases. I have updated your type definitions below with the fix:

type Company @exclude(operations: [CREATE, UPDATE, DELETE]) { companyId: String name: String }

extend type Company @auth( rules: [ { operations: [READ], where: { companyId: "$context.user.companyId" } } ] )

We are about to embark on a quite large refactor of authentication and authorization, so this should soon become a bit clearer.

Going to close this for now, but do feel free to reopen if you believe there are further issues.

neo4j / graphql

@auth directive can allow information leak to unauthorized users #1079