ardatan / graphql-tools

:wrench: Utility library for GraphQL to build, stitch and mock GraphQL schemas in the SDL-first approach
https://www.graphql-tools.com
MIT License
5.34k stars 805 forks

Authorization #313

Closed mxstbr closed 6 years ago

mxstbr commented 7 years ago

Authorization in GraphQL is (as noted by @helfer in this post) currently not a generally solved problem. Everybody has their own, custom solution that works for them but there's no general-purpose solution/library for it. (as far as I can tell)

I think graphql-tools is uniquely positioned to fill that hole in the community. Permissions are naturally bound to the schema of the requested data, so to me it feels like a natural fit for graphql-tools. (note that this is just an idea, wanted to kick off a discussion)

Imagine a simple type like this:

# I want posts to only be readable by logged-in users
type Post {
  id: ID
  content: String
  author: User
  notes: [String]  # I want these to only be accessible by the author
} 

extend type Query {
  post(id: ID!): Post
}

There are two types of permissions I want to enforce here. One of them I can check without running the resolvers first (only readable by logged-in users); for the other I need the data from the database, so the check has to run after the resolver. (only accessible by the author)

I could imagine an API like this working out well:

makeExecutableSchema({
  schema: schema,
  resolvers: resolverMap,
  // This is the new part:
  permissions: permissionMap,
})

Similar to resolvers, this permissionMap would map to the schema:

const permissionMap = {
  Query: {
    post: () => { /* ... */ }
  },
  Post: {
    notes: () => { /* ... */ }
  }
}

Depending on whether you return a truthy or a falsy value from these functions, folks get access or a "Permission denied" error. The functions get the same information as resolvers. (i.e. obj, args and context)

This works for the pre-database case (only logged-in users should be able to access), since I could do something like this:

// The user comes from the context, and is passed in from the server
// Note: This is simplified, there's more complex checks to figure out if somebody is logged in of course
const isLoggedIn = (_, __, { user }) => user;

const permissionMap = {
  Query: {
    post: isLoggedIn
  },
  Post: isLoggedIn
}

Now the issue is post-resolver permission functions. I don't have a good idea how to handle these right now, but maybe something like this could work, where each "path" has two magic properties (.pre, .post) that run those functions before and after the resolver:

const isLoggedIn = (_, __, { user }) => user;
// Check if the user requesting the resource is the author of the post
const isAuthor = (resolverResult, _, { user }) => user.id === resolverResult.author;

const permissionMap = {
  Query: {
    post: isLoggedIn
  },
  Post: {
    '.pre': isLoggedIn,
    notes: {
      '.post': isAuthor,
    }
  }
}

What do you think? Any previous discussions I'm missing?

mxstbr commented 7 years ago

Another possible avenue (as suggested by @fubhy) rather than magic .pre/.post properties could be to simply pass a resolver executer, and let the user take care of running validations before/after:

const permissionMap = {
  Post: {
    notes: (resolverExecuter, args, context) => {
      if (!isLoggedIn(context.user)) return false;
      // Execute the resolver
      const data = resolverExecuter();
      return isAuthor(data);
    }
  }
}

resolverExecuter here is a closure around each resolver that takes care of passing the right variables to it and of returning the unchanged data back to GraphQL. (otherwise the permissionMap could manipulate the data) It would also keep track of its executed state, and if it hasn't been executed and the permission function returned true, it'd execute the resolver itself, something to this effect:

let executed = false;
let data;

const resolverExecuter = () => {
  data = field.resolver();
  executed = true;
  return data;
}

if (field.permission(resolverExecuter, args, context)) {
  if (!executed) data = field.resolver();

  return data;
}

That way the developer has full control over when to execute which validations and which data they need, rather than relying on magic properties.

helfer commented 7 years ago

Hey @mxstbr, I think that's a really cool idea. I like the first syntax for defining it (permissionsMap), because it's much easier to read. You're right that something like .pre and .post will be needed, which would make it slightly less elegant. An alternative would be to have a permission be defined on a type itself along the lines of:

{
  Query: {
    post: isLoggedIn, // just an example
  },
  Post: {
    __self: and(isLoggedIn, or(isAuthor, isPublished)),
    notes: isAuthor,
  },
}

The __self permissions would be applied to every resolver that returns an object of that type, and it would get that object and the context as input, but not the args. The non-__self permissions would get the usual data that the resolver gets.
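
To make that concrete, here is a minimal sketch (not a real graphql-tools API) of how a __self check could be enforced by wrapping every resolver whose return type is that type; and/or are placeholder combinators:

// Sketch only: enforce a type's __self check by wrapping resolvers that return that type.
// Checks take the resolved object and the context, but not the args.
const and = (...checks) => (obj, context) => checks.every(check => check(obj, context));
const or = (...checks) => (obj, context) => checks.some(check => check(obj, context));

function withSelfPermission(resolve, selfCheck) {
  return async (root, args, context, info) => {
    const result = await resolve(root, args, context, info);
    if (!selfCheck(result, context)) {
      throw new Error('Permission denied');
    }
    return result;
  };
}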

I think this could look really neat, and would be great to have in graphql-tools. Do you have time to help us make it happen?

Another approach would be to actually push permissions down to the model layer (which is what FB does internally). We're thinking about a next version of graphql-tools that would organize schema and resolvers a little differently and let you define the models along with it. I think @thebigredgeek has been thinking about stuff like this, so he might have some thoughts to share here.

Also, checking permissions is just a form of validation. Some validations (as you pointed out) can be done statically, some can only be done during execution. It would be really neat if the static ones could be run before the query even begins execution. I recently had a conversation with @JeffRMoore about this, so it would be really interesting to have his take here as well. I think it could be relatively easy to create a validation rule that checks "static" permissions before executing if we mark them somehow.
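
As a rough sketch of that last idea (the staticPermissions map is a hypothetical marker, not an existing feature), a custom graphql-js validation rule could reject a query before execution starts:

import { GraphQLError, parse, validate } from 'graphql';

// Hypothetical registry of "static" checks that only need the request context.
const staticPermissions = {
  'Query.post': context => Boolean(context.user),
};

// graphql-js validation rules take a ValidationContext and return an AST visitor.
const staticPermissionRule = requestContext => validationContext => ({
  Field(node) {
    const parentType = validationContext.getParentType();
    const key = parentType && `${parentType.name}.${node.name.value}`;
    const check = key && staticPermissions[key];
    if (check && !check(requestContext)) {
      validationContext.reportError(
        new GraphQLError(`Permission denied for "${key}"`, [node])
      );
    }
  },
});

// Usage sketch: run it alongside the standard rules before executing.
// const errors = validate(schema, parse(query), [staticPermissionRule({ user })]);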

mxstbr commented 7 years ago

I like the first syntax for defining it (permissionsMap), because it's much easier to read.

The second syntax is still the same, except rather than having .pre/.post you have a function that lets you do your thing!

__self sounds like a great idea though, I like it.

Do you have time to help us make it happen?

I'm exactly 0 familiar with the codebase, but I'd be happy to dig in seeing as I need to add permissions to my server. :wink: Let's wait and see what @thebigredgeek and @JeffRMoore have to say!

stubailo commented 7 years ago

Yeah those two have been thinking about this a lot!

southpolesteve commented 7 years ago

I saw @mxstbr tweet about this thread and wanted to chime in with how we do auth on the bustle graphQL API. I extracted the code below but this is the general setup:

import {
  defaultFieldResolver,
  GraphQLID,
  GraphQLNonNull,
  GraphQLError,
  GraphQLObjectType,
  GraphQLString,
  GraphQLSchema
} from 'graphql'

const User = new GraphQLObjectType({
  name: 'User',
  fields: {
    id: {
      type: new GraphQLNonNull(GraphQLID)
    },
    email: {
      authorize: (root, args, context) => {
        if (root.id !== context.currentUser.id) {
          throw new GraphQLError('You are only authorized to access field "email" on your own user')
        }
      },
      type: new GraphQLNonNull(GraphQLString)
    }
  }
})

// Walks every field and swaps out the resolver if an `authorize` key is present
function authorize (schema) {
  Object.keys(schema._typeMap).forEach((typName) => {
    const typ = schema._typeMap[typName]
    if (!typ._fields) {
      return
    }
    Object.keys(typ._fields).forEach((fieldName) => {
      const field = typ._fields[fieldName]
      if (field.authorize) {
        typ._fields[fieldName] = wrapResolver(field)
      }
    })
  })
  return schema
}

// wrap resolver with authorize function
function wrapResolver (field) {
  const {
    resolve: oldResolver = defaultFieldResolver,
    authorize,
    description: oldDescription
  } = field

  const resolve = async function (root, args, context, info) {
    await authorize(...arguments)
    return oldResolver(...arguments)
  }

  const description = oldDescription || 'This field requires authorization.'
  return { ...field, resolve, description, authorize, isAuthorized: true }
}

const query = new GraphQLObjectType({
  name: 'Query',
  fields: {
    user: {
      type: User,
      args: {
        id: { type: new GraphQLNonNull(GraphQLID) }
      },
      resolve: (root, args, context) => context.db.find('user', args.id)
    }
  }
})

export default authorize(new GraphQLSchema({ query }))

mxstbr commented 7 years ago

Thanks for chiming in, that's remarkably similar to what I'm thinking! πŸ‘Œ

Seeing that example of using root to avoid having to run authorization checks before and after, maybe we don't even need .pre/.post or anything like that?

const isLoggedIn = (_, __, { user }) => user;
// Check if the user requesting the resource is the author of the post
const isAuthor = (root, _, { user }) => user.id === root.author;

const permissionMap = {
  Query: {
    post: isLoggedIn
  },
  Post: {
    __self: isLoggedIn,
    notes: isAuthor
  }
}

thebigredgeek commented 7 years ago

What are the benefits to using a permissions map compared to, say, composable resolvers that simply throw errors if state preconditions aren't met? This is what I am already doing and it works really well. I don't know if additional API surface area is necessary to solve this challenge
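
For concreteness, a minimal sketch of that composable style (independent of any particular library) could look like this:

// Sketch of the composable-resolver approach: higher-order functions that
// throw if a precondition isn't met, then delegate to the wrapped resolver.
const requireLoggedIn = resolver => (root, args, context, info) => {
  if (!context.user) throw new Error('Not logged in');
  return resolver(root, args, context, info);
};

const requireAuthor = resolver => (root, args, context, info) => {
  if (context.user.id !== root.author) throw new Error('Permission denied');
  return resolver(root, args, context, info);
};

const resolvers = {
  Post: {
    notes: requireLoggedIn(requireAuthor(root => root.notes)),
  },
};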

mxstbr commented 7 years ago

Sure, one could do that (it's what @fubhy is doing right now and suggested I do) but I found it makes the whole codebase more complex, unnecessarily so.

You can't write resolvers inline anymore, you have to understand how to compose functions (you and I do, but why do the juniors in your team have to?), permission functions are this limbo that kind of live with resolvers but kind of don't etc. (in my opinion, of course. Happy to be proven otherwise!)

It also means you have to define totally unnecessary resolvers for scalar types, as post.notes might just be a string that doesn't need a custom resolver, but now I have to write a custom resolver because I can't handle permissions otherwise:

const resolverMap = {
  Post: {
    // post.notes is just a String, but now I need to define a custom resolver just to add permissions even though GraphQL could get this scalar type perfectly fine without the custom resolver
    notes: (root, args, context) => isAuthor(root, args, context) && root.notes,
  },
};

Resolvers are a mapping from a path (post.notes) to a database query (db.getNotes(postId)) and, kind of like controllers in a MVC system, I've found that keeping them as small as possible helps keeping the codebase easy to grok. (types: output to/input from the user, resolvers: mapping of that input/output to db queries, models: database handling)

Now, one could argue that this means permissions should be pushed down to the model layer (skinny controller, fat model in traditional MVC-speak) and handled there, but that's unwieldy because you have to pass the context and root of the resolver down to the model for every. single. query. in every. single. resolver:

const Post = require('../models/Post');

const resolverMap = {
  Post: {
    // One would have to pass { root, context } to every query πŸ‘‡ in every resolver
    notes: (root, args, context) => Post.getNotes(args.postId, { root, context }),
  },
};

That (again) adds a lot of complexity when working in the codebase because one has to remember which queries get passed that and which don't, and that you have to pass them all the time.

That's my logic and why I think permissions should live on a separate layer from resolvers and models. I don't know if permissionMap is the best solution, maybe there's a better one, but I feel like there needs to be a different way of doing this.

smolinari commented 7 years ago

but that's unwieldy because you have to pass the context and root of the resolver down to the model for every. single. query. in every. single. resolver:

Why would the root be needed, when resolving the data? I mean, I can only imagine the user (user context) being needed, similar to what Dan Schafer points out in his talk about how Facebook goes about authorization.

Scott

Aurelsicoko commented 7 years ago

Like @southpolesteve, I saw @mxstbr tweet about this thread. I think the solution given by @southpolesteve is the way to go. However, it is too dedicated to authorization. This new layer should be more generic and be able to execute an array of functions, whatever their purpose.

To give you an example, on a mutation you sometimes have to format the incoming data or check whether an email is available. In this case, you have to execute functions that are not specific to authorization, and these functions have to run before the query/mutation resolver.

IMO, we have two levels:

I kept the exact same example but replaced the authorize key with a new one called policies, passing an array of functions which are executed in series before the resolver. Maybe we should have the exact same thing for executing an array of functions after the resolver, which would bring this solution closest to the pre/post one from @mxstbr.

Note: I haven't tested this code but it's just to give you an idea of how we could implement it.

import {
  defaultFieldResolver,
  GraphQLID,
  GraphQLNonNull,
  GraphQLError,
  GraphQLObjectType,
  GraphQLString,
  GraphQLSchema
} from 'graphql';

import {
  isAuthenticated,
  isOwner
} from './policies';

const User = new GraphQLObjectType({
  name: 'User',
  fields: {
    id: {
      type: new GraphQLNonNull(GraphQLID)
    },
    email: {
      policies: [isOwner],
      type: new GraphQLNonNull(GraphQLString)
    }
  }
})

function policies (schema) {
  Object.keys(schema._typeMap).forEach((typName) => {
    const typ = schema._typeMap[typName]
    if (!typ._fields) {
      return
    }
    // Be able to execute functions before the query/mutation resolver.
    if (typ.policies) {
      typ.resolve = wrapResolver(typ);
    }

    Object.keys(typ._fields).forEach((fieldName) => {
      const field = typ._fields[fieldName]
      if (field.policies) {
        // Be able to execute functions before the field resolver.
        typ._fields[fieldName] = wrapResolver(field);
      }
    })
  })
  return schema
}

// wrap resolver with the policies functions
function wrapResolver (field) {
  const {
    resolve: oldResolver = defaultFieldResolver,
    policies
  } = field

  const resolve = async function (root, args, context, info) {
    // Execute the array of functions (policies) in series.
    for (let i = 0; i < policies.length; ++i) {
      await policies[i](...arguments);
    } 

    return oldResolver(...arguments)
  }

  return { ...field, resolve, policies}
}

const query = new GraphQLObjectType({
  name: 'Query',
  fields: {
    user: {
      type: User,
      args: {
        id: { type: new GraphQLNonNull(GraphQLID) }
      },
      policies: [isAuthenticated],
      resolve: (root, args, context) => context.db.find('user', args.id)
    }
  }
})

export default policies(new GraphQLSchema({ query }))

The policy (function) could look like this:

// ./policies/isOwner.js
export default (root, args, context, info) => {
  if (root.id !== context.currentUser.id) {
    throw new GraphQLError('You are only authorized to access field "email" on your own user')
  }
}

mxstbr commented 7 years ago

Why would the root be needed, when resolving the data?

What about only the author of a post being able to read the notes? (see the OP) Can't do that based on the context and the args alone, you need to have the data for that.

This new layer should be more generic and be able to execute an array of functions whatever their purpose.

I like that idea, rather than being specific to permissions it could be a generic validation layer, no matter if format or permissions or anything else! The API could still stand as discussed above, except renaming it to validationMap?

smolinari commented 7 years ago

What about only the author of a post being able to read the notes? (see the OP) Can't do that based on the context and the args alone, you need to have the data for that.

Hmm... I would have thought the data would be outside the resolver, like Facebook does with DataLoader.

Scott

mxstbr commented 7 years ago

Sure, that might be the case, but I feel like if we have the data why not pass it in and let people decide how to handle it? Also, not everybody might be using DataLoader etc. (it's also a pretty fine detail of a pretty big API πŸ˜‰)

Aurelsicoko commented 7 years ago

I like that idea, rather than being specific to permissions it could be a generic validation layer, no matter if format or permissions or anything else! The API could still stand as discussed above, except renaming it to validationMap?

Yes, it makes sense to use a more generic name. validationMap seems great to me!

Based on my personal experience, it was a nightmare to pass the context into the DataLoader because it only accepts one argument... Despite this, I agree with @mxstbr: if we have the data, we should probably pass it in and let people decide how to handle it.

smolinari commented 7 years ago

Pulling back and looking at this from an overarching 1000 mile up in the sky view, I'm of the opinion that the authorization should be on top of the business logic and not in the API. So, this goes along with what @helfer said in his first post with:

Another approach would be to actually push permissions down to the model layer

My main reasoning for this is based on the assumption that at some point in the evolution of GraphQL (and I'll bet money Facebook is doing it now), I believe the creation of the API will be mostly an automated process. Having permissions mixed in the API layer would make that automation programming a lot more difficult.

This graphic is best practice according to the makers (and how I see it too):

Also, in this great article from @helfer, he says this:

  1. Permission checks are best implemented on nodes
  2. Permission checks are best separated from data fetching logic to make sure they are applied consistently, regardless of how the data is fetched.

I'd also actually improve number 2.

  2. Permission checks are best separated from data fetching and API logic to make sure they are applied consistently, regardless of how the data is fetched or requested.

Which basically means the authorization could be available for any type of API (as the image shows too) and not be tied to data-fetching concerns.

Such "business details" as authorization is opinionation and again, should be avoided in any API. Sure, it means no standard for authorization within (Apollo's) GraphQL implementation. But, that is, again in my opinion, a good thing.

What should first be agreed upon is where authorization should take place or where it shouldn't. I say, it should definitely not be within the GraphQL API layer.

That is just my (last) 2 cents. πŸ˜„

Scott

mxstbr commented 7 years ago

In that graphic you posted there are two layers:

The "persistence layer" is the models; they persist stuff to the database. Pushing the business logic into that layer even seems to be discouraged by that graphic, as far as I can tell.

What I'm trying to figure out here is basically where that "business logic" layer livesβ€”I don't think resolvers are the right place for that, models apparently aren't either, which leaves us with... not much :wink:

That's why I think adding an explicit API to graphql-tools is helpful.

smolinari commented 7 years ago

To me models are only the business logic. Theoretically, a good model system should not know any details about the databases/data sources it connects to. Because one day you might be using MySQL, then realize some other database(s) are even better solutions. You'll notice any good ORM can connect to multiple SQL databases. This abstracted-out connectivity to any database/source is the persistence layer. It is also the layer where a tool like DataLoader would do very well. πŸ˜„

Obviously, we're talking about a lot of abstraction and for a relatively simple system, it is probably a good bit of overkill. However, this separation of concerns is how I believe (and Facebook also teaches) it needs to be, if your system is going to be future-proof.

Scott

AndrewIngram commented 7 years ago

I want to weigh in here, because this has the danger of setting up some hasty de facto standards, and hasty is the last thing we should be when it comes to security.

I've built GraphQL servers on top of existing micro-service platforms, and in new projects where GraphQL and "the platform" live in the same app instance. One thing I've never done is treat the GraphQL server as trusted. I can make every API that my resolvers call into a public one without breaking the security of my system. In the case of the existing systems, this was necessary (all our APIs are externally accessible); in the new systems it just seemed like a best practice. In both types of implementation (micro-service vs monolith), the way I've done the GraphQL server has been identical - resolvers do some relatively trivial unpacking of their parameters, then call a function that's been defined elsewhere as the interface to "the business layer". Everything behind those functions can change as much as it likes, I can switch databases and require no code changes to my GraphQL server. This is good.

Permissioning is a concern that all developers need to think about, and I think most will be nervous about having the outermost layer of the stack be responsible for handling it.

However, there's also talk of pre/post behaviour when evaluating Types. i.e. whenever I have a Product, regardless of how I get to it. There's value in exploring some ideas here, but in a more general way than specifically tackling permissions or validation. I looked into this very thing when trying to find a generalised solution for the look-ahead problem (i.e the final part of what I talk about in this article: https://dev-blog.apollodata.com/optimizing-your-graphql-request-waterfalls-7c3f3360b051). Ultimately I found other ways to do it (I wrapped dataloader), and felt better about not trying to bolt on additional unnecessary API to my GraphQL server, but I still think it's worth playing around with the idea to see what other utility it could bring.

One other parting thought. I believe the sweet spot of GraphQL is largely as @smolinari has pointed out. It takes relatively little effort to get to this type of separation, even in a monolithic codebase, and the benefits it brings for long-term maintainability are real. I think effort would be better spent producing examples of medium-sized codebases that follow these patterns.

mxstbr commented 7 years ago

To me models are only the business logic. Theoretically, a good model system should not know any details about the databases/ data sources it connects to.

You're right, good call!

I can make every API that my resolvers call into a public one without breaking the security of my system.

This is a very interesting point. You're right that we lose this property with the current API suggestion, I agree that's suboptimal. I don't think I agree that this means we shouldn't think about adding an API for this, it just needs to be a different one.

I'd really like to avoid having to write a ton of resolvers just for validation purposes because the default one would suffice normally. (e.g. notes: (root, args, ctx) => validate(root, args, ctx) && root.notes) Even more so I don't want to drop down to the model layer for that. (e.g. notes: (root, args, ctx) => getNotes(root, args, ctx) which just does the above thing, root => validate() && root.notes) That seems like a lot of unnecessary complexity and code. (maybe it isn't?)

I also want an abstraction that keeps validation separate from the general "business logic" layer rather than randomly sprinkled throughout the codebase. I'm pretty sure this makes it easier to figure out what's not being validated correctly/what's been forgotten and also makes it easier to spot bugs, since every responsibility (types, mapping types -> models, database handling, permissions/validation,...) is handled at a specific place.

If I have validations somewhere inside my models (or not), I have to start digging a lot more to figure out which specific LoC is broken.

I don't have any good ideas for other APIs right now, but I think it's still worth finding one?

thebigredgeek commented 7 years ago

https://github.com/thebigredgeek/apollo-resolvers < This has made authorization a very simple matter for me and my team, at least with Monoliths.

That said, when working with microservices, I think it often makes sense to delegate authorization logic to the service responsible for the domain of a given resource, so the GraphQL server should probably only care about authentication and perhaps high-level authorization (is the user logged in or not, what is the high-level role of the user, etc).

I think you can accomplish more granular authentication rules by simply passing user data (such as their ID) to your data layer (models + connectors OR separate services, depending on your paradigm) and letting said data layer decide what the user should and shouldn't be able to do.

Ultimately, I think this conversation is more about how to handle authorization within the domain of GraphQL and not within the domain of the data layer, so I believe a line has to be drawn regarding best practices. The challenges facing GraphQL authorization are congruent with the challenges facing RESTful authorization, with the exception of data traversal restrictions that are necessary in GraphQL but often not with REST. I think we should avoid reinventing the wheel, treat GraphQL as a transport and traversal DSL, and reuse as many paradigms from the REST world as possible.

stubailo commented 7 years ago

@AndrewIngram I think it's good to bring that up - I do think it's valuable to explore what other paradigms might look like, though.

this has the danger of setting up some hasty de facto standards, and hasty is the last thing we should be when it comes to security.

I agree we should make sure that there is some consensus among a wide range of people before this moves into the sphere of being some kind of recommendation or best practice, but I think that always starts with a discussion that considers a wide range of approaches.

AndrewIngram commented 7 years ago

Another thing to consider - Even if you decide not to handle permission checks themselves in GraphQL, you'll definitely want to handle the effects of them.

Much like I prefer to handle validation errors from mutations as part of the schema design itself (I prefer reserving error handling for actual errors, not expected validation failures), a similar argument could be made for failed auth checks.

Let's say we have a screen on our app, and the data is fetched via a static query (as is becoming best practice); there may be some parts of that data that are only accessible to us when we have the right permissions. Schema-wise I think it's messy to just allow those parts of the schema to be nullable, because it forces us to overload what null means. I tend to advocate modelling auth restrictions into the schema itself.

type User implements Node {
  id: ID!
  username: String!
  articles: [Article]
  private: RestrictedPrivateUser
}

type PrivateUser {
  email: String
  secret: String
}

type InsufficientPermissions {
  message: String!
}

union RestrictedPrivateUser = InsufficientPermissions | PrivateUser
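
A rough sketch of what the resolver side could look like for this shape (canViewPrivate is a hypothetical permission helper; the union is disambiguated via __resolveType):

// Sketch only: resolve the restricted field to one of the union members.
const resolvers = {
  User: {
    private: (user, args, context) => {
      if (!canViewPrivate(context.user, user)) {
        return {
          __typename: 'InsufficientPermissions',
          message: "You are not allowed to see this user's private data",
        };
      }
      return { __typename: 'PrivateUser', email: user.email, secret: user.secret };
    },
  },
  RestrictedPrivateUser: {
    // Tell GraphQL which union member was returned.
    __resolveType: obj => obj.__typename,
  },
};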

I appreciate that this is a lot more verbose than just allowing email and secret to be nullable (and I hope that as graphql evolves we gain a more expressive type system to allow things like anonymous unions). But if I were to move those fields onto User itself, a null value for either of them doesn't tell me enough - I'm forced to look at the query error to figure out what the null means. But why is this an issue? Well mainly because I don't consider expected behaviour to be an error. Remember, we have a screen powered by a single static query, a query that works fine for one user might be full of errors for another, and for a permission-heavy app this might mean that a significant amount of front-end complexity is dedicated to pulling things out of the errors object.

Another reason I like this approach, is that it forces the front-end developers to handle these cases, much like we'd be doing if we were using a language like Elm. It's a total pain in the arse, but it leads to better software.

thebigredgeek commented 7 years ago

@AndrewIngram don't you have to implement a lot of duck typing on the client to properly handle whether or not a PrivateUser or an InsufficientPermissions is returned?

AndrewIngram commented 7 years ago

You'd just query for and branch on __typename
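
For example, roughly (a sketch, assuming a user(id:) root field exists):

// The query asks for __typename inside the restricted field,
// and the client branches on it.
const query = `
  query UserScreen($id: ID!) {
    user(id: $id) {
      username
      private {
        __typename
        ... on PrivateUser { email secret }
        ... on InsufficientPermissions { message }
      }
    }
  }
`;

function renderPrivateSection(privateData) {
  switch (privateData.__typename) {
    case 'PrivateUser':
      return `Email: ${privateData.email}`;
    case 'InsufficientPermissions':
      return privateData.message;
    default:
      return null;
  }
}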

thebigredgeek commented 7 years ago

Ah interesting. Have you had any issues working with that kind of model? I've been keeping all of my errors in the errors array, but I've been thinking about placing them in context-specific places for a while. Can you speak on benefits as well?

AndrewIngram commented 7 years ago

It's mainly inspired by patterns I see in more functional languages, where most errors are treated as data rather than as a bailout. It feels like a better way of encouraging API consumers (in our case, client app developers) to properly handle the various ways things can go wrong.

Why model into the schema rather than the errors array? Essentially, you're going to end up with a variety of error types: timeouts, missing data, failed auth, form validation, each of which is likely to want to evolve over time to capture more nuance. A data structure suitable for modelling a timeout is unlikely to be identical to one suitable for modelling a form validation error. When we just throw all of these into the errors array, we're requiring that clients are aware of all the various types of errors and what they look like - otherwise how do they pull the data out? You end up with your own ad-hoc mini schema just for errors, but largely separate from the very technology aimed at allowing client-data coupling to be done cleanly - i.e. GraphQL.

This is why I prefer the errors array to represent errors within the GraphQL server itself, rather than errors exposed by the underlying data model.

Drawbacks? It's initially a lot more work to do it this way, the downsides of using the errors array aren't initially obvious (and you may reach the point where it's an issue). It can also be awkward to have to wrap scalars in object types just so they can be used in a union. If you want to move fast and break things, you're going to be very resistant to the approach I'm suggesting. I've mainly only tried this with toy problems, where it's the most painful to implement (I don't have a large suite of error types built up in my schemas), so my feeling that it gets easier over time is based on inference rather than real-world experience.

helfer commented 7 years ago

To get back to the original conversation about authorization, I think there's a lot of value in having a clear place where permissions checks happen. I agree that they're only one kind of query and parameter validation, but they're an extremely common kind so I think it helps to give guidance around that.

@AndrewIngram 's point about many applications already having authorization in some backend or other is certainly true, but I think we shouldn't take architecture choices made prior to the addition of GraphQL as evidence that this is the right thing to do in GraphQL. I would argue that the opposite is true. Taking GraphQL as a given (which makes sense for a repo called graphql-tools), we should figure out a good way of doing authorization.

As mentioned earlier in the conversation, I think it makes sense to check permissions in a layer that's separate from resolvers, and also separate from loading data from the backends. Why? Because that gives you maximum flexibility when it comes to caching. Writing a permissions-aware caching layer is (in my humble opinion) a fool's errand. Assuming that we want the cache to live as close to the edge as possible for performance reasons, the cache should be in or near the GraphQL layer, and permissions should be checked after retrieving items from the cache (with some exceptions where it's trivially possible to determine permissions without materializing the object). Strictly speaking this would mean pushing authorization to the model layer, but I think there are plenty of GraphQL servers which won't need a lot of business logic and which can do validation / authorization in a simpler way. I think the ideas that @mxstbr had would be a great fit for that kind of application.

But rather than just talking about it, I think it's most useful if we try out our different hypotheses with a couple of code examples like @mxstbr 's PR #315.

AndrewIngram commented 7 years ago

To clarify, my points aren't based on the assumption of building GraphQL on top of an existing system, they could apply equally to a new system. It's more based on the assumption that GraphQL isn't the exclusive route for data access.

What i'm arguing against is an explicit API for permissions (or input validation, but I find it easier to buy in to that one) in graphql-tools. @mxstbr's solution could easily be implemented as a generic preResolve and postResolve API, which brings all the benefits, but none of the connotations that this is the one true way to handle access control.

mxstbr commented 7 years ago

I went ahead and created a rough implementation of the API that was suggested above in #315. This is what it looks like:

# A written text by an author
type Post {}

type Query {
  # Get a post
  post(id: ID!): Post
}

schema {
  query: Query
}

makeExecutableSchema({
  validations: {
    Query: {
      // Returning false here will not execute the resolver when post is queried
      // Can also return a Promise if necessary
      post: (root, args, context, info) => true
    }
  }
})

Note: it only stops executing the resolver, but doesn't return an error to GraphQL just yet because I haven't figured out how to do that.

I think that works pretty well with what @AndrewIngram is saying, since it's not explicitly for permissions, it's just generally "Do you want to execute the resolver at this field yes/no", right?
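
On the error part, one possible approach (just a sketch of the wrapping, not what the PR currently does) would be to throw from the wrapped resolver so graphql-js reports it:

// Sketch: if the wrapper throws when a validation fails, graphql-js adds the
// error to the response's `errors` array instead of silently skipping the field.
function wrapWithValidation(validation, resolve) {
  return async (root, args, context, info) => {
    const allowed = await validation(root, args, context, info);
    if (!allowed) {
      throw new Error(`Permission denied on field "${info.fieldName}"`);
    }
    return resolve(root, args, context, info);
  };
}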

Aurelsicoko commented 7 years ago

@mxstbr I really like this approach. It doesn't mean "this is the guideline for handling permissions"; it's just a new feature provided by graphql-tools to decide whether or not to execute the resolver. More than that, people who don't want to use this pre/post feature to handle permissions could easily use other patterns without feeling that they are not following the guidelines.

It's generic and could be used for other things such as input validation, etc.

AndrewIngram commented 7 years ago

@mxstbr shouldn't the example be for pre rather than post? I'd assume with pre we're asking "should I continue resolving?", whilst with post we're asking "I have the data, do we need to do anything with it before returning?"

mxstbr commented 7 years ago

This runs before the resolver; check the code in the PR. I haven't implemented anything that runs after the resolver yet, as I'm not 100% sure what the API for that should look like.

mxstbr commented 7 years ago

I just realised what @AndrewIngram is referring to and it's my bad for a horrible example! post in the example above refers to a defined query, not the .pre/.post thing:

# A written text by an author
type Post {}

type Query {
  # Get a post
  post(id: ID!): Post
}

schema {
  query: Query
}

I've updated the post above to include the schema!

AndrewIngram commented 7 years ago

Oh, haha!

JeffRMoore commented 7 years ago

Wow! There is a lot to follow here and some pretty cool ideas.

I don't think authorization will ever be a "solved" problem, nor should it be. My experience is that it's a good way to tip a library over into being a framework. I mean that in a negative sense.

And yet, both validation and authorization are extremely important and there needs to be a good story or stories around them.

I advocate that there are three basic approaches to validation in APIs (try and fail, dedicated API, and metadata) and that any sufficiently complex API will sprout all three forms.

I think the same can happen in Auth. I could easily see a schema driven auth system that is completely applied as a validation on a query AST. I could also see a schema driven system that lives as resolver decorators, or in the business logic backend. Or imperative checks, hand coded into resolvers, or somewhere in the interface with http. Or all of the above.

Auth can be really simple. Or it can be a nightmare.

Rather than adding auth-specific behaviors, I'd like to see the tooling around extension points in the query process become more robust. I really like the idea of expanding graphql-tools as a way to do that. I think I'm agreeing with @thebigredgeek on that.

Maybe a solution lies in some sort of support for directives in graphql-tools, where directives in the schema can help choose a composition of resolvers specified separately from the schema as a parameter to makeExecutableSchema?

That might help reduce boilerplate, help make the code more auditable, but not impose a specific approach, leaving that to plugins.

JeffRMoore commented 7 years ago

I'm not sure there is a need to distinguish between pre and post, only the order in which resolver middleware functions are called. Pre or post is a feature of the actual implementation.

Fleshing this out with some hasty code:

# I want posts to only be readable by logged-in users
type Post 
@isLoggedIn()
{
  id: ID
  content: String
  author: User
  notes: [String]  # I want these to only be accessible by the author
  @isAuthor()
}

makeExecutableSchema({
  ...
  directives: {
    isAuthor: () => { ... },
    isLoggedIn: () => { ... }
  }
})

A pre- resolver middleware:

async (request, next) => { pre(); return next(); } 

A post-resolver middleware:

async (request, next) => { const result = await next(); return post(result); }

This presumes the function signature of resolve has been turned into a single request object.

The order should just be the order in which they are declared.
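
A tiny sketch of how such middleware could be composed in declaration order (the request shape and helper names are hypothetical):

// Compose resolver middleware in declaration order. Each middleware has the
// shape async (request, next) => result, where `request` bundles
// root/args/context/info and `next` invokes the next layer (or the resolver).
function applyMiddleware(middlewares, resolve) {
  return middlewares.reduceRight(
    (next, middleware) => request => middleware(request, () => next(request)),
    request => resolve(request)
  );
}

// Usage sketch: the first declared middleware runs outermost.
// const resolveNotes = applyMiddleware([isLoggedIn, isAuthor], request => request.root.notes);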

thebigredgeek commented 7 years ago

@mxstbr my main concern with doing something like

makeExecutableSchema({
  validations: {
    Query: {
      // Returning false here will not execute the resolver when post is queried
      // Can also return a Promise if necessary
      post: (root, args, context, info) => true
    }
  }
})

is that I now have to define a separate deep permissions object that looks almost identical to my resolver structure and then keep those two things in sync with one another. I can already foresee headaches with this in my own personal process (albeit, I have really bad OCD with things that feel duplicative in structure haha). I can imagine myself incorrectly mapping things by accident. In my opinion, if ACL is going to be handled by GraphQL it should probably be colocated / composed with the actual resolvers OR types.

@JeffRMoore I REALLY like the directive approach. I can't stress enough how happy doing something like that would make me feel. However, I am wondering how easy composition would actually be. It seems like I might end up with 100 different ACL directives as my codebase grows? It seems like it might actually be difficult to make the directives highly reusable between different types. If I can't reuse them, it seems like they are simply syntactically cool (really, it looks dope) but without adding much benefit?

smolinari commented 7 years ago

After thinking about this some more, I can't stress enough how bad of a mistake I believe it would be to inject a permissions system inside of a GraphQL API. It's simply the wrong place for it. It is badly shortsighted. I'll give some reasons and yes, I am sort of repeating myself from above. But, it seems my last comment didn't make much of a dent in the contemplations here. Here are the reasons why GraphQL should never see or know about permissions:

  1. Because permissions are business logic. In a business, one or more people FROM the business are going to be making the decisions on who is going to see and do things in the system. And that means, it shouldn't take a programmer to change permissions. The decision to put permissions in the API would mean a never ending story for the programmer. He or she or they would be the opposite of the 10X programmer. When a type needs to be declared, the programmer has to also run to the business or remember to always ask, "who should have access to this data?" OMG! Programming should always be fire and forget and the goal should be to never have to touch the programming again and certainly not for permission changes. Ok, the creation of the API might be automated too, so see point 4.

  2. There are different depths of permission needs. Some organizations might want a simple system. Some might want a very, very complicated system with very high access control fidelity. Any solution found MUST account for ALL of those permission use cases. That is another bad headache to try and solve at the API layer level. It means it is on the Apollo GraphQL community's back to make sure all the possibilities are covered and that is a responsibility we really, really shouldn't have. At least not now and with GraphQL. We've got other much more pertinent problems that need solving. Making the internal API for permissions something abstract so permissions may or may not be added isn't the answer either. Or rather, if there is a need for this internal API, it shouldn't be for permissions, period.

  3. Some use cases of a system might employ multiple APIs (like in the image from Facebook I offered above). This kind of multi-API system might also happen through evolution. Introducing permissions inside GraphQL might seduce start-ups to do their permissions inside their GraphQL API. When they decide to go multi-API, they will be quite a bit screwed. In other words, injecting permissions into the API closes the door or causes headaches for the use of other APIs. That alone is a no-go.

  4. The argument or thinking might be, it is just a schema. It is metadata. So why not inject permission information there too? Ok. But, at some point the GraphQL API creation will have to be automated, simply because it can and should shadow the true model graph of the business, which is also created in the business logic. It is duplication of sorts. In other words, the model of the business will be created first, within the business logic layer (and in most cases again, with the least bit of programmer help, i.e. through a smart backend) and the API should only represent an automated gateway into that logic. Adding permissions into the API means, whoever has to build this automation will have to include the authorization logic too, which is already in the business logic!!! That would be duplicated and thus unnecessary work.

I am so against this idea now. It is simply a bad one at too many levels, when thinking in longer more evolved terms. And, these evolutions will happen for sure. During the F8 talks, one of the devs mentioned that FB's GraphQL API is automated (in a roundabout way) as I had guessed. I had a good internal giggle, when she said it. I thought to myself, "Ha! I knew it!". πŸ˜„

Facebook knows what is right or rather smart, because they have already evolved way further than what we are seeing or know ourselves. And again, Facebook suggests to push permissions to the business logic level. I wholeheartedly agree. I think we should avoid talking about adding anything to do with permissions in GraphQL directly, at all.

I also believe the word "resolver" causes beginners or inexperienced devs to think that is where the data fetching from the source should occur. That is incorrect thinking. There must be a layer of business logic in between the resolver and the data sources. In other words, if anyone is putting REST calls or database queries in a resolver, they are doing GraphQL completely wrong from the start or rather, they are building up a shortsighted and problematic to evolve API. Adding permissions into the API layer is along those same lines of thinking.

What I'd like to see from Apollo is a smart business logic layer, which plays really well with a GraphQL API. And I'd be glad to help with the plug-in-ability of permission system add-ons. πŸ˜„ We should be working towards a GraphQL platform and not a Leatherman kind of "all-in-one" API layer. That won't get us very far in the long run.

If anyone wants to continue with suggesting the API should have permissions, then please make arguments that out-trump my arguments. Please tell us how doing permission in the API will be good for everyone and in what ways. In other words, we shouldn't be discussing solutions at all at this point, but rather, we should first discuss the smarts, the advantages and disadvantages, behind the decision to add permissions to Apollo's GraphQL API.

Scott

helfer commented 7 years ago

I like the idea of middlewares and directives as well. I didn't have time to read in detail, but just wanted to point out that directives come after the thing they apply to, so the example from @JeffRMoore 's comment would have to be written as follows:

# I want posts to only be readable by logged-in users
type Post @isLoggedIn {
  id: ID
  content: String
  author: User
  notes: [String]  @isAuthor # I want these to only be accessible by the author
}

JeffRMoore commented 7 years ago

@helfer Thanks for the clarification, I updated my prior comment.

@thebigredgeek I thought you would like it. It fits with the architectural style you've been promoting (which I like very much). As for your concern about combinatorics, I don't think that will happen. Or at least, if one finds oneself running into that issue, it's probably an indicator of something else wrong. I did several authorization iterations for an enterprise SaaS product. Each one ended up simpler and faster, and in the last, we were able to accommodate almost everything with less than a dozen primitive patterns. (That system was schema based, but auth information was represented as boilerplate rather than metadata. Next time around, I would do metadata.)

Let me clarify with some types. Again this is hasty. Tell me if I'm screwing up.

type Post 
{
  id: ID
  content: String
  author: User
  notes: [String]  @restrictToAuthor()
}

makeExecutableSchema({
  ...
  directives: {
    restrictToAuthor: (params: GraphQLDirectiveParameters): GraphQLResolvingMiddleware => {
      return restrictToAuthor;

      async function restrictToAuthor(request: GraphQLResolvingRequest, next: ContinueMiddleware): mixed {
        if (!AuthorCheck(/* whatever this looks like */)) {
          throw new ValidationError('Denied');
        }
        return next();
      }
    }
  }
})

Because the directives can have parameters, each time the directive appears in the schema, a distinct resolving function is created based on those parameters. That will kill a lot of combinatorics.

This solution lets you implement a lot of different kinds of things, like validation, authorization, or even ORM or backend fetching hints.

There are a lot of devils in the details of what GraphQLResolvingRequest looks like. Experimentation with performance implications is warranted.

If those details pan out, it may make sense to move that upstream into graphql-js. GraphQLResolvingRequest is a stronger concept than GraphQLResolveInfo. The benefit of enabling middleware may warrant a BC break someday. Until then, an adapter is needed.

thebigredgeek commented 7 years ago

@JeffRMoore I like it. Especially now that you've mentioned that you've iterated towards this pattern (with enterprise, no less! haha), I am sold on the DX / code cleanliness aspects. I am also curious, as you said, about perf impact. It might be nice to prototype this approach and benchmark it against non-authenticated resolvers, using sync authentication functions, just to see what kind of initial overhead this pattern would introduce. With async, I think the pattern of using something like redis or memcached (just like with every other type of API when it comes to highly-accessed data) is the right solution. Were there any particular cases that make you feel concerned about perf?

mxstbr commented 7 years ago

I like that approach a lot @JeffRMoore! Definitely much easier than duplicating the type tree again. Can that live in graphql-tools?

smolinari commented 7 years ago

I'd like to point out a principle of introducing any permissions or access control to a system. When you set such a system up, it must always offer the ability to deny all access first, to then give access, only where it is applicable. Please think about that, when coming up with any permission or access control solution within the GraphQL API, if it absolutely has to be done.

Scott

AndrewIngram commented 7 years ago

@JeffRMoore can we dive into your solution a bit, because whilst I like the idea of directives as an API for a meta-schema, I'm skeptical that this is the right use for them. The main benefit seems to be that it's an aesthetically-pleasing API.

The first issue is that I'm struggling to see how you'd have a generic directive for @restrictToAuthor. It seems like it'd either depend on a consistent pattern in your data, so that author is always accessible via the same field on an object (possible, but too magical for my tastes). Or it would need to be parameterised eg @restrictToAuthor(authorField="author"), but again, this feels like tying the schema to the data, and there's no contract that says the author field would be available before you're trying to resolve notes.

There's also the issue that this still feels like the wrong place to be defining this restriction. What if we want admins to be able to access everyone's notes, we now need to change our implementation at multiple layers of the stack.

So far, @helfer's concerns about caching are the main compelling argument as to why a GraphQL server might need to have some kind of permission check functionality. But this also seems to be highly implementation-specific. I'd be very interested to hear how Facebook has solved distributed caching in GraphQL without messing up the nice clean architecture.

smolinari commented 7 years ago

Facebook doesn't let the request through, if the "viewer" doesn't have the proper permissions. The application caching happens after permission checking. Dan Schafer did a presentation about Facebook's way of doing cache and authorization (roughly).

https://www.youtube.com/watch?v=etax3aEe2dA

Of course, he says clearly that what he presents isn't necessarily THE way to run a GraphQL server. But, it certainly is smart to me. He also says the "thin" GraphQL API has turned out valuable to them. πŸ˜‰

Scott

JeffRMoore commented 7 years ago

@thebigredgeek I was thinking mostly of the overhead of constructing the GraphQLRequest object and the closures for applying the middleware chains. It's possibly not a big deal. Something to double check, nothing more.

@AndrewIngram, in the schema, it's just metadata. It's not until you supply the functions with GraphQL Tools that it becomes an implementation. That's the purpose of GraphQL Tools, no? To supply a resolving implementation, separating the description from the implementation?

As for whether that auth information should be co-located with the schema? Should GraphQL queries be co-located with their components or be separate? Should html be co-located or separate? It's just another application of the principle of separation of concerns.

If for a particular case it doesn't make sense to co-locate auth metadata with the schema, don't. If it does, do.

I'm pretty sure there's no one right way.

The change cases @AndrewIngram mentions bring up an important point. Maybe there are some BC implications with evolving the API? How does one preserve the "no versioning" convention and evolve the authorization design?

I've used this pattern with validation and it works very well. Think name @maxLength(length=31). The challenge with validation is that you can only make the rules looser, not tighter. If I change to length=63, no problem. If I change my declaration to length=15, any data that has been previously accepted may not be valid by the current rule declaration.
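
Under the middleware idea sketched earlier, such a parameterised directive might look like this (a hypothetical factory, not an existing API):

// Sketch: a directive factory that turns its declared parameters into a
// middleware instance, so each appearance of @maxLength(length=...) in the
// schema gets its own check built from those parameters.
const directives = {
  maxLength: ({ length }) => async (request, next) => {
    const value = await next();
    if (typeof value === 'string' && value.length > length) {
      throw new Error(`Value exceeds maximum length of ${length}`);
    }
    return value;
  },
};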

Changing validation rules often caused unexpected consequences. For example phone number going from optional to required. Now, what should we do with the user records that have no phone number? Especially when they edit their account info? Should we make them enter a phone number to change their email (also on the same record)? Frustrating. Should we allow partially valid records? Dangerous? Should we only validate changed fields and not consider the validity of the record as a whole? Now we're pushing conditionals into downstream code. We used a variety of approaches to handle this including data migration and versioning the record, and rules like "required only on create or if already specified."

There was a lot of value for us in making constraints a part of the meta-data description of the API. We could do things like user defined fields and apply validation rules in multiple contexts with different implementations. I would want to use this pattern again.

Is there value in the auth realm? Possibly. I might be able to use that information to hide or allow operations rather than allowing them to fail. Or rather than duplicating a hard coded check into the client code, a check that would have to be maintained in parallel with one on the server side, use the metadata to apply the same check both on the client and the server.

JeffRMoore commented 7 years ago

@smolinari That video is good.

Think of "Viewer" as "Authorization Context." For Facebook, that's Viewer. For other APIs, it might be different. In the Facebook example, the action "see" is baked into the function name. That's a part of the concept of authorization context and might be part of a parameter to check authorization instead in a generic authorization library. The action could be "delete" for example or even more application specific. Other environmental factors might be considered part of the context, like the requesting IP address or even time of day. "Joe is authorized to change grades only from an on-campus IP and only during office hours."

They also separate the point of policy enforcement (CheckCanSee) and the point of making the decision (the implementation of CheckCanSee). The example is a really minimal implementation of an important abstraction. This way, the resolver implementation should not change when the authorization policy changes.
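
In JavaScript terms, that separation might look roughly like this (a sketch with placeholder names, loosely following the pattern from the talk, not Facebook's actual code):

// Policy decision: lives in one place and can change without touching resolvers.
async function checkCanSee(viewer, post) {
  return post.isPublished || post.authorId === viewer.id;
}

// Policy enforcement point: the model layer fetches, then applies the check.
// `db` is a placeholder data source.
async function genPost(viewer, id) {
  const post = await db.posts.get(id);
  if (!post || !(await checkCanSee(viewer, post))) return null;
  return post;
}

// The resolver stays thin and unaware of the policy details.
const resolvers = {
  Query: {
    post: (root, { id }, context) => genPost(context.user, id),
  },
};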

Love how they build auditability into the structure of the system. Really important.

Isn't FB code-generating their resolver implementations?

Schema decoration is compatible with this approach?

smolinari commented 7 years ago

Why don't we just build all of the business logic into the API's abilities? Thick resolvers, thin or even no model. Yeah. That would be so flexible and smart. Right?

The answer from everyone should be an emphatic, "Whoah Scott! No. no. no. Not at all."

Before I continue, I hope we can all agree on one thing. Authorization (or even validation) is business logic. Right?

So why are we discussing authorization as a part of the API? Where is the line drawn or where should it be drawn? FB obviously drew the line very close to the API and said, "let's keep the API layer as thin as possible. There will be very little to no business logic in it! Period!" Why are we fighting this "way of doing it"? Convenience? That is a stupid reason for doing anything and I am only guessing that is the reason. Please don't confuse convenience with simplicity. They aren't always the same and usually aren't. You must all also agree, adding business logic into the API isn't simple either.

What other reasons are there for adding business logic in the API layer? I have yet to hear one plausible reason except for "it's a natural fit". I've offered plenty of reasons why adding such logic into the API itself is a bad idea.

If Apollo wants to build out the server (and they should!), it needs to happen after the API layer and completely separated from it. That is my 2 cents on this (obviously). The "backend" should be a simple as possible set of plug-in-able interfaces for different business logic systems, including but not limited to validation and authorization systems. That's all that is needed. The rest should be done (and has been done to certain extents) by the rest of the Node community.

As we are speaking, teams are nailing these things together. Look at GraphCool. What Apollo should be saving us from is the work of nailing the system together. It should be "server glue".

And yet the system must still be way open to customization and evolution. Think about this. If I have a simple role based authorization system and I decide I want to add sharing rules, peppering the API with authorization directives is going to make my life a nightmare and so much so, I am probably never going to change it and thus, such a solution is clearly a dead end street. It's not good customizability, if that might have been thought of as a benefit.

I've been ignored for the most part in my views here (it seems), and that means either my argumentation is so wrong I'd look foolish if someone pointed out the obvious thing I am not seeing, and no one wants to make me look that foolish. No worries. I've been badly wrong before and I would love to be corrected or be wrong too. That only means I have a chance to learn.

Or the other possibility is, nobody has any decent argumentation on the benefits of peppering the API with business logic (no matter what logic it may be). Sorry if that sounds a bit sarcastic, but from my current 1000 mile high view of this, that is what the suggestions given look like to me. Maybe that is my different viewpoint here. I am not an in-the-trenches developer. I am more a business manager or rather that is my real background.

Let's also be clear. If I change any constraints to an application, i.e. max number of characters for a field's size or permissions to a field or function, it isn't API breaking, it is application breaking or rather application constraining. In other words, these BC or constraint contemplations have nothing and should have nothing to do with the API's evolution and thus, isn't argumentation for or against having business logic in the API. The API is (or should be) working properly, no matter what I decide in terms of business or even programming constraints, and no matter which kind of API I have. This also shows, these concerns don't belong together.

Remember SoC? Why are we forgetting this generally accepted as good to follow rule of programming now?

Thanks @JeffRMoore for bringing the discussion more up to the level where I feel it should be. Again, we should be discussing the merits and advantages of adding business logic into the API layer. I don't see any currently, in fact, I see it as a really bad decision. Facebook didn't see any either.

What is the return on investment for adding business logic in the API layer? I've pointed out some risks.

Please answer that question for me and anyone else investing time and effort into any solution going forward, and not only as a solution to be worked on, but as part of the workflow for future developments using this API system. Because, without clear answers to that question, that work might become a complete waste and probably will be, because this direction wasn't properly thought out at the business level. And we certainly don't want people wasting their time, do we? πŸ˜„

Oh, and the argumentation "you can take it or leave it, if you want" isn't one. πŸ˜‰

So, what are the benefits and advantages, the pros, for this direction/ suggestion?

Scott

helfer commented 7 years ago

@smolinari about your earlier comment: I think you're misunderstanding the presentation by Dan. The caching that he talked about is client caching. Caching on the server happens below the permissions layer and below the model layer. In FB's case, the gen function always fetches from a caching layer, which forwards the request if necessary.

As for your most recent post, I'm not going to try and address every point (I'm not even sure what the main ones are), but I'd like to point out that most people aren't going to be running stacks with 5 or more layers like Facebook. The most common case will be either GraphQL directly to database, or GraphQL to REST microservice/monolith to database. In the first case, which I think is not unreasonable, your business logic is going to run alongside or in GraphQL. If having a simple and declarative way of defining validation rules and permissions is the most expedient way for someone to build their app, I don't think we should get in their way. Quite the opposite, I think we should make it easier for them.

In many cases, the perfect is the enemy of the good, and it sounds like your argument against most of the ideas in this thread is that they will break in some hypothetical future where you want to do something else. The fact of the matter is that if you worry too much about coding yourself into a corner, you're never going to write any code. I agree with you that we should avoid horrible mistakes, but I don't think you've made good argument why middlewares or adding metadata to the schema would be a horrible idea.

helfer commented 7 years ago

Just to be clear, I personally think that for a greenfield GraphQL server the best way to build it is to have almost no logic in resolvers, do all fetching through a model layer which checks permissions before or after fetching from the cache (which can be a mix of per-object or per-field caching), and which delegates data loading to data loaders (which I sometimes call connectors).

For GraphQL servers built on top of existing REST APIs or microservices, I don't think those rules necessarily apply, so it might very well make sense to do things differently.

I also think that the graphql schema is a great starting point for developing an app and could be used to get a proof of concept off the ground extremely quickly (You could check permissions and do data fetching purely with directives, or you could use the schema and directives to generate code in the model layer). It might well be possible to build something solid enough for production, so I think rather than arguing against it for philosophical reasons, we should just try it out and see how it works.