Expose user-defined meta-information via introspection API in form of directives

OlegIlyenko commented 7 years ago

With growing popularity of IDL as a way to define a GraphQL schema, I think it would be quite beneficial to expose directive information via introspection API.

From what I can tell, the directive information is the only missing piece of information that is not obtainable via introspection API. For example in this schema definition:

type User {
  id: ID!
  name: String
  admin: Boolean! @important
}

type Query {
  user: User
}

@important directive is only available at schema materialization time, but discarded as soon as schema is materialized from AST definitions.

One can see directives as a way to instruct the server to construct the schema in a specific way. But I would argue that directives have also a lot of potential as a way to expose additional meta-information about the schema. This may include things like: field cost/complexity (the use case I'm quite interested in), auth information, caching characteristics, and in general any kind of structured meta-information that is specific to a particular application or subset of applications. I think a lot of interesting use-cases and scenarios can emerge as soon as this meta-information is exposed via introspection. I can also imagine community working together on a set of common directives that may be adopted by client tooling and frameworks like apollo and relay. These common directives (exposed via introspection API) may provide deeper insights on the schema, relations between fields and objects, etc.

I described this feature in context of IDL, but in fact it's quite independent from it (though integrates with it very naturally). I was thinking about different ways how this kind of user-defined meta-information can be exposed via introspection API and I feel that directive-based approach is the most natural and well integrated way of doing it.

I would love to hear your opinions on this topic!

calebmer commented 7 years ago

field cost/complexity (the use case I'm quite interested in)

I had some experiments on this that I didn’t release. I’d love to hear your thoughts 😊

I agree that being able to put arbitrary information into introspection is incredibly powerful, but I don’t think that we should be translating directives one-to-one into the introspection. Directives are meta instructions for the tools which consume the IDL. Making them a first class part of introspection reveals too many implementation details. It would also be very tough to type the directive arguments well.

I’d rather see tools that can interpret directives and then translate that to fields in the introspection 😊. For example, a server could:

extend type __Field {
  important: Boolean
}

…and then no matter where you define your schema whether it be in the GraphQL IDL, GraphQL.js, or some other GraphQL server framework this flag can be set.

I don’t like the idea of making the IDL the one source of truth when creating a GraphQL server, but I do really like the idea of allowing users to extend the introspection types with arbitrary extra information.

smolinari commented 7 years ago

I don’t like the idea of making the IDL the one source of truth when creating a GraphQL server

Amen!

Scott

OlegIlyenko commented 7 years ago

I don’t think that we should be translating directives one-to-one into the introspection

I agree, the subset of directives that are used in the IDL may be completely different from subset of directives that are exposed via introspection (they may not even overlap)

I don’t like the idea of making the IDL the one source of truth when creating a GraphQL server

100% agree on this one. The whole idea is quite unrelated to IDL schema definition. Though if meta-information is exposed in a directive format, then some interesting scenarios can emerge. For example this end-to-end scenario falls naturally out of it:

[Image: gateway]

"Internal Service 1" may use a completely different set of directives at creation time than the directives that are exposed via introspection to assemble the "Gateway" IDL. But using directives is quite convenient since they are easily translated to/from IDL.

IDL aside, directives have an advantage that they are also easily introspectable through the API. But in general I don't have a very strong opinion on the actual meta-information format. My main motivation is to somehow expose additional user-defined meta-information through the introspection API.

Though I have a concern about the format you proposed:

extend type __Field {
  important: Boolean
}

If it is not defined with IDL, then the server implementation somehow needs to provide a way for the user to define additional meta-fields on all GraphQL schema-related classes and the type information about these fields somewhere on the schema. I think it can become a bit complex, especially for type-safe languages, also considering that with directives one can already cover the second part (directives already provide the type definitions for this meta-information, so there is no need to introduce a new concept for it)

smolinari commented 7 years ago

I think I know where you are heading with this, and I agree wholeheartedly, however, the solution isn't in GraphQL's own metadata injection system. Trying to extend it to cover more business use cases is the wrong direction. Up to this point, I've heard suggestions on authorization, validation, and of course, the data modeling itself (since it is part of GraphQL it is why so many are looking to GraphQL solutions to solve business domain problems).

I am going to go out on a limb here. The way I see it, Facebook has offered us a really cool way to build a gateway into our data. However, I am almost certain, they are only telling a partial story. I am convinced that they are doing metadata based development, where the metadata is the business logic itself, and GraphQL only offers (to those who should see it) access to that particular kind of data. When I see Lee Byron push back on suggestions like this and others, it is sort of dawning on me that Facebook is coming from another world of thinking and IMHO, it can only be metadata driven development.

Why is metadata driven development good? Because it puts the power of any change to business logic in the hands of the business.

In other words, once the metadata is set and known, then getting the business model (the domain model) together, programmatically, is a matter of building it from the metadata. Tools can be offered to non-programmers to change the metadata. The same build-from-metadata goes for GraphQL endpoints too. In other words, metadata is the driver, not GraphQL schema. From the metadata, it would be a matter of translation into definitions for GraphQL, protobuffers, etc. The single source of truth is then the one set of metadata.

So, I guess what I am trying to say is, instead of trying to stuff all kinds of metadata inside GraphQL, we should be thinking about how we can let the metadata drive defining GraphQL schema.

Scott

rmosolgo commented 7 years ago

:+1: I like the idea, I've had half a mind to implement it in Ruby anyways, since the IDL isn't showing signs of ever leaving RFC stage 😆

Thanks for sharing those thoughts about metadata-driven development. That's something interesting to think about, as the Ruby library grows, the Ruby API for schema definition is becoming more a hindrance than a help.

My thought has been to make the GraphQL schema be the metadata. Otherwise I have to invent yet another metadata structure which maps to GraphQL 😖

rmosolgo commented 7 years ago

I worried about portability, since different schema parsers might handle these inputs differently, but I thought I could just include a function to parse a schema then dump it without the custom directives.

smolinari commented 7 years ago

the Ruby API for schema definition is becoming more a hindrance than a help.

Yeah, it seems many people would like to turn their GraphQL system into a "God API", whereas it clearly should only be a relatively simple, but totally awesome gateway into the business logic layer of the application.

My thought has been to make the GraphQL schema be the metadata. Otherwise I have to invent yet another metadata structure which maps to GraphQL.

Yes, but the metadata can be the source of truth for the whole application (or applications), including the API. Think about validation, authorization, workflow, models, and whatever else is business driven. And, your answer tells me you are also still thinking in the wrong direction. The GraphQL API would be modeled after the metadata, not the other way around. 😄

Loopback does something similar to what I am talking about with its "API modeling" according to the modeled data.

Scott

OlegIlyenko commented 7 years ago

@smolinari you brought up some very interesting points. Though my original intention was more about exposing additional information, rather than a way to aid the data modeling. I would definitely agree, directives indeed expose domain concerns. Even if we generate GraphQL schema based on some other data modeling tool, I think it's still very helpful to be able to expose some meta-information via introspection API. Let's stick to this example with a gateway. Recently there was a great video by @bryan-coursera from Coursera on this topic. In particular, I found the "Linking the resources" part quite interesting:

https://youtu.be/r-SmpTaqpDM?t=1479

If I understood it correctly, their internal services expose additional meta-information about relations between different models. I think directives can be quite useful in this respect for assembler/gateway service. For example schema of 2 internal services can look like this (I used IDL for demonstration, but it would be accessed via introspection in the actual app):

# "courses" service

type Course {
  id: ID!
  name: String
  subscriber: [ID] @reference(service: "users", rootField: "users", type: "User")
}

# "users" service

type Query {
  users(ids: [ID]): [User]
}

Gateway service then will discover these schemas via introspection API and expose Course type like this (with knowledge on how to resolve it correctly and efficiently using 2 other services):

# "gateway" service

type Course {
  id: ID!
  name: String
  subscriber: [User]
}
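The rewriting step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the dict layout (`directives`, `args`) and the helper name `rewrite_field` are hypothetical, since no current introspection API returns applied directives in this shape.

```python
# Hypothetical shape of introspected field data, assuming applied directives
# were exposed; this is NOT part of the current introspection API.
courses_fields = [
    {"name": "id", "type": "ID!", "directives": []},
    {"name": "name", "type": "String", "directives": []},
    {
        "name": "subscriber",
        "type": "[ID]",
        "directives": [
            {
                "name": "reference",
                "args": {"service": "users", "rootField": "users", "type": "User"},
            }
        ],
    },
]


def rewrite_field(field):
    """Return the field as the gateway would expose it, plus a resolution plan.

    The plan is None for fields that carry no @reference directive.
    """
    for directive in field["directives"]:
        if directive["name"] == "reference":
            args = directive["args"]
            # Keep list wrapping: [ID] becomes [User], ID becomes User.
            is_list = field["type"].startswith("[")
            new_type = f"[{args['type']}]" if is_list else args["type"]
            plan = {"service": args["service"], "rootField": args["rootField"]}
            return {"name": field["name"], "type": new_type}, plan
    return {"name": field["name"], "type": field["type"]}, None


gateway_fields = [rewrite_field(f) for f in courses_fields]
for exposed, plan in gateway_fields:
    print(exposed["name"], exposed["type"], plan)
```

The plan tells the gateway which service and root field to call with the collected IDs, so it can batch the lookup instead of resolving each subscriber separately.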

When it comes to data modeling, I think GraphQL IDL syntax can be a very good candidate for it. Over the years I saw quite a few tools and formats to declaratively model the data and the domain. Though it looks like there is no tool that has seen very wide adoption. I feel that MDD (Model-Driven Development) has its own set of challenges. I saw it taken quite a bit too far (in quite big production setups) where it becomes a real pain to deal with (instead of working application code, people are writing generator/tool code which adds additional layers of indirection and frustration). I feel that declarative data modeling fits quite well where the domain itself is very well established and understood.

Recently I saw several examples where GraphQL IDL is used in very interesting data modeling scenarios. First example is graphql-up. Given this IDL type definition:

type User {
  id: ID!
  name: String
}

It will create a GraphQL schema that will contain User input/output types, relay API to read users and create/update new one, etc. So the IDL that you provide to graphql-up and a GraphQL schema that you end up with are very different. Using GraphQL IDL syntax to model the data in this case (actually any other syntax/language will do the trick in this scenario) has quite a few advantages:

There is already a huge amount of tooling available for GraphQL, so it's easy to work with it (especially programmatically), visualize it and do other interesting things to it
The syntax is familiar and well established, so the learning curve is much shorter, especially considering how nicely it correlates with the end result

Another very interesting adoption of GraphQL IDL syntax is contraband (Contraband is a description language for your datatypes and APIs, currently targeting Java and Scala. It would be part of the next version of scala build tool). As you can see, they adopted the IDL syntax, but changed it in a few interesting ways (including introduction of namespaces, YAY :)).

I see these two examples as a good validation of an idea that GraphQL IDL can be a viable tool for data modeling.

smolinari commented 7 years ago

Though my original intention was more about exposing additional information, rather than a way to aid the data modeling.

I understand. My intention also isn't really about aiding data modelling, but rather automatic generation of the API from a set of metadata. If you have that kind of control over the metadata, and the metadata is also persisted in some manner, you can also control as much or little introspection of any of the "view" of any data you want. I realize this is getting quite esoteric, but try to think inside-out or rather, think that the API is something far, far away from a single source of truth. The API should be a window into the application's business layer in that it is only modelled after the domain models, which are (must be) defined elsewhere in the application. I am not saying this translation of metadata is easier, but overall, it is a lot easier than bending the API to all our business needs.

Right now, GraphQL is so cool and allows for so much, it is so flexible, people are starting to want to "model" everything in it, including the logic of what users can introspect. 😉 Whereas, these decisions of what to see or not, (no matter what is being controlled) is basically authorization logic and that is 100% business logic. Thus, it has or should have nothing to do with the internal workings of the API, except that there could be models burnt in metadata for the authorization too, which can also be generated as GraphQL schema, which can be made introspective ( or not, since we'll hopefully be able to generate schema/ the API automatically).

My simple and hard to fathom point is, the single source of truth cannot be the API/ the schema itself. It should only be fashioned after the application's single source of truth, and that is the business/ domain logic.

I know I have butted into similar discussions in other places about this. I might be getting on people's nerves because of it (who are also definitely loving GraphQL and its scene/ community). So, I think I've clarified my point as best I can here. I'll bow out now and let the conversation continue. Just let me warn everyone that making the API "too smart" is dumb and unnecessary. The hard work needs to go somewhere else in the depths of the server stack, which in the end, will make working with GraphQL overall, much easier. 😄

Scott

OlegIlyenko commented 7 years ago

@smolinari thanks a lot for a very detailed explanation! I think I can now better understand your point. I would definitely agree with it, there is much more to business logic of an application than what API can/should expose. I think it's also a good point to keep in mind as discussion around this issue progresses.

wincent commented 7 years ago

Interesting discussion. Thanks for starting it @OlegIlyenko. As you know, the role of directives as currently defined in the spec is pretty narrow; they are intended to:

[D]escribe alternate runtime execution and type validation behavior in a GraphQL document.

Exposing them via introspection (beyond __schema { directives { ... } }) would be a pretty large extension which we would want to evaluate carefully. My initial instinct is that exposing them like this would be overloading their purpose in a way that would increase the conceptual burden in an undesirable way, and I'd like to see some more exploration of specific use cases where having schema directives exposed via introspection would make things that are currently very difficult (or impossible) to do via other means significantly easier (or possible).

@OlegIlyenko: for example, you mentioned "field cost/complexity". Can you tell us more about that? We've certainly built tooling around that internally at FB, but it exists outside the schema (consuming the schema, developer queries/fragments, and runtime metrics as inputs).

IvanGoncharov commented 7 years ago

Expose IDL directive information via introspection API

@OlegIlyenko IMHO, IDL word in the title makes people think that the only way to expose this meta-information will be defining it inside IDL document. But nothing prevents you from specifying applied directives if you define the schema in the source code (with support from the server-side lib). So how about renaming it to:

Expose values of applied directives via introspection API

or something similar?

My initial instinct is that exposing them like this would be overloading their purpose in a way that would increase the conceptual burden in an undesirable way

@wincent I think it's a good solution to spec bloat. For example, according to the graphql-js implementation, you can deprecate field by using @deprecated directive, but in introspection, it is exposed through isDeprecated and deprecationReason fields. That means if I decide to have something like @deprecationDate I am forced to define new fields inside introspection, e.g. deprecationDate. The only way to safely achieve this will be pushing such directives and fields into the spec and this will lead to spec bloat.

To sum it up: GraphQL introspection should support mechanism for vendor extensions inside introspection and exposing applied directive values is a good solution for that.

I'd like to see some more exploration of specific use cases where having schema directives exposed via introspection would make things that are currently very difficult (or impossible) to do via other means significantly easier (or possible).

Here are a few examples from the top of my head:

@localizeName for enum values. I like that spec is limiting such names to ASCII but at the same time, there should be a possibility to specify localized name and use them on the client.
@relayMaxSliceSize which specifies the maximum number you can pass to first/last. It will allow implementing zero-config pagination
@examples for field arguments which can be used to generate better documentation (e.g. show them somewhere in graphiql when you type field arguments)

calebmer commented 7 years ago

@OlegIlyenko have you considered introducing only a single directive in the IDL that maps well to introspection that would allow users to provide metadata? Something like @metadata. Users could then define (or extend) a __FieldMetadata type, or __FieldMetadata could be a scalar which accepts any JSON object. This could be represented in the IDL as:

type __FieldMetadata { important: Boolean }
# Or...
scalar __FieldMetadata

# We may also have a `__TypeMetadata` perhaps.
directive @metadata(field: __FieldMetadata)

type User {
  id: ID!
  name: String
  admin: Boolean! @metadata(field: { important: true })
}

(I may be getting the directive syntax wrong, feel free to edit this comment if it is wrong)

Or in the introspection query this would be modeled as:

{
  __type(name: "User") {
    fields {
      metadata { important }
    }
  }
}

This balances the need for attaching metadata to a GraphQL schema with the desire to not introduce special behavior around all directives in the IDL.

OlegIlyenko commented 7 years ago

@wincent

would be a pretty large extension which we would want to evaluate carefully

I definitely agree with this! Seeing all these great comments made me think a lot about the concept and its soundness :) Now I discovered some new interesting perspectives on it.

you mentioned "field cost/complexity". Can you tell us more about that?

Assuming that complexity calculation is a simple and static algorithm (like the one I used), it can be replicated in client-side tooling given that the information about complexity of individual fields is available in some way (ideally through the introspection API).

This feature saved us already several times from unintentional expensive client queries. But when we start a dialog about why a query was rejected by the server and what query complexity/cost means, people always get confused since from a client perspective it's hard to predict (at least in more non-trivial cases) the overall complexity of the query in advance without communicating to the server (and then tweak it in order to ensure that complexity is below the threshold). I believe that by making this information more transparent we can avoid a lot of confusion around complexity estimation and help developers to write efficient queries. If this information is available through the introspection API, then the complexity calculation can be implemented as a query validation rule which then can be used by existing linting tools (no modification is necessary to the tool itself). If we take this idea even further, one can develop a GraphiQL plugin that shows complexity of individual fields and field + nested selection set on mouseover. I think these kinds of insights will be very helpful to client and server developers.
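To make the client-side idea concrete, here is a minimal sketch of a static complexity check. It assumes a deliberately simple rule (a field's cost plus the cost of everything selected under it), which is not necessarily the algorithm referred to above. The cost table, the type table, and the threshold are all made-up values that a client would, hypothetically, read from introspection.

```python
# Hypothetical per-field costs, as a client might read them from introspection
# if a @complexity-style directive were exposed. Names and numbers are made up.
FIELD_COST = {
    ("Query", "user"): 10.0,
    ("User", "id"): 1.0,
    ("User", "name"): 1.0,
    ("User", "friends"): 20.0,
}

# Return type of each object-valued field, needed to descend into selections.
FIELD_TYPE = {
    ("Query", "user"): "User",
    ("User", "friends"): "User",
}


def complexity(parent_type, selection):
    """Sum the cost of every selected field, recursing into nested selections.

    `selection` maps a field name to its nested selection (None for leaves).
    Fields without a listed cost default to 1.0 (an assumption of this sketch).
    """
    total = 0.0
    for field_name, nested in selection.items():
        total += FIELD_COST.get((parent_type, field_name), 1.0)
        if nested:
            total += complexity(FIELD_TYPE[(parent_type, field_name)], nested)
    return total


# Stands for: { user { id name friends { name } } }
query = {"user": {"id": None, "name": None, "friends": {"name": None}}}
total = complexity("Query", query)
MAX_COMPLEXITY = 50.0  # hypothetical server threshold
print(total, "rejected" if total > MAX_COMPLEXITY else "ok")
```

A lint rule or editor plugin could run the same function on every query in a codebase and flag the ones above the threshold before they ever reach the server.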

this would be overloading their purpose

I also share this concern. I think directives are convenient since after this change it would be very easy to fully represent introspection data in IDL syntax. I'm open to different syntax/encoding of this information. My main goal in this particular issue is to prove/disprove my hypothesis that it is useful/beneficial to expose user-defined meta-information via introspection API and benefits are worth added complexity. I just thought that it would be helpful to have some concrete syntax in examples.

@IvanGoncharov

It's an excellent point about deprecation! I haven't thought about it in this way, but now that you mentioned it, it makes perfect sense. Also if we want to, for instance, add a deprecation feature on other things, we can just update the directive itself without any changes to the introspection API. E.g.:

- directive @deprecated(reason: String) on FIELD_DEFINITION | ENUM_VALUE
+ directive @deprecated(reason: String) on FIELD_DEFINITION | ENUM_VALUE | OBJECT

I also like your other examples. I think they all are valid use-cases. Totally agree about the title, I think it caused quite a bit of confusion. I updated it to better reflect the original motivation.

@calebmer

I think it is an interesting idea and definitely worth considering. Though I personally would prefer not to mix disjointed concerns in a single type. With this approach we can end up with type like this one:

type __FieldMetadata {
  localizedName: LocalizedString
  complexity: Float
  example: String
}

I would rather prefer to see these as independent entities (like with the directives). This will also require introduction of 11 new types (__FieldMetadata, __EnumMetadata, __EnumValueMetadata, __ScalarMetadata, etc.).

calebmer commented 7 years ago

@OlegIlyenko why would you not want to mix disjointed concerns in a single type? There are many ways to design the type to make mixing less confusing. Also, how is the example you gave for __FieldMetadata fundamentally different from using directives?

Also, if you think 11 new types is a bad thing (I don’t necessarily think so) then the specification could make the metadata types optional. We could also combine all metadata into one type: __Metadata.

The point is I agree that the ability to expose arbitrary metadata in both the JSON and IDL introspection formats is incredibly useful, but overloading directives may not be the right solution 😉. Is there some other directive-like syntax that could accomplish the same thing?

IvanGoncharov commented 7 years ago

why would you not want to mix disjointed concerns in a single type?

@calebmer Because you can't easily reuse different types.

In your scenario, a user needs to explicitly define __Metadata type with all fields in it and maintain it in sync so it provides fewer incentives for reusing existing metadata conventions.

On the other hand, let's take two directives from my previous post: @localizeName and @relayMaxSliceSize. You just need to append directives definition to the schema either in form of IDL or GraphQLDirective objects. Moreover, we can write a tool that detects directive usage and append appropriate definition automatically.

My main requirement for "Expose user-defined meta-information via introspection API" is to allow for flexibility but at the same time encourage people to reuse conventions.

Also, one technical issue with __Metadata type: It makes it impossible to get introspection via a static query since you don't know its fields in advance. So you first have to make query __type(name: "__Metadata") and only then form dynamic queries with all fields.
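The two-step flow can be sketched like this. Everything here is hypothetical: the `__Metadata` type and the `metadata` field on `__Field` come from the proposal under discussion, not from the spec, and the sketch assumes every metadata field is a scalar (object-typed fields would need nested selections).

```python
def metadata_schema_query():
    """Step 1: ask which fields the (hypothetical) __Metadata type has."""
    return '{ __type(name: "__Metadata") { fields { name } } }'


def build_metadata_query(type_name, step_one_result):
    """Step 2: build a query selecting every metadata field found in step 1.

    Assumes all metadata fields are scalars, so no nested selection is needed.
    """
    names = [f["name"] for f in step_one_result["__type"]["fields"]]
    selection = " ".join(names)
    return (
        f'{{ __type(name: "{type_name}") '
        f"{{ fields {{ name metadata {{ {selection} }} }} }} }}"
    )


# Example response to step 1; the field names are illustrative only.
step_one = {
    "__type": {
        "fields": [
            {"name": "important"},
            {"name": "complexity"},
        ]
    }
}
second_query = build_metadata_query("User", step_one)
print(second_query)
```

The second query cannot be written ahead of time, which is the point being made: tools that ship a fixed introspection query would need an extra round trip.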

Here are additional arguments for using directives to expose metadata to the client:

Directive definitions are already exposed through introspection
directives can be tied to specific location
if you already use a directive to alternate runtime execution and type validation you can expose them to the client side. For example, if you have server-side validation you can use the same directives to power client side validation. So directives are the only way to configure server and client at the same time without duplication.

OlegIlyenko commented 7 years ago

@calebmer I definitely think __Metadata should be considered a valid alternative. Though I tend to agree with @IvanGoncharov's arguments. So it has its own set of advantages and disadvantages, like any other approach. I guess it will boil down to a question of which tradeoffs we are willing to take.

I also played with other ideas for a syntax. Maybe placement of a directive may decide whether it is exposed via introspection or not (usage side):

type User {
  id: ID!
  name: String

  @deprecated
  admin: Boolean! @important
}

Or allow directive to be exposed at a definition side with a modifier keyword like public or exposed:

exposed directive @deprecated(reason: String) on FIELD_DEFINITION | ENUM_VALUE

Another idea is to introduce a new concept, like annotations. Syntactically it would be similar to directives, but will provide better identification that these 2 things are meant for different purposes. Though I don't really like this idea that much, it adds too much complexity.

@wincent I was thinking about the directive spec definition for a while now:

[D]escribe alternate runtime execution and type validation behavior in a GraphQL document.

I would argue that the @deprecated directive already deviates from this definition. Although it influences how the schema is generated and can be used to validate a query against the schema, its main purpose is to expose additional structured information about a field or enum value definition.

I guess it is just a different perspective on looking at the same thing. I would rather define a directive as a way to provide additional structured information on different GraphQL entities. Server and client runtime then can take advantage of this information in different interesting ways (not only in terms of query execution or validation. These two are just valid use-cases). In fact, this is what spec defines as well:

Directives can be used to describe additional information for types, fields, fragments and operations.

So I feel that using directives in this context does not violate the spec definition.

felixfbecker commented 7 years ago

I think it would be awesome to have this because it would allow the community to experiment with solutions to unsolved problems in GraphQL before/without putting it into the spec. For example, we could try out a directive to annotate what errors a field can cause, and then codegen tools can use that information.

gjtorikian commented 6 years ago

So I feel that using directives in this context does not violate the spec definition.

~~The annoying bit of this is that directives, at this time, cannot be applied to arguments, which means they cannot be given metadata in this way.~~ Not true! The IDL spec says yes, but I think this information is missing from the GraphQL spec.

yordis commented 6 years ago

Related to this is #376, where I basically need some sort of tagging for the mutations like @group('order') createGroup : MutationResponse so GraphiQL tool could do some grouping in the Doc section

taion commented 6 years ago

I'd love to have this as well. While I understand that, per https://github.com/graphql/graphql-js/pull/746#issuecomment-301554231, schema directives just seem like a very natural way to attach this sort of metadata to fields.

Actually, though this was a misreading on my part, I actually found it somewhat surprising that things didn't already work this way.

It seems like the sort of thing that should "just work".

kaqqao commented 6 years ago

It is important to not just blindly expose the existing directives through introspection as that would suddenly make them unfit for storing anything sensitive/internal, like security roles, permissions etc, which seems to be a common use-case in the wild.

Of course, there's various suggestions listed here that would work just fine. My intention was only to clearly state a concern.

taion commented 6 years ago

Yup, makes total sense. I think it wasn't totally obvious to those of us coming from the side of using programmatically constructed schemas that people used directives for that purpose via the SDL. Some different syntax or a special carve-out is a must.

felixfbecker commented 6 years ago

What do you mean by “existing directives”? There is only @deprecated atm

kaqqao commented 6 years ago

One is allowed to invent and use any directive they want. And a common use-case is a directive for authorization, akin to @auth(role:’manager’). Here's an example, and another one. Simply exposing any directives present in the schema is hence rather dangerous.
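One way to address this concern is an explicit allowlist, so that nothing is exposed unless the schema author opts in. The following is only a sketch: the set `EXPOSED_DIRECTIVES`, the helper `visible_directives`, and the dict shape of an applied directive are all hypothetical.

```python
# Hypothetical allowlist: only these directives may appear in introspection.
# Anything else (e.g. @auth) stays server-side.
EXPOSED_DIRECTIVES = {"deprecated", "important"}


def visible_directives(applied, exposed=EXPOSED_DIRECTIVES):
    """Keep only the applied directives that were explicitly marked as exposed."""
    return [d for d in applied if d["name"] in exposed]


applied = [
    {"name": "auth", "args": {"role": "manager"}},
    {"name": "important", "args": {}},
]
visible = visible_directives(applied)
print(visible)
```

Defaulting to hidden means that adding a new internal directive can never leak by accident; exposure always requires a deliberate change to the allowlist.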

taion commented 6 years ago

Does anyone own this proposal, incidentally? The options I've seen floating around have looked like:

exposed directive @foo(...) on ...

decorator +foo(...) on ...

Or have a special @meta directive that is exposed...

And possibly with @deprecated getting merged into one of the above.

@xuorig I see that per these meeting notes https://github.com/graphql/graphql-wg/blob/27d27dbe7884c8c54798b3812b1076f3e7cde253/notes/2018-02-01.md#exposing-schema-metadata that you were looking at this. Did that lead to a concrete proposal?

edalquist commented 6 years ago

Just another voice supporting the ability to customize the introspective types of the schema. I've played around with it a bit and personally it seems like the most flexible approach.

rmosolgo commented 6 years ago

FWIW I added the option in GraphQL-Ruby to extend the introspection types: http://graphql-ruby.org/schema/introspection.html#customizing-introspection

But as far as I know, nobody has done it yet :P

edalquist commented 6 years ago

That is very similar to a local patch for the Java GraphQL library I have. It re-writes the Introspection class to contain a bunch of static methods that provide Builder instances. These are then made available during Schema building and there is an extension to Schema that provides references to the compiled type and schema meta fields as they exist for that specific Schema instance.

It is a fairly minimal change but then allows users of the graphql-java library to customize the schema introspection data as they see fit. Is this something I should look at sending over to graphql-java as a pull request?

jacklaaa89 commented 6 years ago

I think even a list of the names of the directives attached to the element would be helpful so it would be possible to query the schema for more detailed information about that directive. So the schema:

directive @important on FIELD_DEFINITION

type User {
  id: ID!
  name: String
  admin: Boolean! @important
}

type Query {
  user: User
}

We could update the definition of __Field to include a list of the directives attached to that field.

extend type __Field {
  directives: [__Directive!]
}

query {
  __schema {
    types {
      fields {
        name
        type {
          kind
        }
        directives {
          name
          args {
            name
          }
        }
      }
    }
  }
}

komkanit commented 5 years ago

@jacklaaa89 I cannot extend type __Field. Do you have a code example?

benjie commented 5 years ago

@komcal This is a proposal for the GraphQL specification itself. Types, fields, etc that start with a double underscore (__) are reserved for the GraphQL introspection system and can not / should not be modified in a user schema: https://facebook.github.io/graphql/draft/#sec-Reserved-Names

kaqqao commented 5 years ago

@edalquist Please do contact the graphql-java team about your idea.

inakianduaga commented 5 years ago

I just ran into this thread while working on a schema stitcher "gateway" API that stitches different APIs and has to handle authorization. So far these APIs were handling authorization / authentication on their own, but we want to move that layer to the stitcher in front of them. To do so without maintaining internal knowledge of the APIs in the stitcher, we need to be able to attach information to each field in the upstream APIs so we know what permission scope is required to access each field. Directives would be the perfect way to do this; alas, the information is lost when introspecting. I don't know if we'll find a workaround, but I basically agree that there should be a way to attach metadata to fields. Directives already do that, if only they could be exposed.

kaqqao commented 5 years ago

@inakianduaga How would you prevent any random client from reading your authorization rules?

inakianduaga commented 5 years ago

How would you prevent any random client from reading your authorization rules?

Client -> GraphQL stitcher     -> Upstream API 1
                               -> Upstream API 2
                               -> ...

Clients only talk to the stitcher. Each upstream API can tag whatever nodes it wants from its schema with a scope value. The stitcher itself has the information about what permissions the user is allowed (via JWT or whatever means) and applies these restrictions programmatically on the stitched schema without needing to know any details about the upstream APIs or synchronise any definitions.

This allows you to do better than simple authorisation, since you can completely hide the nodes a client is not allowed to see even when they perform an introspection. That way for the clients it's WYSIWYG, meaning everything they see on the schema is requestable, and they can't see what they are not allowed to request.

If directive information were exposable, as this ticket proposes, the implementation would be straightforward.
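For illustration only, the filtering step on the stitcher side might look roughly like the Python sketch below. The schema shape, the scope names, and the `visible_schema` function are all hypothetical; in a real setup the required scopes would come from introspecting the upstream APIs, which is exactly the part that is not possible today.

```python
from typing import Dict, Optional, Set

# Hypothetical shape: type name -> field name -> scope required by the upstream API.
# None means the field is public. Hard-coded here because directive metadata
# cannot currently be obtained via introspection.
Schema = Dict[str, Dict[str, Optional[str]]]


def visible_schema(schema: Schema, granted: Set[str]) -> Schema:
    """Return a copy of the schema without the fields the user may not access."""
    result: Schema = {}
    for type_name, fields in schema.items():
        allowed = {
            name: scope
            for name, scope in fields.items()
            if scope is None or scope in granted
        }
        if allowed:  # drop types left without fields; an object type needs at least one
            result[type_name] = allowed
    return result


upstream: Schema = {
    "User": {"id": None, "name": None, "salary": "hr:read"},
    "Payroll": {"total": "hr:read"},
}

# A user without the "hr:read" scope sees neither User.salary nor the Payroll type.
print(visible_schema(upstream, {"profile:read"}))
```

The same function could back both the introspection result and the request validation, so what a client sees and what it can request stay in sync.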

benjie commented 5 years ago

I think in @inakianduaga’s system access to the APIs behind the gateway would be blocked, so the information would not leak - when stitching these auth hints would be “consumed”.

An alternative approach could be to add an explicit “authMeta” root-level field that contains the auth information, and just ensure this is dropped while stitching. It’s definitely preferable to locate this information local to each GraphQL type though!

kaqqao commented 5 years ago

@inakianduaga I understand for your specific example, but I was thinking of the general case. Wouldn't it become necessary to always have a wrapper schema (to prevent security info leakage) if directives were introspectable?

benjie commented 5 years ago

I think the directives would have to add themselves to introspection. By default directives should not be exposed (so as to respect backwards compatibility). Really what we’re talking about here is extending the introspection types and adding additional fields containing custom metadata; directives are just a convenient pre-existing way to express these extensions via SDL.

Tehnix commented 5 years ago

@inakianduaga How would you prevent any random client from reading your authorization rules?

I definitely understand your concern, but it doesn't matter, for us, if the client knows which resources are protected and which are not.

For example, we tag our queries and mutations with a custom @iam(key: "Stops.Create") directive. The user can do absolutely nothing with this knowledge, but we are able to enforce it in various places. In fact, we want this information to be available to developers introspecting the schema, so they know which features they need to check that the user has.

If you consider this leakage of information (security levels), then HTTP 403 is equally leaking, since it also tells you that something exists, but requires additional permissions.

Now, if you are putting confidential information into your directives, then yes, that is a problem.

By default directives should not be exposed (so as to respect backwards compatibility). Really what we’re talking about here is extending the introspection types and adding additional fields containing custom metadata; directives are just a convenient pre-existing way to express these extensions via SDL.

As @benjie suggests, I also think it should be an explicit action to expose a directive, e.g. by specifying the exposed directives in the server config. This would alleviate the concerns about leakage, since exposure is now an explicit action.

As an example, we are currently patching graphql-js in our own fork to enable us to use directives for IAM in our stitched schema.
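To be clear, the following is not what our patch does; it is just a small Python sketch of the opt-in idea, assuming the server keeps an allow-list of directive names. `EXPOSED_DIRECTIVES`, `exposed_directives`, and the @internalCost directive are made up for the example.

```python
from typing import Any, Dict, FrozenSet, List

# Hypothetical server config: only these directives are opted in to introspection.
EXPOSED_DIRECTIVES: FrozenSet[str] = frozenset({"iam"})


def exposed_directives(
    applied: List[Dict[str, Any]],
    allow_list: FrozenSet[str] = EXPOSED_DIRECTIVES,
) -> List[Dict[str, Any]]:
    """Keep only the applied directives that were explicitly opted in."""
    return [d for d in applied if d["name"] in allow_list]


# Directives applied to one field; @internalCost is not in the allow-list and stays hidden.
applied = [
    {"name": "iam", "args": {"key": "Stops.Create"}},
    {"name": "internalCost", "args": {"value": 5}},
]
print(exposed_directives(applied))
```

With an empty allow-list nothing is exposed, which matches the backwards-compatible default suggested above.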

nodkz commented 5 years ago

@Tehnix can you remove build files from your path? 17k lines in your diff 🤪

kaqqao commented 5 years ago

Certain implementations, e.g. graphql-java, enable you to dynamically decide what fields and even arguments are visible, both to introspection and in general. If this capability were to be standardized and extended to directives, all security-related directives could themselves have access rules applied to their own visibility.

This way schemas that are never exposed directly (like the upstream schemas in your example) can have no access rules, while client-accessible schema can utilize them to prevent security related directives (or any other schema element for that matter) from being visible to introspection, and generally.

These two (introspectable directives, and dynamically controlled visibility of all elements) put together sound like the combo that can cover any use case.

Tehnix commented 5 years ago

@Tehnix can you remove build files from your path? 17k lines in your diff 🤪

@nodkz We're patching it in yarn post-install where we pull in the library, so we need to patch the whole dist as well :) That is, we have the regular graphql-js at a matching version in our dependencies, and then we apply our patch afterwards, meaning we work with the build files.

If you only want the source patch, you can generate it with e.g. (notice that it now looks at src/*, whereas our own patch looks at dist/*),

$ git diff b14.0.0-rc.2 master src/* > add-iam-directive-14.0.0-rc.2.patch

benjie commented 5 years ago

(ASIDE: You can mark the files as generated in .gitattributes and then GitHub will hide their contents by default.

dist/** linguist-generated=true

https://help.github.com/articles/customizing-how-changed-files-appear-on-github/ )

victorandree commented 5 years ago

As far as the GraphQL specification is concerned, wouldn't it be sufficient to allow extending the Schema Introspection types, either with arbitrary fields or under a "safe" extensions field? Allowing an extensions field would be in line with how GraphQL allows custom fields on errors and the response map (this was noted by @IvanGoncharov in https://github.com/graphql/graphql-spec/issues/543#issuecomment-462193626). The Schema Introspection section could be amended to simply reserve a field named extensions:

Any type of the GraphQL schema introspection system can provide a field with name extensions. This field is reserved for implementors to extend the introspection schema however they see fit.

As @IvanGoncharov noted in the previously mentioned https://github.com/graphql/graphql-spec/issues/543#issuecomment-462193626, "two-stage introspection" could be used to introspect what can be introspected on a target schema.

A concrete example would be to support the directives used in Apollo's GraphQL Federation, by adding the relevant metadata to __Type (for @key and @extends) and __Field (for @external, @requires and @provides). This issue is under discussion at https://github.com/apollographql/apollo-server/issues/2769

scalar _FieldSet

type _TypeExtensions {
  keyFields: _FieldSet
  isExtension: Boolean!
}

extend type __Type {
  extensions: _TypeExtensions
}

type _FieldExtensions {
  isExternal: Boolean!
  requiresFields: _FieldSet
  providesFields: _FieldSet
}

extend type __Field {
  extensions: _FieldExtensions
}

If you require this metadata, you can do an initial introspection to see if the target supports it:

{
  Type: __type(name: "__Type") {
    fields {
      name
      type {
        kind
        name
      }
    }
  }

  Field: __type(name: "__Field") {
    fields {
      name
      type {
        kind
        name
      }
    }
  }
}

Once you know that your introspection schema supports the relevant fields, you run an extended introspection query.

{
  __schema {
    types {
      name
      extensions {
        keyFields
        isExtension
      }

      fields {
        extensions {
          isExternal
          requiresFields
          providesFields
        }
      }
    }
  }
}

For those who want to expose "all directives" or arbitrary metadata, simply extend the introspection schema to support it.
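As a rough client-side illustration of the two stages, the Python sketch below takes the "data" part of the first introspection result and only asks for extensions where the server reported the field. The `supports` and `extended_query` helpers and the sample response are hypothetical; the selected field names follow the _TypeExtensions and _FieldExtensions types defined above.

```python
from typing import Any, Dict


def supports(stage_one: Dict[str, Any], alias: str, field: str = "extensions") -> bool:
    """True if the aliased introspection type in the stage-one result lists `field`."""
    type_info = stage_one.get(alias) or {}
    return any(f["name"] == field for f in type_info.get("fields") or [])


def extended_query(stage_one: Dict[str, Any]) -> str:
    """Build the second-stage query, requesting only the extensions the server has."""
    type_ext = ""
    if supports(stage_one, "Type"):
        type_ext = " extensions { keyFields isExtension }"
    field_ext = ""
    if supports(stage_one, "Field"):
        field_ext = " extensions { isExternal requiresFields providesFields }"
    return "{ __schema { types { name" + type_ext + " fields { name" + field_ext + " } } } }"


# Stage-one "data" from a hypothetical server that only extends __Type.
stage_one = {
    "Type": {"fields": [{"name": "name"}, {"name": "extensions"}]},
    "Field": {"fields": [{"name": "name"}]},
}
print(extended_query(stage_one))
```

A server that supports neither extension simply gets a plain introspection query, so the client degrades gracefully.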

benjie commented 5 years ago

I really like this idea - it's simple and powerful. I think the _TypeExtensions and _FieldExtensions would be arbitrarily-named user-defined types (hence no __ prefix, which agrees with what @victorandree has written) which would be supplied to the GraphQL schema itself (via the schema keyword in SDL, e.g. schema { query: Query, typeExtensions: _TypeExtensions, fieldExtensions: _FieldExtensions }, or via the GraphQLSchema constructor in GraphQL.js).

victorandree commented 5 years ago

Using user-defined types for introspection schema extensions has two downsides, compared to allowing extend type __Type directly, however:

You'd actually have to introspect __Type to figure out what the user-defined extensions type is called, then introspect it to figure out what the extensions are, then do the extended introspection on the types themselves.
If you have multiple extensions from different sources – say one for authentication and one for federation – it's not obvious how you'd merge the different types under one field. However, both sources would know to extend type __Type since it's well-known.

This could be managed by every introspection schema type having an extensions field with a well-known name (e.g. __Type always has extensions: __TypeExtensions); or providing some other guarantee that additional fields on __Type et cetera wouldn't clash with a future GraphQL introspection schema (user-defined field must have some prefix, for example).

Added 2019-06-24: Object types must define one or more fields to be valid (see point 1 under Type validation: "An Object type must define one or more fields."), so the spec providing type __TypeExtensions without any fields would not be allowed under the current spec (but see https://github.com/graphql/graphql-spec/issues/568).

VladimirAlexiev commented 5 years ago

Here's another simple use case.

For each field and object in a GraphQL schema, we want to split off label and descr so we can use the label as field label in some UI, and show the descr in a tooltip.

We've modeled it like this:

directive @descr(_:String!) on FIELD_DEFINITION | OBJECT
"ID" x_id: String @descr(_:"Identifier in source dataset. Single-value, optional")

However, we can't get @descr using introspection because __Field doesn't include directives. For this example it returns only this:

"name": "x_id",
"description": "ID"

Despite the discussion above that directives may be internal details of a server, I find this strange because __Directive definitions are included in introspection.

benjamin-rood commented 4 years ago

However, we can't get @descr using introspection because __Field doesn't include directives. For this example it returns only this:

@VladimirAlexiev

I've had to do something similar recently. Using graphql-tools' approach of extending SchemaDirectiveVisitor and implementing the appropriate visit method (depending on what type of *_DEFINITION the directive is applied to), I injected the name of the field the directive was on and the values of any arguments into the GraphQLResolveInfo object in the appropriate shape.

VladimirAlexiev commented 4 years ago

@benjamin-rood We did a similar extension in the Ontotext Platform: http://platform.ontotext.com/tutorials/graphql-introspection.html For example, for "what fields are available for an object", the standard introspection query

{
  __type(name: "Human") {
    name
    fields {
      name
      type {
        name
        kind
      }
    }
  }
}

returns, for example, the following, even though the "directives" payload was not requested:

      "fields": [
        {
          "name": "id",
          "type": {
            "name": null,
            "kind": "NON_NULL"
          },
          "directives": {
            "@descr": {
              "_": "Single, mandatory. Each RDF node has exactly one IRI."
            }
          }
        },

graphql / graphql-spec

Expose user-defined meta-information via introspection API in form of directives #300