graphql / graphql-js

A reference implementation of GraphQL for JavaScript
http://graphql.org/graphql-js/

Create more flexible context-specific validation model #719

Open wincent opened 7 years ago

wincent commented 7 years ago

Extracting @leebyron's comment from https://github.com/graphql/graphql-js/pull/717:

I think beyond this we should follow up with a slightly more invasive change that separates the construction of a schema from the validation of a schema.

When building a "source of truth" schema (e.g. you run a node.js server, and use GraphQL.js to present a GraphQL service) then we should encourage validation - strict validation - to ensure the schema is well formed. I'm not sure what the best way to accomplish this would be and am open to ideas. Perhaps execute() or other operations dependent on a valid schema would check some internally-set flag for "has been validated", and if it's not set, run validation? Just an idea.
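As an editorial aside, here is a minimal sketch of what that lazy "validate on first execution" idea could look like. The `__validationErrors` cache, `getValidationErrors`, and the example rule are illustrative names only, not the graphql-js API of the time (graphql-js later added `validateSchema()`/`assertValidSchema()` along broadly similar lines):

```js
// Sketch only: cache validation results on the schema so execution pays the
// cost at most once. All names below are hypothetical.
function getValidationErrors(schema, rules) {
  if (schema.__validationErrors === undefined) {
    // Each rule inspects the whole schema and returns an array of error messages.
    schema.__validationErrors = rules.flatMap(rule => rule(schema));
  }
  return schema.__validationErrors;
}

// Example rule: every type name must match the spec's name grammar.
const namesAreWellFormed = schema =>
  Object.keys(schema.getTypeMap())
    .filter(name => !/^[_A-Za-z][_0-9A-Za-z]*$/.test(name))
    .map(name => `Type name "${name}" is not well formed.`);

function executeStrict(schema, runQuery) {
  const errors = getValidationErrors(schema, [namesAreWellFormed]);
  if (errors.length > 0) {
    // A "source of truth" schema must be valid before serving requests.
    throw new Error(errors.join('\n'));
  }
  return runQuery();
}
```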

When building a "reference" schema (e.g. you used introspection to learn about a schema, and have rebuilt it to power a tool like GraphiQL) then we have very different validation needs. Assumptions made by execution like an object adhering to its implemented interfaces, or (in this case) names being well formed may not break assumptions of the tools being used - we're both paying a computation cost to perform this validation and we're forcing tools to be more strict than they might need to be. Should GraphiQL stop working or report errors if it's used against a server that has an invalid schema? If it should, how do we define that line?

This diff is a great incremental step to easing the pain of this particular change, but it would be awesome to see a broader rethink of how schema validation fits into the whole story.

Adding the "discussion" label.

asiandrummer commented 7 years ago

Just throwing in my $0.02: if I understand the schema validation step correctly (that is, assuming there isn't a validation rule that checks something other than the integrity of the schema), then an invalid schema should never exist. #717 and related issues touch on a specific situation where enforcing a new restriction on schema generation invalidates what was previously a valid schema; I feel this should be treated as an edge case, and as something we'll need to consider again whenever we add more restrictions to the specification or implementations later on.

GraphiQL is an interesting use case because it probably needs both variations of the schema concept: a "source of truth" schema to execute a query and a "reference" schema to power IDE-like features without the aforementioned computation cost. That said, I feel we'd be entering dangerous territory if GraphiQL were allowed to interact with an invalid schema.

Another thought I had was the possible discrepancy between the two variations of the schema. Is it possible to have a "reference" schema that differs from the "source of truth" schema (e.g. one that has invalid new or modified fields/types)?

OlegIlyenko commented 7 years ago

I think it would be a very good addition to the library 👍

I'd just like to share my experience with validation rules in sangria. When I first started working on schema validation, I introduced a rule-based mechanism similar to the query validations. After some time I removed it in favor of in-place validations, mostly because it felt like overkill for the very small set of rules I had at the beginning.

Over time the number of validations grew and the validations themselves became more complex, so I decided to re-introduce the rule-based mechanism I had at the beginning (though I haven't refactored all of the validations yet):

https://github.com/sangria-graphql/sangria/blob/master/src/main/scala/sangria/schema/SchemaValidationRule.scala#L19-L24

This makes it really easy to organize and maintain these rules, so I can definitely recommend this approach. It also has another big advantage: users of the library can pick and choose which validations they would like to execute against a schema. For example, with the recent restriction on double-underscore (__) names, I implemented IntrospectionNamesValidationRule. If users update the library but still rely on custom __* names, they can simply exclude this rule when defining the schema.
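A rough JavaScript sketch of that pick-and-choose idea, modeled on sangria's approach rather than any existing graphql-js API (the rule names and the `validateWithRules` helper are illustrative):

```js
import { buildSchema } from 'graphql';

// Names of the built-in introspection types, which legitimately start with "__".
const BUILT_IN = new Set([
  '__Schema', '__Type', '__Field', '__InputValue',
  '__EnumValue', '__Directive', '__TypeKind', '__DirectiveLocation',
]);

// Illustrative rule, analogous to sangria's IntrospectionNamesValidationRule:
// reject user-defined types whose names begin with "__".
const introspectionNamesRule = schema =>
  Object.keys(schema.getTypeMap())
    .filter(name => name.startsWith('__') && !BUILT_IN.has(name))
    .map(name => `Type name "${name}" must not begin with "__".`);

// Users compose exactly the set of rules they want to run against a schema.
function validateWithRules(schema, rules) {
  return rules.flatMap(rule => rule(schema));
}

const schema = buildSchema('type Query { hello: String }');
console.log(validateWithRules(schema, [introspectionNamesRule])); // []
// A user who still relies on custom __* names would simply omit that rule.
```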

I personally find the solution introduced in graphql-js to deal with this issue (a console warning) quite inflexible. Users who are just starting with the library probably don't want to use __* names at all and would rather get an error/exception instead of a warning, which is easy to miss. Users who already rely on __* names for their types and fields probably already know about the restriction and will either migrate eventually or can't do anything about it.

I think this becomes even more important when a schema is generated dynamically from some external definition (possibly coming from a DB). There are quite a few services that rely on this (like graphcool or scaphold), and I myself work on a project where parts of the schema are generated from user-defined data structures that come from a DB. For these services, the schema definition and names are validated beforehand elsewhere, so one can disable schema validation entirely to save CPU cycles (this can be very useful because in some scenarios the schema is loaded and re-created for every single request). Also, when we look at the migration process for these services, rule-based schema validation becomes quite valuable.
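As an aside, more recent graphql-js releases expose an `assumeValid` option that covers this per-request case; a hedged sketch, assuming a graphql-js version that supports it (the `handleRequest` wrapper is hypothetical):

```js
import { buildSchema, graphql } from 'graphql';

// Hypothetical request handler: the user-defined SDL was already validated
// elsewhere (e.g. when the user saved it), so skip re-validating it on every request.
async function handleRequest(userDefinedSDL, query, rootValue) {
  const schema = buildSchema(userDefinedSDL, { assumeValid: true });
  return graphql({ schema, source: query, rootValue });
}
```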

I also agree that tools like GraphiQL should not enforce schema validation rules unless it's necessary for their functionality. For example, I can imagine scenarios where people relax the name character-set restrictions at their own risk. They would still be able to use GraphiQL with the name validation rule disabled, but the apollo-ios code generator would probably include it since it requires proper names for Swift codegen. That said, if GraphiQL does need a specific name pattern for, say, autocompletion, then it's probably better to keep that validation rule.