[RFC] Add a complete set of GraphQL Document examples for stress testing parsers

solomon-b commented 2 years ago

TL;DR

It would be really nice to have a set of GraphQL Documents which collectively express the entire GraphQL grammar. This would include atypical constructions such as random insertions of comma characters as whitespace.

Problem Statement

The GraphQL query language is well specified with a formal grammar but is large enough that there can be subtle edge cases when constructing parsers, especially when the parser is hand written. The language also continues to be extended and library developers must stay on top of changes to the spec.

Proposed Solution

In addition to the language spec, we should providee an official collection of GraphQL Documents which covers the entire grammar constructed in varying levels of term complexity.

This set of `Golden Documents` should also contain examples which attempt to trip up the parser implementation. For example, inconsistent use of commas or extra commas, complexities around block strings, and ambiguous parses such as:

type Query

{ foo: bar }

For every `Golden Document` representing a valid term we would provide a graphql-js AST. For those representing invalid terms we would provide the errors in GraphQL format.

Between the GraphQL Spec examples, the graphql-js test suite, and the nim-graphql test suite we have a large supply of test values. In addition I would also like to build a set of terms based on the chinook-database generated from the Hasura GraphQL Engine.

Such a collection of Documents would be extremely useful for golden testing GraphQL parsers and would help establish a higher standard for parsers across implementation languages.

benjie commented 2 years ago

Are you willing to do the work/be the champion for this? If so you're welcome to add it to an upcoming agenda e.g. https://github.com/graphql/graphql-wg/blob/main/agendas/2022/2022-07-07.md but to be honest I think it's pretty obvious this is valuable (there have been attempts in the past, such as graphql-cats) so I would just create a repository for it and get to work - once you have sufficiently many test cases to be valuable you can open a discussion in the WG regarding moving the repo under the GraphQL umbrella.

Regarding code location, I don't think it needs to be in the spec repo itself and there's a high bar for getting PRs into this repo. A separate "regression tests" repo could be updated and expanded with much less oversight allowing for faster iteration and more thorough coverage :+1:

I'd be happy to help with any guidance you may need - feel free to reach out to me via the GraphQL Discord https://discord.graphql.org

solomon-b commented 2 years ago

@benjie Yes I would be happy to get the ball rolling on this. I think the bulk of the work is fairly mechanical and the difficulty will really be in making sure we are able to identify all the atypical cases. At that point getting help from you or anyone else more involved in the language spec would be tremendously helpful.

I hadn't seen graphql-cats but glancing over the repo it clearly had a much larger scope then this project which may have slowed down velocity. The nice thing about writing these parser stress tests is that they have a clear scope and can be pinned to the spec versions.

benjie commented 2 years ago

Yep; 100% agree. @IvanGoncharov is pretty good at knowing the parser edge cases, and you should also check out the parser tests in graphql-js:

https://github.com/graphql/graphql-js/blob/main/src/language/__tests__/parser-test.ts https://github.com/graphql/graphql-js/blob/main/src/language/__tests__/schema-parser-test.ts

IvanGoncharov commented 2 years ago

We also have kitchen sink documents: https://github.com/graphql/graphql-js/blob/main/src/__testUtils__/kitchenSinkQuery.ts https://github.com/graphql/graphql-js/blob/main/src/__testUtils__/kitchenSinkSDL.ts

Also, block strings are the most non-trivial part of our parser so we have a separate test for them: https://github.com/graphql/graphql-js/blob/main/src/language/__tests__/blockString-test.ts

Previously we discussed the possibility of extracting those tests into YAML files (since it support human-readable multiline strings) and even distributing them as zip arcive through GitHub releases.

One problem with this approach is that we need to ensure that documents are correctly parsed. For example, the following document is correct:

type Query

# some text
{
  foo: bar
}

But it can be parsed either as "empty type" + query or as a type with one field (correct one).

Even negative tests will have the same problem, for example

""""""" # 7 double quotes
type Query

Parser can correct fail here:

"""""""
      ^

because block string is followed by quoting mark (correct one).

Or it can fail here:

"""""""
  ^

because parse doesn't support block strings at all and thinks it's string followed by another string.

In previous discussions, I advocated for just storing graphql-js AST for positive tests and errors in GraphQL format for negative tests. So basically, I propose converting many existing graphql-js tests into YAML files.

@solomon-b Do you think this proposal fits your usecase?

Regarding code location, I don't think it needs to be in the spec repo itself and there's a high bar for getting PRs into this repo. A separate "regression tests" repo could be updated and expanded with much less oversight allowing for faster iteration and more thorough coverage 👍

Currently, we have 2 repos involved in the RFC process (graphql-spec and graphql-js) adding a third one will increase the coordination burden. Also, the reference parser should pass all the "regression tests" otherwise we create confusion so PRs to the "regression tests" repo would be blocked on "graphql-js" PRs.

Given that I suggest just having it in a top-level folder of graphql-js.

solomon-b commented 2 years ago

@IvanGoncharov I think you hit on the core issue with this approach. We can provide test cases but we cannot provide valid graphql ASTs for someone else's project.

Providing a Graphql-js AST and GraphQL format would be better then nothing but would still require parser authors to write a serialization function to map their AST to JSON so that they can compare against our provided terms. There will also be more difficulty with asserting on our provided errors because the error messaging is unspecified.

Problems aside, I think adding these parse results would be a good idea.

Given that I suggest just having it in a top-level folder of graphql-js.

I don't have any opinion on where these files be stored so long as they are easy to find for anyone looking up the graphql spec.

solomon-b commented 2 years ago

type Query

# some text
{
  foo: bar
}

Is it not the case that FieldsDefinition must be a non-empty list making this have only one valid parse as a type with one field?

benjie commented 2 years ago

I don't think it's necessarily critical to use the ASTs for other projects - if they want to match the GraphQL.js AST then that's all well and good, but otherwise having the result of print(document) match should be sufficient; e.g. indicate the input and some kind of canonical output:

input: >
  """Description"""
  query A # comment
  ($foo:
      Int!,,,,,) { __typename,,,}
canonical: >
  """Description"""
  query A ($foo: Int!) {
    __typename
  }
ast: # ...

Regarding your question; the following are both parseable GraphQL documents:

type Query

and

# some text
{
  foo: bar
}

The latter in this context is an ExecutableDocument containing an anonymous operation, the former is an ObjectTypeDefinition.

This is a parseable GraphQL document:

type Query

# some text
query {
  foo: bar
}

It's an ObjectTypeDefinition followed by an OperationDefinition.

This however, omitting the optional query keyword, is parsed as a completely different GraphQL document:

type Query

# some text
{
  foo: bar
}

This is due to the negative-lookahead on ObjectTypeDefinition failing, and so what was previously an OperationDefinition is now instead a FieldsDefinition. So yes, the correct result is that it has only one valid parse, but the reason behind that is slightly different.

solomon-b commented 2 years ago

This is due to the negative-lookahead on ObjectTypeDefinition failing, and so what was previously an OperationDefinition is now instead a FieldsDefinition. So yes, the correct result is that it has only one valid parse, but the reason behind that is slightly different.

Thanks, these are precisely the types of subtleties I hope we can clarify through this project. i'm pretty busy with work stuff at this very moment, but I would like to put together a slightly more detailed proposal and then add it to an upcoming working group agenda. Even if I can't dive into the project right away, i would like to document the plan of attack so that we can come back to it quickly in the near future.

jangko commented 2 years ago

we have extensive collection of test cases in nim-graphql repo. please take a look at this tests folder. subfolder validation targeting the parser and schema validator, while subfolder execution targeting execution engine and introspection system.

validation\security.toml is a collection of test cases found during fuzz testing.

in total we have 500+ individual test cases to prevent regression. what this collection lacks are scrutiny and audit.

solomon-b commented 2 years ago

Thank you @jangko. It would be really helpful to use those test cases in this project.

I've written an updated RFC which describes what we have all discussed. I am going to add this proposal to the agenda for August.

@benjie should I edit the original message here to my updated RFC or just throw it in a gist?

benjie commented 2 years ago

Updating this one is fine 👍

graphql / graphql-spec