Open solomon-b opened 2 years ago
Are you willing to do the work/be the champion for this? If so you're welcome to add it to an upcoming agenda e.g. https://github.com/graphql/graphql-wg/blob/main/agendas/2022/2022-07-07.md but to be honest I think it's pretty obvious this is valuable (there have been attempts in the past, such as graphql-cats
) so I would just create a repository for it and get to work - once you have sufficiently many test cases to be valuable you can open a discussion in the WG regarding moving the repo under the GraphQL umbrella.
Regarding code location, I don't think it needs to be in the spec repo itself and there's a high bar for getting PRs into this repo. A separate "regression tests" repo could be updated and expanded with much less oversight allowing for faster iteration and more thorough coverage :+1:
I'd be happy to help with any guidance you may need - feel free to reach out to me via the GraphQL Discord https://discord.graphql.org
@benjie Yes I would be happy to get the ball rolling on this. I think the bulk of the work is fairly mechanical and the difficulty will really be in making sure we are able to identify all the atypical cases. At that point getting help from you or anyone else more involved in the language spec would be tremendously helpful.
I hadn't seen graphql-cats
but glancing over the repo it clearly had a much larger scope then this project which may have slowed down velocity. The nice thing about writing these parser stress tests is that they have a clear scope and can be pinned to the spec versions.
Yep; 100% agree. @IvanGoncharov is pretty good at knowing the parser edge cases, and you should also check out the parser tests in graphql-js:
https://github.com/graphql/graphql-js/blob/main/src/language/__tests__/parser-test.ts https://github.com/graphql/graphql-js/blob/main/src/language/__tests__/schema-parser-test.ts
We also have kitchen sink documents: https://github.com/graphql/graphql-js/blob/main/src/__testUtils__/kitchenSinkQuery.ts https://github.com/graphql/graphql-js/blob/main/src/__testUtils__/kitchenSinkSDL.ts
Also, block strings are the most non-trivial part of our parser so we have a separate test for them: https://github.com/graphql/graphql-js/blob/main/src/language/__tests__/blockString-test.ts
Previously we discussed the possibility of extracting those tests into YAML files (since it support human-readable multiline strings) and even distributing them as zip arcive through GitHub releases.
One problem with this approach is that we need to ensure that documents are correctly parsed. For example, the following document is correct:
type Query
# some text
{
foo: bar
}
But it can be parsed either as "empty type" + query or as a type with one field (correct one).
Even negative tests will have the same problem, for example
""""""" # 7 double quotes
type Query
Parser can correct fail here:
"""""""
^
because block string is followed by quoting mark (correct one).
Or it can fail here:
"""""""
^
because parse doesn't support block strings at all and thinks it's string followed by another string.
In previous discussions, I advocated for just storing graphql-js AST for positive tests and errors in GraphQL format for negative tests.
So basically, I propose converting many existing graphql-js
tests into YAML files.
@solomon-b Do you think this proposal fits your usecase?
Regarding code location, I don't think it needs to be in the spec repo itself and there's a high bar for getting PRs into this repo. A separate "regression tests" repo could be updated and expanded with much less oversight allowing for faster iteration and more thorough coverage đ
Currently, we have 2 repos involved in the RFC process (graphql-spec and graphql-js) adding a third one will increase the coordination burden. Also, the reference parser should pass all the "regression tests" otherwise we create confusion so PRs to the "regression tests" repo would be blocked on "graphql-js" PRs.
Given that I suggest just having it in a top-level folder of graphql-js
.
@IvanGoncharov I think you hit on the core issue with this approach. We can provide test cases but we cannot provide valid graphql ASTs for someone else's project.
Providing a Graphql-js AST and GraphQL format would be better then nothing but would still require parser authors to write a serialization function to map their AST to JSON so that they can compare against our provided terms. There will also be more difficulty with asserting on our provided errors because the error messaging is unspecified.
Problems aside, I think adding these parse results would be a good idea.
Given that I suggest just having it in a top-level folder of graphql-js.
I don't have any opinion on where these files be stored so long as they are easy to find for anyone looking up the graphql spec.
type Query
# some text
{
foo: bar
}
Is it not the case that FieldsDefinition
must be a non-empty list making this have only one valid parse as a type with one field?
I don't think it's necessarily critical to use the ASTs for other projects - if they want to match the GraphQL.js AST then that's all well and good, but otherwise having the result of print(document)
match should be sufficient; e.g. indicate the input
and some kind of canonical
output:
input: >
"""Description"""
query A # comment
($foo:
Int!,,,,,) { __typename,,,}
canonical: >
"""Description"""
query A ($foo: Int!) {
__typename
}
ast: # ...
Regarding your question; the following are both parseable GraphQL documents:
type Query
and
# some text
{
foo: bar
}
The latter in this context is an ExecutableDocument
containing an anonymous operation, the former is an ObjectTypeDefinition
.
This is a parseable GraphQL document:
type Query
# some text
query {
foo: bar
}
It's an ObjectTypeDefinition
followed by an OperationDefinition
.
This however, omitting the optional query
keyword, is parsed as a completely different GraphQL document:
type Query
# some text
{
foo: bar
}
This is due to the negative-lookahead on ObjectTypeDefinition failing, and so what was previously an OperationDefinition is now instead a FieldsDefinition. So yes, the correct result is that it has only one valid parse, but the reason behind that is slightly different.
This is due to the negative-lookahead on ObjectTypeDefinition failing, and so what was previously an OperationDefinition is now instead a FieldsDefinition. So yes, the correct result is that it has only one valid parse, but the reason behind that is slightly different.
Thanks, these are precisely the types of subtleties I hope we can clarify through this project. i'm pretty busy with work stuff at this very moment, but I would like to put together a slightly more detailed proposal and then add it to an upcoming working group agenda. Even if I can't dive into the project right away, i would like to document the plan of attack so that we can come back to it quickly in the near future.
we have extensive collection of test cases in nim-graphql repo.
please take a look at this tests folder.
subfolder validation
targeting the parser and schema validator, while subfolder execution
targeting execution engine and introspection system.
validation\security.toml
is a collection of test cases found during fuzz testing.
in total we have 500+ individual test cases to prevent regression. what this collection lacks are scrutiny and audit.
Thank you @jangko. It would be really helpful to use those test cases in this project.
I've written an updated RFC which describes what we have all discussed. I am going to add this proposal to the agenda for August.
@benjie should I edit the original message here to my updated RFC or just throw it in a gist?
Updating this one is fine đ
TL;DR
It would be really nice to have a set of GraphQL Documents which collectively express the entire GraphQL grammar. This would include atypical constructions such as random insertions of comma characters as whitespace.
Problem Statement
The GraphQL query language is well specified with a formal grammar but is large enough that there can be subtle edge cases when constructing parsers, especially when the parser is hand written. The language also continues to be extended and library developers must stay on top of changes to the spec.
Proposed Solution
In addition to the language spec, we should providee an official collection of GraphQL Documents which covers the entire grammar constructed in varying levels of term complexity.
This set of `Golden Documents` should also contain examples which attempt to trip up the parser implementation. For example, inconsistent use of commas or extra commas, complexities around block strings, and ambiguous parses such as:
For every `Golden Document` representing a valid term we would provide a graphql-js AST. For those representing invalid terms we would provide the errors in GraphQL format.
Between the GraphQL Spec examples, the graphql-js test suite, and the nim-graphql test suite we have a large supply of test values. In addition I would also like to build a set of terms based on the chinook-database generated from the Hasura GraphQL Engine.
Such a collection of Documents would be extremely useful for golden testing GraphQL parsers and would help establish a higher standard for parsers across implementation languages.