Slow introspection query execution for large schemas

lonerz commented 4 years ago

The graphql-core graphql_sync function is approximately 20x slower than a module I wrote that solely executes the introspection query: https://github.com/kensho-technologies/graphql-compiler/blob/main/graphql_compiler/fast_introspection.py

With the graphql-core Python library, executing the default introspection query (as returned by get_introspection_query) with graphql_sync on a schema of around ten thousand types takes 26 seconds. My own module, which calls the same graphql-core type resolvers, executes the same query on the same schema in 1.37 seconds.

I'm not sure exactly where graphql-core spends the time, but it might have to do with the generality of graphql_sync, which has to compute the next field to resolve at every step, and with its recursive nature. Interestingly, I ran the same query on the same schema using GraphQL.js, and the execution took less than 2 seconds (around 1.7 seconds).

Thank you for your time and all the work to port GraphQL.js to Python!

Cito commented 4 years ago

@lonerz Thanks for the feedback. Did you try this with the latest version of GraphQL-core? Does your repo contain a benchmark that we can use as a starting point when working on this issue?

lonerz commented 4 years ago

@Cito thanks for the quick reply! Yes, we are using the latest version of GraphQL-core. We can't share our schema explicitly, but I wrote a script that creates a schema with 5000 types that have 100 int fields each: https://gist.github.com/lonerz/034acc29080d057b7a990a22732aafb4

My module takes 2.7 seconds to introspect this schema whereas GraphQL-core takes 40 seconds.

Cito commented 4 years ago

@lonerz The following function can be used to build your test schema:

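from graphql import (
    GraphQLField, GraphQLInt, GraphQLList, GraphQLObjectType, GraphQLSchema)

# Build a Query type with n_types list fields, each returning a distinct
# object type that has n_fields Int fields.
def make_schema(n_types=5000, n_fields=100):
    return GraphQLSchema(GraphQLObjectType('Query', {
        f'type{i}': GraphQLList(GraphQLObjectType(f'Type{i}', {
            f'field{j}': GraphQLField(GraphQLInt) for j in range(n_fields)}))
        for i in range(n_types)}))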

Introspecting this schema takes about 25s on my computer, while introspecting the same schema with GraphQL.js takes only about 3.3s. This is very similar to what you measured. Interestingly, when I run the same code with PyPy 3.6, it only takes about 4s. That is a larger speedup than you would typically expect from PyPy.

I also did some profiling today, but unfortunately could not find an obvious bottleneck. The cycles seem to be spent in the nested, recursive calls in the execute module.
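The profiling setup used here is not shown in the thread. One way to do this kind of profiling is with cProfile from the standard library; the sketch below is an assumption about how it could be done, not the method actually used. The helper profile_call and the toy function nested_sum are hypothetical names for illustration.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile; return its result and a text report.

    The report lists the `top` entries sorted by cumulative time, so
    functions that sit above deep call chains appear first.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats('cumulative').print_stats(top)
    return result, stream.getvalue()


def nested_sum(depth):
    """Toy recursive function standing in for nested executor calls."""
    return 0 if depth == 0 else 1 + nested_sum(depth - 1)


if __name__ == '__main__':
    value, report = profile_call(nested_sum, 200)
    print(value)
    print(report)
```

To profile the introspection itself, the same helper could be called as profile_call(graphql_sync, schema, get_introspection_query()). Keep in mind that cProfile adds per-call overhead, so code dominated by many small recursive calls looks slower under the profiler than it is in a normal run.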

Maybe the JavaScript engine can optimize this kind of code much better (some tail call elimination that happens only in JS? But this would not explain why PyPy is so much faster, because it is similar to CPython in that regard). For comparison, I also ran the code with node --no-opt, which took 13s. So some heavy optimization is going on here, but even unoptimized it is still faster than CPython.

I'm currently lacking the time, but will leave this open for further investigation. It would be great if others could look into this as well - maybe I'm overlooking something obvious.

Cito commented 4 years ago

Btw, introspection performance can also be measured as follows. This uses the GitHub schema, which is a more realistic example:

pytest --benchmark-enable tests/benchmarks/test_introspection_from_schema.py

lonerz commented 4 years ago

Interesting that the speedup is that large with PyPy 3.6. I profiled the GraphQL execution code myself, but also couldn't find any obvious bottlenecks. Hopefully others can look at this, and thanks @Cito for your time digging into this!

Cito commented 2 years ago

See also #142 for performance optimization.

graphql-python / graphql-core

Slow introspection query execution for large schemas #101