n1ru4l / envelop

Envelop is a lightweight library allowing developers to easily develop, share, collaborate and extend their GraphQL execution layer. Envelop is the missing GraphQL plugin system.
https://envelop.dev
MIT License
792 stars 129 forks source link

Create a more performant alternative to `onResolverCalled` #1248

Open hayes opened 2 years ago

hayes commented 2 years ago

Is your feature request related to a problem? Please describe.

Currently, any plugin that uses onResolverCalled is probably incurring significantly more performance overhead than is necessary. Many plugins that need to augment resolver behavior need to do this conditionally (on some subset of fields), and even those that wrap every field (like the various tracing plugins) probably should have a way to filter the fields they are wrapping/tracing.

The current approach introduces potentially hundreds of additional awaits into simple synchronous resolvers per requests. Promises and async/await are very efficient, but in testing I did in Pothos, introducing async wrappers around every resolver in conjunction with async_hooks can easily reduce throughput under load by an order of magnitude.

Describe the solution you'd like For many plugins a lot of logic can be moved from resolve time to an earlier point when the schema is being built.

Adding a new hook like onResolverDefinition or something similar that is passed the field definition and returns a OnResolverCalledHook would allow plugins to conditionally add hooks on a per-field basis. This could be added without removing the existing onResolverCalled, with OnResolverCalledHooks being merged when the field is configured and plugins could be migrated to the new API as needed.

This has the additional benefit of pre-comuting values based on directives or extensions on the field to create a wrapper specific to that field. Eg, a validation plugin could create a validator ahead off time, and only wrap the field if it has a validation directive.

In addition to providing a method to conditionally wrap resolvers, the field wrapper created probably should not be an async function. Writing in a style where you use non-promise results directly rather than always awaiting return values is pretty cumbersome, but for a core library the performance gains are probably worth it (This might be an issue to raise separately)

Describe alternatives you've considered

Everything above can also be achieved by side-stepping the onResolverCalled hook and just mutating the schema directly, but this kinda defeats the purpose of having the awesome plugin system that envelop provides.

Additional context

n1ru4l commented 2 years ago

The alternative would actually be https://www.graphql-tools.com/docs/schema-directives#full-mapschema-api

This raises a new concern for me on whether transforming the schema should be the job of envelop at all 🤔 .

E.g. if you want to alter schema behavior that could be achieved via the plugin depending on @graphql-tools/utils.

import { Plugin } from '@envelop/core';
import { defaultFieldResolver } from 'graphql';
import { mapSchema, MapperKind } from '@graphql-tools/utils';

const plugin: Plugin = {
    onSchemaChange({ schema, replaceSchema }) {
      replaceSchema(
        mapSchema(schema, {
          [MapperKind.OBJECT_FIELD]: (field, fieldname, typename) => {
            if (
              field.astNode?.directives?.some(directive => directive.name.value === 'requireAuth') ||
              field.extensions['requireAuth']
            ) {
              return {
                ...field,
                resolve: (...args) => {
                  runAuthLogic()
                  return (field.resolve ?? defaultFieldResolver)(...args);
                },
              };
            }
          },
        })
      );
    },
}
hayes commented 2 years ago

That seems like a decent alternative to me.

In my opinion, the upside of having something like envelop is that it has the potential to standardize, abstract, and enforce best practices and conventions for these kinds of patterns. I don't have a strong opinion on if this should be something envelop supports, but providing hooks that let you safely run hooks before and after resolves execute gives you the opportunity to do things like correctly handle promise or iterator returning revolvers in a more efficient way, without having to duplicate that logic in each plugin, or enforce a filtering function that encourages plugin authors to consider which field they want to wrap. This logic could also be abstracted away in another package, and might not be sufficiently complex to need it in the first place.

Using something like mapSchema gives maximum control to plugins, and would produce the lowest runtime overhead for well written plugins.

The more I think about it, the more I like your alternative, even if it makes writing plugins slightly more complex. If we find anything sufficiently complex that is needed by multiple plugins, we can always provide utility functions for wrapping revolvers with a set of hooks that you apply during the mapping.

hayes commented 2 years ago

If this is a direction that you want to go, I am happy to help update plugins to use this convention

n1ru4l commented 2 years ago

@hayes We would appreciate your help. Let's move all plugins to use mapSchema and start deprecating onResolverCalled.

hayes commented 2 years ago

@n1ru4l Cool, happy to work on this.

I started experimenting with this here: https://github.com/hayes/envelop/pull/1/files

Some things I noticed:

If we want to use mapSchema, we would need to add that as a dependency of envelope, since the core timing plugin currently uses onResolverCalled. I really like libraries that do not have external dependencies, adding something like that to envelop core seems unfortunate.

I was did some experimentation with some resolver wrapping that does not use async functions, and preserves synchronous resolver execution when both plugin wrappers, and the resolvers are synchronous. This logic feels too complex to re-implement in every plugin, and the existing onResolverCalled hook format provides a decent API for wrapping resolvers. I wrote a utility function that uses the same hook shape that could be used with mapSchema, but not sure if something like that would still belong in envelop, or where it would go.

I did some benchmarking, and found that using using the wrapping style linked above results ~60-100% performance improvement depending on if resolvers and plugins are async or not. In all cases there is some improvement, but in simple synchronous cases, improvement was often >100%. JS benchmarks are often not that valuable, and don't always correlate well with real world performance 🤷. That being said, I think preserving synchronous execution when possible is worth while

n1ru4l commented 2 years ago

I prefer keeping @envelop/core dependency-free. The plugins that want to wrap resolvers should have a dependency upon @graphql-tools/utils.

I agree that we can improve the performance of resolver wrapping, however, the better default is to avoid it whenever possible.

hayes commented 2 years ago

I'm not sure what that means for the use-timing plugin since its part of core.

I see 4 paths forward:

  1. Remove plugins that depend on wrapping resolvers from core
  2. Write a simple custom schema mapping function that does not depend on an external dep that maps schemas for wrapping resolvers
  3. Add @graphql-tools/utils as a dep in core
  4. Keep using the current resolver wrapping logic for the use-timing plugin

I think 3 is a bad option, but not sure which of the other options makes the most sense.

prepareTracedSchema is effectively already doing that mapping (although it mutates rather than copies the schema)

I think writing a custom mapping solution would probably be the least bad option

n1ru4l commented 2 years ago

As a first step I would probably remove the onResolverCalled measurement from useTiming and introduce a new standalone plugin that does this? I guess the apollo-tracing plugin already does this. 🤔

hayes commented 2 years ago

@n1ru4l Are you against the idea of creating a way to map revolvers that does not depend on an external dependency? I am hesitant to introduce more breaking changes than necessary.

Even if we do decide to remove resolver timing from the core useTiming plugin, I am still hesitant about both re-implementing resolver wrapping logic in each plugin that currently uses onResolverCalled and adding a dependency on graphql-tools/util to those packages.

n1ru4l commented 2 years ago

The example I shared above can cause an infinite loop if you have more than one plugin that is calling replaceSchema within onSchemaChange.

I noticed that while migrating the redwood useRedwoodDirective plugin over here: https://github.com/redwoodjs/redwood/pull/4760

export const useExampleSchemaTransformPlugin = (): Plugin => {
  /**
   * This symbol is added to the schema extensions for checking whether the transform got already applied.
   */
  const didMapSchemaSymbol = Symbol('useExampleSchemaTransformPlugin.didMapSchemaSymbol')
  return {
    onSchemaChange({ schema, replaceSchema }) {
      /**
       * Currently graphql-js extensions typings are limited to string keys.
       * We are using symbols as each useExampleSchemaTransformPlugin plugin instance should use its own unique identifier.
       */
      // eslint-disable-next-line @typescript-eslint/ban-ts-comment
      // @ts-ignore
      if (schema.extensions?.[didMapSchemaSymbol] === true) {
        return
      }
      const transformedSchema = mapSchema(schema, ....)
      // eslint-disable-next-line @typescript-eslint/ban-ts-comment
      // @ts-ignore
      transformedSchema.extensions = {
        ...schema.extensions,
        [didMapSchemaSymbol]: true,
      }
      replaceSchema(transformedSchema)
    },
  }
}

Internally we also have been talking about making envelop completely schema unaware, which would be a bigger change, but make envelop less schema-focused. We are still not 100% sure whether this would work out and the other implications this would have.

hayes commented 2 years ago

I ended up writing https://pothos-graphql.dev/docs/plugins/tracing because I was not clear on a path forward here.

The more I think about it though, I like the separation between using envelope for tracing various phases of execution, but not messing with resolvers directly, and letting pothos (or other schema specific tools) handle wrapping resolvers.

For my use case, I really wanted to have fine grained control over which fields were wrapped/traced, and have the ability to modify behavior along side each field.

I'm not sure if this would make sense in a lot of other apps though, having to set up tracing in several different libraries probably isn't desirable for most people 🤷.

FWIW I like the idea of making envelop less schema aware