wpengine / wp-graphql-content-blocks

Plugin that extends WPGraphQL to support querying (Gutenberg) Blocks as data
https://faustjs.org/docs/gutenberg/wp-graphql-content-blocks
GNU General Public License v2.0
104 stars 12 forks source link

Possible Performance Issues #134

Open jasonbahl opened 1 year ago

jasonbahl commented 1 year ago

It was reported in Slack that having WPGraphQL Content Blocks active at the same time as WPGraphQL Gutenberg leads to performance problems.

Steps to reproduce:

See degradation in server performance.

Slack conversation here for further context: https://wp-graphql.slack.com/archives/C3NM1M291/p1689789762004369

jasonbahl commented 1 year ago

My first step is going to be to try and reproduce this, even if un-scientifically. I plan to do a standard query, such as query for a list of posts.

Capture the baseline with just WPGraphQL active.

Then capture with just one plugin active, then the other, then both.

See if we can at least kind of reproduce this.

jasonbahl commented 1 year ago

Reproduction:

Just WPGraphQL

{
  posts {
    nodes {
      id
      title
      date
      content
    }
  }
}

I executed this 7 times and here's the results:

CleanShot 2023-07-21 at 11 40 19

WPGraphQL + WPGraphQL Gutenberg

With WPGraphQL and WPGraphQL for Gutenberg (v0.4.1) active, I first indexed the Gutenberg Blocks from the WPGraphQL for Gutenberg settings:

CleanShot 2023-07-21 at 11 42 48

Then I executed the same posts query as above (not querying for blocks at all)

CleanShot 2023-07-21 at 11 43 56

WPGraphQL + WPGraphQL Content Blocks

With WPGraphQL and WPGraphQL Content Blocks being the only active plugins:

CleanShot 2023-07-21 at 11 45 05

WPGraphQL + WPGraphQL Content Blocks + WPGraphQL Gutenberg

With all 3 plugins active, and still executing the same query for posts (not querying blocks)

CleanShot 2023-07-21 at 11 46 11


It does seem that there's a perf hit with either WPGraphQL Gutenberg or WPGraphQL Content Blocks, and even more when running both together.

ChrisWiegman commented 1 year ago

Thank you, @jasonbahl, while I doubt this will be a quick fix it is absolutely something we'll be looking at as soon as we can get to it.

blakewilson commented 1 year ago

Created a ticket to track this in our internal backlog. For internal users, it's available here:

https://wpengine.atlassian.net/browse/MERL-1130

ChrisWiegman commented 1 year ago

Thanks, Blake. We'll get that refined and, at the latest, in our next Sprint.

theodesp commented 1 year ago

We might need to add a performance testing pipeline to keep track of this. I suspect the reason behind this is calling WPGraphQL filters on interfaces which they might impact the code with hidden complexity.

theodesp commented 1 year ago

Namely: https://github.com/search?q=repo%3Awpengine%2Fwp-graphql-content-blocks%20register_graphql_interface_type&type=code

and https://github.com/search?q=repo%3Awpengine%2Fwp-graphql-content-blocks+register_graphql_interface&type=code

justlevine commented 1 year ago

We might need to add a performance testing pipeline to keep track of this.

Been looking into this for WPGraphQL and my own work. I suggest taking a look at this repo, which is building off the work by the core performance team: https://github.com/swissspidy/compare-wp-performance

I suspect the reason behind this is calling WPGraphQL filters on interfaces which they might impact the code with hidden complexity.

My own money is on the use of eagerlyLoadType (which I'm guessing is related to https://github.com/wp-graphql/wp-graphql/issues/2598).

jasonbahl commented 1 year ago

My own money is on the use of eagerlyLoadType (which I'm guessing is related to https://github.com/wp-graphql/wp-graphql/issues/2598).

Ya, this is likely.

My current thinking is that we can maybe disable the eagerlyLoadType functionality for non-introspection requests.

Would be good to "prove" that this is the culprit with a reproduction unassociated with the direct wp-graphql-content-types plugin code. (or even maybe easier, see if it is solved by setting "eagerlyLoadType => false")

If we're able to confirm this is the culprit, then I think we should explore options for allowing types to eagerlyLoad for introspection (GraphiQL and other tooling needs) but not necessarily for run-time of queries. 🤔

theodesp commented 1 year ago

We could also try to remove any calls to register_graphql_interfaces_to_types since it may run linearly on each block so it would clog the pipeline. We should try instead use the get_block_interfaces.

especially with the Anchor support:

https://github.com/wpengine/wp-graphql-content-blocks/blob/7789be55aca5de807d4b9c48b993388a73900429/includes/Field/BlockSupports/Anchor.php#L51-L52

jasonbahl commented 1 year ago

WPGraphQL + WPGraphQL Content Blocks active

With just WPGraphQL and WPGraphQL active, if I modify the code in WPGraphQL Content Blocks and change all instances of eagerlyLoadType => true to false and re-execute the queries, I get the following results:

CleanShot 2023-07-24 at 12 57 08

This is promising.

The tradeoff, however, is that individual Blocks are no longer visible in GraphiQL IDE. This is why I believe we might be able to detect if a query is an introspection query and if eagerlyLoadType => true we respect the true, else we let it remain false.

This would allow introspection queries for tools like GraphiQL to get the Types that aren't directly resolved via a field (individual Block objects, for example) while keeping run-time performance for other queries higher.

Not 💯 sure this would work without other negative side-effects, but worth exploring I think.

Another thing I think we could explore is allowing fields on types to be callbacks themselves.

i.e. instead of registering a Type like so:

register_graphql_object_type( 'MyType', [
  'description' => __( 'My Type description that will be translated', 'text-domain' ),
  'fields' => [
    'fieldOne' => [ 
      'type' => 'String',
      'description' => __( 'Another thing that will be translated', 'wp-graphql' ),
    ],
  ],
] );

We might be able to allow types to be registered like so:

register_graphql_object_type( 'MyType', 
   // Instead of a config array, we return a callback that will execute when the Type is instantiated
  function() {
    return [
      'description' => __( 'My Type description that will be translated', 'text-domain' ),
      'fields' => [
        // Instead of a config array we provide a callback that will execute if/when the specific field is used in a request 
        'fieldOne' => function() { 
          return [ 
            'type' => 'String',
            'description' => __( 'Another thing that will be translated', 'wp-graphql' ),
          ];
        },
      ]
    ];
  }
);

This proposal most likely would require some updates to core WPGraphQL, but would likely provide significant improvements as it would lead to a lot of code executing only when actually needed.

Also, this is just a theory, but I believe it would have impact and might help with other things like internationalization that have reported other similar performance issues.

jasonbahl commented 1 year ago

possibly related: https://github.com/wp-graphql/wp-graphql/issues/2721

justlevine commented 1 year ago

@jasonbahl an alternative to https://github.com/wp-graphql/wp-graphql/issues/2598 may possibly be cache some of the steps in a transient so it doesn't need to be rebuilt on every request ... I cant find an existing ticket, but I remember a recent discussion about it on Slack.

jasonbahl commented 1 year ago

@justlevine caching is easy. Cache invalidation is the hard part. What would the cache invalidation strategy of the Schema look like? The Schema could change via direct code calls to hooks/filters/registergraphql* functions, post types being registered in code or via CPT UI etc, fields added by plugins like ACF / ACM / MetaBox.io, plugins like Pods, etc.

If someone has really thought through how to go about caching the schema and more importantly invalidating the Schema, I'm interested in entertaining it, but I'm not seeing a path here. It would be like trying to cache the callbacks of "do_action" or "apply_filters".

I think the real problem, or at least my theory of it, is that we're causing a lot more things to execute than needed. For example, field and Type descriptions, etc. Those are only actually needed for Introspection, but I believe we're still executing the translation functions even when the field(s) aren't needed for any given query.

justlevine commented 1 year ago

@jasonbahl agree with the concerns, but those are all engineering problems. Don't want to get too much into the weeds here for something that's solely relevant upstream, but essentially 1) we can chunk different parts of the schema generation process so the whole thing doesn't need to be invalidated. 2) we can hash the pre-resolved configs so any changes self-invalidate. 3) WPGraphQL Smart Cache already introduces the concept that the server might be sending stale data that needs to be invalidated manually. Combined with good defaults, a constant (inherited from GRAPHQL_DEBUG) for disabling on dev environments, and a manual invalidation mechanism, we really should be fine.

I think the real problem, or at least my theory of it, is that we're causing a lot more things to execute than needed.

💯. Caching is an additive strategy for the same goal of preventing unnecessary calls in the first place .

josephfusco commented 1 year ago

Hey all, there doesn't seem to be anything actionable at the moment but we are tracking this conversation as this is important to us.

jasonbahl commented 1 year ago

Hey all, there doesn't seem to be anything actionable at the moment

@josephfusco as mentioned here

Would be good to "prove" that this is the culprit with a reproduction unassociated with the direct wp-graphql-content-types plugin code. (or even maybe easier, see if it is solved by setting "eagerlyLoadType => false")

Currently we know that there is performance degradation when activating this plugin. The action now is to identify if it's something this plugin is doing specifically or not.

The action is to see if any plugin that uses eagerlyLoadType has the same performance problems or just this plugin? Is there something else in this plugin that can be fixed?

justlevine commented 1 year ago

Currently we know that there is performance degradation when activating this plugin. The action now is to identify if it's something this plugin is doing specifically or not.

The action is to see if any plugin that uses eagerlyLoadType has the same performance problems or just this plugin? Is there something else in this plugin that can be fixed?

Or even a regular ol PHP profile run to see if any of the wp-graphql-content-blocks-specific calls are overly heavy, even if its a 1-time audit and not the actual pipeline (yet) that @theodesp suggested( https://github.com/wpengine/wp-graphql-content-blocks/issues/134#issuecomment-1647631128)

mindctrl commented 1 year ago

Clarification: we have an issue in the next sprint (starts next week) to look at this. We can test the ideas here and report our findings.

jasonbahl commented 1 year ago

We could also try to remove any calls to register_graphql_interfaces_to_types since it may run linearly on each block so it would clog the pipeline. We should try instead use the get_block_interfaces.

@theodesp yes, this makes sense to me.

the intent of register_graphql_interfaces_to_types() is more for codebases extending Types they themselves do not have control over.

For example, a plugin that wants to add an interface to the core Post type, or similar.

In this case, since WPGraphQL Content Blocks is the same codebase registering blocks and the interface, it makes more sense (to me at least) to define the interfaces the block applies at the time the block itself is being registered

mindctrl commented 1 year ago

Did some quick tests of vanilla site, using same query above.

{
  posts {
    nodes {
      id
      title
      date
      content
    }
  }
}
WPGraphQL only
65ms
57ms
65ms
53ms
60ms

WPGraphQL + WPGraphQL Content Blocks
96ms
92ms
100ms
97ms
92ms

WPGraphQL + WPGraphQL Content Blocks (eagerlyLoadType false)
95ms
76ms
86ms
58ms
73ms
77ms
66ms
79ms
68ms
66ms
66ms

WPGraphQL + WPGraphQL Content Blocks (eagerlyLoadType false) + WPGraphQL Gutenberg
113ms
96ms
95ms
106ms
99ms
97ms

WPGraphQL + WPGraphQL Content Blocks (eagerlyLoadType false) + WPGraphQL Gutenberg (eagerlyLoadType false)
93ms
81ms
80ms
86ms
78ms
theodesp commented 1 year ago

We could also try to remove any calls to register_graphql_interfaces_to_types since it may run linearly on each block so it would clog the pipeline. We should try instead use the get_block_interfaces.

@theodesp yes, this makes sense to me.

the intent of register_graphql_interfaces_to_types() is more for codebases extending Types they themselves do not have control over.

For example, a plugin that wants to add an interface to the core Post type, or similar.

In this case, since WPGraphQL Content Blocks is the same codebase registering blocks and the interface, it makes more sense (to me at least) to define the interfaces the block applies at the time the block itself is being registered

Will create a jira ticket for this.

jeromecovington commented 10 months ago

Is this performance issue still noticeable? I am hoping to chart a migration path away from WPGraphQL Gutenberg but I will need both plugins activated as I cutover each of my queries.

theodesp commented 10 months ago

Hey @jeromecovington. I did some Load testing in a ticket last time.

https://github.com/wpengine/wp-graphql-content-blocks/pull/146

There are some performance improvements since the last update at about 15% more requests/s processed.

There is a performance penalty (about 15%) when we use the eagerlyLoadType: true when registering some types.

If we set this to false here

So when we do that we get back 15% more requests/s processed. However we do not see the GraphQL types in the documentation explorer.

This is a drawback and I think this is an improvement that needs to be done from WPGraphQL side.

In general terms the more blocks are available into the system then more work is needed to register and declare them. So if you have 300 blocks available then every time a request comes it has to go through the process.

Also AFAIK having both WPGraphQL Gutenberg and Content Blocks are are doing the work twice so you are registering the same blocks and block attributes two times for each block thus you may see performance issues.