gatsbyjs / gatsby-source-wordpress-experimental

The upcoming v4 of gatsby-source-wordpress, currently in beta
MIT License

gatsby develop runs out of memory when trying to pull in WordPress data; is there any way to limit what it pulls in? #402

Closed meekmachine closed 3 years ago

meekmachine commented 3 years ago

I am attempting to use gatsby-source-wordpress-experimental connected to a WordPress site with several years' worth of content, approximately 30,000 posts. The WordPress site runs on shared hosting and we have no admin access to it. I was able to get it working with the older Gatsby and WordPress plugins, but after updating I now see this message:

This is either your first build or the cache was cleared.
Please wait while your WordPress data is synced to your Gatsby cache.

Maybe now's a good time to get up and stretch? :D

Then the WordPress endpoint eventually runs out of memory, even when I set the timeout and concurrency as suggested:

ERROR #gatsby-source-wordpress-experimental_111007

gatsby-source-wordpress Your wordpress server at http://panthernow.com/graphql appears to be overloaded.

Try reducing the requestConcurrency for content updates or the previewRequestConcurrency for previews:

{
  resolve: 'gatsby-source-wordpress-experimental',
  options: {
    schema: {
      requestConcurrency: 5, // currently set to 1
      previewRequestConcurrency: 2, // currently set to 2
    }
  },
}

The GATSBY_CONCURRENT_REQUEST environment variable is no longer used to control concurrency. If you were previously using that, you'll need to use the settings above instead.

Since I am not able to allocate more memory to the WordPress server, is there any way to get around this issue, either by populating the cache manually or by filtering the initial cache-populating query by date so that it pulls in less data and does not run out of memory?

TylerBarnes commented 3 years ago

Hi @meekmachine, you can limit the total number of nodes that are fetched using this option: https://github.com/gatsbyjs/gatsby-source-wordpress-experimental/blob/master/docs/plugin-options.md#type__all-object

You should also be able to check the current NODE_ENV to set different limits for each environment.

{
  resolve: `gatsby-source-wordpress-experimental`,
  options: {
    type: {
      __all: {
        limit: process.env.NODE_ENV === `development` ? 500 : 10000,
      },
    },
  },
},
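
As a minimal sketch of the same idea, the limit can be computed by a small helper so it is easy to override for a one-off build. The helper name `resolveNodeLimit` and the `WP_NODE_LIMIT` environment variable are assumptions made for illustration; neither is part of the plugin. Only `type.__all.limit` comes from the option shown above.

```javascript
// Hypothetical helper: pick the per-type node limit for the current environment.
// WP_NODE_LIMIT is an assumed, illustrative variable name, not a plugin setting.
function resolveNodeLimit(env = process.env) {
  const override = Number.parseInt(env.WP_NODE_LIMIT ?? "", 10);
  if (Number.isInteger(override) && override > 0) {
    return override; // an explicit override wins
  }
  // Same defaults as the snippet above: small in development, larger otherwise.
  return env.NODE_ENV === "development" ? 500 : 10000;
}

const wordpressPlugin = {
  resolve: "gatsby-source-wordpress-experimental",
  options: {
    type: {
      __all: {
        limit: resolveNodeLimit(),
      },
    },
  },
};

// In gatsby-config.js this entry would be placed in the exported `plugins` array.
const config = { plugins: [wordpressPlugin] };
```

With this in place, running a build with `WP_NODE_LIMIT=2000` set in the shell would cap every type at 2000 nodes without editing the config.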
meekmachine commented 3 years ago

Hi @TylerBarnes, thanks for your response! This seems to be working well with the limit set to 2000 nodes for each type. However, how can I load in all 50,000 or so posts from the old site?

TylerBarnes commented 3 years ago

@meekmachine what did you try setting those options to? You can also try lowering the number of nodes fetched per request: https://github.com/gatsbyjs/gatsby-source-wordpress-experimental/blob/master/docs/plugin-options.md#schemaperpage-int

{
  resolve: `gatsby-source-wordpress-experimental`,
  options: {
    schema: {
      perPage: 100, // lower this
    },
  },
},
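
To get a feel for the trade-off, here is a rough back-of-the-envelope sketch. It assumes each request fetches at most `perPage` nodes of one type, so a lower `perPage` means lighter requests but more of them. The function name `requestsPerType` is made up for this example and is not a plugin API.

```javascript
// Assumption: one request returns at most `perPage` nodes of a given type,
// so fetching `nodeCount` nodes takes ceil(nodeCount / perPage) requests.
function requestsPerType(nodeCount, perPage) {
  if (!Number.isInteger(perPage) || perPage <= 0) {
    throw new RangeError("perPage must be a positive integer");
  }
  return Math.ceil(nodeCount / perPage);
}

console.log(requestsPerType(2000, 100)); // 20 requests
console.log(requestsPerType(2000, 25)); // 80 requests
```

Smaller pages should reduce the memory the WordPress server needs per request, at the cost of a longer initial sync.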
meekmachine commented 3 years ago

@TylerBarnes Thanks, I will try that. Are the request concurrency and preview request concurrency settings also helpful for this type of issue? Here is what I currently have; if any of this is unnecessary or outdated, please let me know:


{
      resolve: `gatsby-source-wordpress-experimental`,
      options: {
        // the only required plugin option for WordPress is the GraphQL url.
        url:
          process.env.WPGRAPHQL_URL ||
          `http://****.com/graphql`,
          schema: {timeout: 2147483647},
          requestConcurrency: 5, // currently set to undefined
          previewRequestConcurrency: 2, // currently set to undefined
          baseUrl: `http://localhost:8888/GatsbyWP`,
          type: {
            __all: {
              limit: 2000,
            },
          },
      },
    },
TylerBarnes commented 3 years ago

Ah, looks like you've set the concurrency options almost in the right spot but not quite :) They should be inside the schema object. You can also remove the baseUrl setting in this plugin.

{
  resolve: `gatsby-source-wordpress-experimental`,
  options: {
    url: process.env.WPGRAPHQL_URL || `http://****.com/graphql`,
    schema: {
      timeout: 2147483647,
      requestConcurrency: 5,
      previewRequestConcurrency: 2,
    },
    type: {
      __all: {
        limit: 2000,
      },
    },
  },
}
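
If the same mistake is likely to creep in again, a small check along these lines could catch it. This is only a sketch: `nestSchemaOptions` is a hypothetical helper, not something the plugin provides, and the list of keys is taken from the options discussed in this thread.

```javascript
// Keys that, per the discussion in this thread, belong inside `options.schema`.
const SCHEMA_KEYS = [
  "timeout",
  "perPage",
  "requestConcurrency",
  "previewRequestConcurrency",
];

// Hypothetical helper: move misplaced top-level keys into `schema`
// and drop `baseUrl`, which the reply above says can be removed.
function nestSchemaOptions(options) {
  const { baseUrl, schema = {}, ...rest } = options; // baseUrl is intentionally discarded
  const fixedSchema = { ...schema };
  for (const key of SCHEMA_KEYS) {
    if (key in rest) {
      // A value already set inside `schema` takes precedence.
      if (!(key in fixedSchema)) fixedSchema[key] = rest[key];
      delete rest[key];
    }
  }
  return { ...rest, schema: fixedSchema };
}

const fixed = nestSchemaOptions({
  url: "http://example.com/graphql", // placeholder URL for illustration
  schema: { timeout: 2147483647 },
  requestConcurrency: 5,
  previewRequestConcurrency: 2,
  baseUrl: "http://localhost:8888/GatsbyWP",
  type: { __all: { limit: 2000 } },
});

console.log(fixed.schema);
```

The result has the same shape as the corrected config above, with both concurrency settings nested under `schema`.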
meekmachine commented 3 years ago

Resolving this. I may have more questions later. Thank you for your help!

TylerBarnes commented 3 years ago

Glad I could help, reach out anytime!