wp-graphql / wp-graphql

:rocket: GraphQL API for WordPress
https://www.wpgraphql.com
GNU General Public License v3.0

v1.8.0 causing gatsby-source-wordpress error #2343

Closed pauljavascripting closed 2 years ago

pauljavascripting commented 2 years ago

v1.8.0 was preventing my headless WordPress site from building successfully, with the following error:

gatsby-source-wordpress Found a duplicate ID in WordPress - this means you will have fewer nodes in Gatsby than in WordPress. This will need to be resolved in WP by identifying and fixing the underlying bug with your WP plugins or custom code.

I reverted to v1.7.2 and the error disappeared.

Hope this helps someone out!

justlevine commented 2 years ago

@pauljavascripting there's not really a lot to go on here. There have been some reports that v1.8.0 is catching invalid GraphQL names that were sneaking by undetected in previous versions, but I haven't seen and can't replicate a taxonomy error like the one you're reporting.

To properly debug:

  1. Deactivate all plugins and custom code except for WPGraphQL and WPGatsby, and confirm gatsby-source-wordpress can introspect.
  2. Enable your plugins / custom WPGraphQL functions one-by-one, until you get the error back.

jasonbahl commented 2 years ago

Hello, wpgraphql.com is running WPGraphQL v1.8.0 and WPGatsby v2.3.2 and I'm able to successfully build my site on Gatsby.

I'd like to understand more about this regression, as it seems like it's affecting some other folks in the WPGraphQL Slack channel as well (see: https://wp-graphql.slack.com/archives/C3NM1M291/p1650019084899219)

I'm not currently able to reproduce this based on my use of WPGraphQL + Gatsby.

If someone could provide some steps to reproduce, that would be helpful.

A Gatsby Repo + access to a staging site for the WordPress install that's experiencing the issue would be great.

I have a hunch that it might be related to either certain data sets, or a certain combination of plugins (like WPGraphQL for ACF, or WPGraphQL for WooCommerce, or something to that tune).

Would be good to be able to reproduce so we can troubleshoot further.

tsdexter commented 2 years ago

I am also having similar issues:

warn  gatsby-source-wordpress  Found a duplicate ID in WordPress - this means
 you will have fewer nodes in Gatsby than in WordPress. This will need to be
 resolved in WP by identifying and fixing the underlying bug with your WP plugins
  or custom code.

 info  gatsby-source-wordpress  #524 (/tag/centre-for-social-entrepreneurship/)
 is a duplicate of 524 (/tag/centre-for-social-entrepreneurship/)

 info  gatsby-source-wordpress  #886 (/tag/centre-for-social-entrepreneurship-4/)
  is a duplicate of 886 (/tag/centre-for-social-entrepreneurship-4/)

Odd that it's saying it's a duplicate of itself.

Probably not very helpful, but looking at the data, there are definitely no duplicate IDs though there are very similar slugs and duplicate names on those records:

image

jasonbahl commented 2 years ago

@tsdexter would you be able to provide me with a way to reproduce this? Is your endpoint public? Or you can join the WPGraphQL Slack and DM me.

I believe there's a legit issue here, but I've not been able to personally reproduce this yet.

I have a hunch that it's related to some specific set / combination of data. I'd love to get to a point where I could reproduce the bug so I could investigate it further.

Let me know if you could maybe get me access to a staging site or an export of your WordPress data, etc.

jasonbahl commented 2 years ago

@tsdexter if access to a staging site or an export isn't practical, you could explore some of the debug options in Gatsby Source WordPress and see if you can pin down the specific queries that are causing the bug. https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby-source-wordpress/docs/plugin-options.md#debuggraphqlshowqueryonerror
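For reference, a minimal sketch of how the debug option linked above might be enabled in gatsby-config.js. The option path debug.graphql.showQueryOnError comes from the linked plugin docs; the url is a placeholder and not taken from this thread.

```javascript
// gatsby-config.js (sketch). The url is a placeholder assumption.
const config = {
  plugins: [
    {
      resolve: "gatsby-source-wordpress",
      options: {
        url: "https://example.com/graphql",
        debug: {
          graphql: {
            // Log the failing query when WPGraphQL returns an error,
            // so it can be re-run by hand in WPGraphiQL.
            showQueryOnError: true,
          },
        },
      },
    },
  ],
};

module.exports = config;
```

With this set, a query printed on failure can be pasted into WPGraphiQL to check whether WPGraphQL itself returns the unexpected result.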

tsdexter commented 2 years ago

@jasonbahl I should be able to get you into one of our separated staging sites... let me make sure it's reproducing there and then I'll DM you

tsdexter commented 2 years ago

@jasonbahl in the meantime, not sure if it's the same for the other reporter but the duplicates for me are only in Tag type... excluding the type in the source plugin options stops the issue

do you have many tags in your repro?
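The workaround mentioned above (excluding the type in the source plugin options) might look roughly like the sketch below. It assumes the type.Tag.exclude option of gatsby-source-wordpress; the url is a placeholder. Note that this only hides the symptom, since no Tag nodes are sourced into Gatsby at all.

```javascript
// gatsby-config.js (sketch). The url is a placeholder assumption.
const config = {
  plugins: [
    {
      resolve: "gatsby-source-wordpress",
      options: {
        url: "https://example.com/graphql",
        type: {
          // Skip sourcing Tag nodes entirely; tag data will not be
          // available in Gatsby while this is set.
          Tag: { exclude: true },
        },
      },
    },
  ],
};

module.exports = config;
```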

justlevine commented 2 years ago

This is the function in gatsby-source-wordpress that throws the error. Nothing is jumping out at me, but perhaps it's compensating for #2293 somehow (i.e. manually restoring the missing terms, which post-fix causes the same post-type to be added a second time)?

jasonbahl commented 2 years ago

@tsdexter do you have more than 100 tags on the site that's having problems?

jasonbahl commented 2 years ago

I only have a handful of tags on my site, so perhaps the way to reproduce is to have more than 100 tags so Gatsby paginates them. Or I can lower the pagination limit in Gatsby instead.

I'll look into reproducing this way.

jasonbahl commented 2 years ago

ok, I think I might have reproduced now.

I created 11 tags named (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11) and ran the following query:

{
  tags {
    pageInfo {
      hasNextPage
      endCursor
    }
    nodes {
      id
      name
    }
  }
}

I would expect to get back 10 tags and pageInfo.hasNextPage should be true

Instead I got the following payload back:

{
  "data": {
    "tags": {
      "pageInfo": {
        "hasNextPage": false,
        "endCursor": "YXJyYXljb25uZWN0aW9uOjM0OA=="
      },
      "nodes": [
        {
          "id": "dGVybTozNDE=",
          "name": "1"
        },
        {
          "id": "dGVybTozNTA=",
          "name": "10"
        },
        {
          "id": "dGVybTozNTE=",
          "name": "11"
        },
        {
          "id": "dGVybTozNDI=",
          "name": "2"
        },
        {
          "id": "dGVybTozNDM=",
          "name": "3"
        },
        {
          "id": "dGVybTozNDQ=",
          "name": "4"
        },
        {
          "id": "dGVybTozNDU=",
          "name": "5"
        },
        {
          "id": "dGVybTozNDY=",
          "name": "6"
        },
        {
          "id": "dGVybTozNDc=",
          "name": "7"
        },
        {
          "id": "dGVybTozNDg=",
          "name": "8"
        }
      ]
    }
  }
}

Tag 9 is missing, and hasNextPage is false.

I believe @justlevine is correct in identifying which PR caused the regression. Hope to have a fix soon!

Thanks @tsdexter for the added information to help us nail this down 🙏🏻

tsdexter commented 2 years ago

No problem. I couldn't get the repro working on staging, so I'm glad you figured it out.

Yes, I do have 1300+ tags on the site with the issue.

jasonbahl commented 2 years ago

@tsdexter Ok, false alarm. Still not able to properly reproduce.

The query I tried was returning false for hasNextPage, which is expected per the Relay spec: if first or last is not defined, hasNextPage should be false.

CleanShot 2022-04-22 at 13 43 47

When I add first to the query, I get hasNextPage: true as expected.

CleanShot 2022-04-22 at 13 43 36

Gatsby makes paginated queries using the first/last after/before arguments, so I don't believe what I thought I reproduced is actually the issue.
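The first/after style of pagination described above can be sketched as follows. This is a simplified illustration, not the gatsby-source-wordpress implementation: fetchTagsPage is a hypothetical in-memory stand-in for the WPGraphQL tags connection, and its cursors are plain array indexes rather than real WPGraphQL cursors.

```javascript
// Hypothetical data set: 11 tags, mirroring the test described above.
const allTags = Array.from({ length: 11 }, (_, i) => ({
  id: `tag-${i + 1}`,
  name: String(i + 1),
}));

// Stand-in for `tags(first: $first, after: $after)`.
function fetchTagsPage({ first, after }) {
  const start = after == null ? 0 : Number(after) + 1;
  const nodes = allTags.slice(start, start + first);
  const lastIndex = start + nodes.length - 1;
  return {
    nodes,
    pageInfo: {
      // True only while items remain after the last one returned.
      hasNextPage: lastIndex < allTags.length - 1,
      endCursor: nodes.length ? String(lastIndex) : null,
    },
  };
}

// Keep requesting pages, feeding each endCursor back in as `after`.
function fetchAllTags(first) {
  const collected = [];
  let after = null;
  let requests = 0;
  let hasNextPage = true;
  while (hasNextPage) {
    const page = fetchTagsPage({ first, after });
    requests += 1;
    collected.push(...page.nodes);
    hasNextPage = page.pageInfo.hasNextPage;
    after = page.pageInfo.endCursor;
  }
  return { collected, requests };
}

const result = fetchAllTags(5);
console.log(result.collected.length, result.requests); // 11 3
```

The loop only terminates correctly if each endCursor moves the next request forward, which is why a cursor problem on the server would surface in the client as repeated or missing nodes.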

jasonbahl commented 2 years ago

@pauljavascripting would it be possible to get me access to a staging site? I'm not able to reproduce this still. I thought I had a lead, but (see last comment) it was a false lead.

One thing you might want to do is clear the Gatsby cache and do a full rebuild to see if that makes a difference.

pauljavascripting commented 2 years ago

Hello Jason, sorry for the slow reply! I'd love to give you access but I'm not sure my tech client would agree! Would a list of plugins help? I can confirm that I don't have WooCommerce installed though. I was also able to resolve this issue by reverting to v1.7.2 of WPGraphQL! I'd also like to add that I think WPGraphQL is brilliant. Thank you!!

justlevine commented 2 years ago

@pauljavascripting @tsdexter can you confirm whether you get the correct nodes in non-gatsby WPGraphiQL?

jasonbahl commented 2 years ago

@pauljavascripting a list of plugins would be a great start. Talking with someone in Slack, we think there might be a conflict with some custom ordering plugins, like: https://wordpress.org/plugins/post-types-order/

Would be good to narrow down and see if we can figure things out.

pauljavascripting commented 2 years ago

Hey Jason - here's a list of the activated plugins:

Add WPGraphQL SEO, Advanced Custom Fields, Advanced Custom Fields: Extended, ACF Extended, Classic Editor, Custom Post Type UI, JAMstack Deployments, WP Gatsby, WP GraphQL, WPGraphQL for Advanced Custom Fields, Yoast SEO

jasonbahl commented 2 years ago

@pauljavascripting thanks!

jasonbahl commented 2 years ago

I was also able to resolve this issue by reverting to v1.7.2 of WPGraphQL

@pauljavascripting if you update to WPGraphQL v1.8 again and do a gatsby clean and a fresh, uncached build, does it still fail?

Hey Jason - here's a list of the activated plugins:

@pauljavascripting I've tried to reproduce this by taking the following steps:

Setup WordPress

I set up a local WordPress install with the following plugins installed and activated:

+-------------------------+--------+-----------+---------+
| name                    | status | update    | version |
+-------------------------+--------+-----------+---------+
| add-wpgraphql-seo       | active | none      | 4.17.0  |
| acf-extended            | active | none      | 0.8.8.7 |
| acf-pro                 | active | available | 5.11    |
| classic-editor          | active | none      | 1.6.2   |
| custom-post-type-ui     | active | none      | 1.11.2  |
| wp-jamstack-deployments | active | none      | 1.1.1   |
| wordpress-importer      | active | none      | 0.7     |
| wp-gatsby               | active | none      | 2.3.2   |
| wp-graphql              | active | none      | 1.8.0   |
| wp-graphql-acf-origin   | active | none      | 0.5.3   |
| wordpress-seo           | active | none      | 18.6    |
+-------------------------+--------+-----------+---------+

I then imported dummy WordPress content from wptest.io

Gatsby Setup

I created a new Gatsby site by running npm init gatsby and walking through the wizard to connect it to my new WPGraphQL install.

Here's my gatsby-config.js

module.exports = {
  siteMetadata: {
    title: `Issue 2343`,
    siteUrl: `https://www.yourdomain.tld`
  },
  plugins: [{
    resolve: 'gatsby-source-wordpress',
    options: {
      "url": "http://issue2343.local/graphql"
    }
  }, {
    resolve: 'gatsby-plugin-manifest',
    options: {
      "icon": "src/images/icon.png"
    }
  }]
};

Run Gatsby

I then started gatsby with the gatsby develop command

This consumed all the content from my new site:

➜  gatsby git:(master) gatsby develop
success compile gatsby files - 0.739s
success load gatsby config - 0.035s
success load plugins - 1.462s
success onPreInit - 0.003s
success initialize cache - 0.224s
success copy gatsby files - 0.136s
success Compiling Gatsby Functions - 0.250s
success onPreBootstrap - 0.263s
success  gatsby-source-wordpress  ensuring plugin requirements are met - 2.157s

info  gatsby-source-wordpress 

    This is either your first build or the cache was cleared.
    Please wait while your WordPress data is synced to your Gatsby cache.

    Maybe now's a good time to get up and stretch? :D

success  gatsby-source-wordpress  ingest WPGraphQL schema - 2.938s
success createSchemaCustomization - 5.129s
success  gatsby-source-wordpress  fetch root fields - 0.534s
success  gatsby-source-wordpress  ContentType - 1.058s - fetched 3
success  gatsby-source-wordpress  Comment - 1.807s - fetched 31
success  gatsby-source-wordpress  MenuItem - 2.427s - fetched 0
success  gatsby-source-wordpress  Category - 2.557s - fetched 42
success  gatsby-source-wordpress  Menu - 2.764s - fetched 0
success  gatsby-source-wordpress  PostFormat - 3.576s - fetched 9
success  gatsby-source-wordpress  Page - 3.810s - fetched 16
success  gatsby-source-wordpress  Tag - 4.903s - fetched 16
success  gatsby-source-wordpress  Taxonomy - 5.299s - fetched 3
success  gatsby-source-wordpress  UserRole - 5.638s - fetched 0
success  gatsby-source-wordpress  User - 6.538s - fetched 7
success  gatsby-source-wordpress  Post - 6.879s - fetched 35
success  gatsby-source-wordpress  MediaItem - 12.458s - fetched 40
success  gatsby-source-wordpress  creating nodes - 12.463s
success  gatsby-source-wordpress  fetching nodes - 19.394s - 202 total
success Downloading remote files - 11.274s - 40/40 3.55/s
success Checking for changed pages - 0.001s
success source and transform nodes - 19.503s
success building schema - 0.611s
success createPages - 0.001s
success createPagesStatefully - 0.092s
info Total nodes: 276, SitePage nodes: 4 (use --verbose for breakdown)
success Checking for changed pages - 0.001s
success write out redirect data - 0.002s
success Build manifest and related icons - 0.145s
success onPostBootstrap - 0.149s
info bootstrap finished - 31.909s
success onPreExtractQueries - 0.001s
success extract queries from components - 2.043s
success write out requires - 0.005s
success run page queries - 0.016s - 3/3 186.25/s

You can now view issue-2343 in the browser.

  http://localhost:8000/

View GraphiQL, an in-browser IDE, to explore your site's data and schema

  http://localhost:8000/___graphql

Note that the development build is not optimized.
To create a production build, use gatsby build

Gatsby successfully consumed all the content from WPGraphQL without issue.

Forcing Pagination in Gatsby

One theory I had was that there was a bug with paginated requests and Gatsby was having issues with pagination. Since I only have 16 tags and 42 categories, Gatsby would get them all in one request, as Gatsby asks for 100 items by default.

So, I updated my Gatsby Config like so:

  plugins: [{
    resolve: 'gatsby-source-wordpress',
    options: {
      "url": "http://issue2343.local/graphql",
      schema: {
        perPage: 5,
      }
    }
  }

Now, Gatsby will ask for 5 items per request and will make paginated requests to fetch the remaining nodes. Surely, if there's a bug with pagination, this should help reproduce it.

I then ran gatsby clean && gatsby develop

➜  gatsby git:(master) ✗ gatsby clean && gatsby develop
info Deleting .cache, public, /Users/jason.bahl/Sites/issue2343-gatsby/gatsby/node_modules/.cache/babel-loader, /Users/jason.bahl/Sites/issue2343-gatsby/gatsby/node_modules/.cache/terser-webpack-plugin
info Successfully deleted directories

success compile gatsby files - 0.435s
success load gatsby config - 0.034s
success load plugins - 1.149s
success onPreInit - 0.005s
success initialize cache - 0.129s
success copy gatsby files - 0.148s
success Compiling Gatsby Functions - 0.259s
success onPreBootstrap - 0.273s
success  gatsby-source-wordpress  ensuring plugin requirements are met - 2.692s

info  gatsby-source-wordpress 

    This is either your first build or the cache was cleared.
    Please wait while your WordPress data is synced to your Gatsby cache.

    Maybe now's a good time to get up and stretch? :D

success  gatsby-source-wordpress  ingest WPGraphQL schema - 3.762s
success createSchemaCustomization - 6.490s
success  gatsby-source-wordpress  fetch root fields - 0.655s
success  gatsby-source-wordpress  ContentType - 1.346s - fetched 3
success  gatsby-source-wordpress  MenuItem - 1.795s - fetched 0
success  gatsby-source-wordpress  Menu - 1.813s - fetched 0
success  gatsby-source-wordpress  Taxonomy - 3.721s - fetched 3
success  gatsby-source-wordpress  UserRole - 3.746s - fetched 0
success  gatsby-source-wordpress  PostFormat - 5.094s - fetched 9
success  gatsby-source-wordpress  User - 7.137s - fetched 7
success  gatsby-source-wordpress  Tag - 9.145s - fetched 16
success  gatsby-source-wordpress  Page - 9.588s - fetched 16
success  gatsby-source-wordpress  Comment - 12.336s - fetched 31
success  gatsby-source-wordpress  Category - 13.844s - fetched 42
success  gatsby-source-wordpress  Post - 14.821s - fetched 35
success  gatsby-source-wordpress  MediaItem - 18.815s - fetched 40
success  gatsby-source-wordpress  creating nodes - 18.821s
success  gatsby-source-wordpress  fetching nodes - 33.677s - 202 total
success Downloading remote files - 17.654s - 40/40 2.27/s
success Checking for changed pages - 0.001s
success source and transform nodes - 33.795s
success building schema - 0.566s
success createPages - 0.001s
success createPagesStatefully - 0.043s
info Total nodes: 276, SitePage nodes: 4 (use --verbose for breakdown)
success Checking for changed pages - 0.001s
success write out redirect data - 0.002s
success Build manifest and related icons - 0.145s
success onPostBootstrap - 0.165s
info bootstrap finished - 45.796s
success onPreExtractQueries - 0.001s
success extract queries from components - 2.115s
success write out requires - 0.006s
success run page queries - 0.017s - 3/3 174.36/s

You can now view issue-2343 in the browser.

  http://localhost:8000/

View GraphiQL, an in-browser IDE, to explore your site's data and schema

  http://localhost:8000/___graphql

Note that the development build is not optimized.
To create a production build, use gatsby build

success Building development bundle - 13.333s
success Writing page-data.json files to public directory - 0.131s - 3/4 30.45/s

Still Gatsby is getting all the content as expected. I'm not able to reproduce this issue.

I think I will need more concrete steps on how to reproduce this in order to be helpful.

If you're able to identify specific queries in WPGraphQL that aren't working properly, that would be helpful.

This feels like a genuine regression of WPGraphQL 1.8, as you were able to revert to 1.7.2 to get back to a working state, but I'm unable to identify the conflict.

I would highly recommend doing a gatsby clean && gatsby develop with WPGraphQL v1.8 to see if that helps. And if you still have the issue, I'll need more information on the specific queries that are not resolving as expected, and probably more information on the data set, as it could possibly be related to a specific set of data that I don't have.

jasonbahl commented 2 years ago

@pauljavascripting I tried one more time by running the commands:

wp term generate tag --count=200
wp term generate category --count=300

Now, my site has 342 categories and 216 tags.

I ran gatsby clean && gatsby develop once more, and I'm still getting the expected results:

➜  gatsby git:(master) ✗ gatsby clean && gatsby develop
info Deleting .cache, public, /Users/jason.bahl/Sites/issue2343-gatsby/gatsby/node_modules/.cache/babel-loader, /Users/jason.bahl/Sites/issue2343-gatsby/gatsby/node_modules/.cache/terser-webpack-plugin
info Successfully deleted directories

success compile gatsby files - 0.564s
success load gatsby config - 0.040s
success load plugins - 1.505s
success onPreInit - 0.005s
success initialize cache - 0.118s
success copy gatsby files - 0.171s
success Compiling Gatsby Functions - 0.308s
success onPreBootstrap - 0.320s
success  gatsby-source-wordpress  ensuring plugin requirements are met - 2.559s

info  gatsby-source-wordpress 

    This is either your first build or the cache was cleared.
    Please wait while your WordPress data is synced to your Gatsby cache.

    Maybe now's a good time to get up and stretch? :D

success  gatsby-source-wordpress  ingest WPGraphQL schema - 2.834s
success createSchemaCustomization - 5.454s
success  gatsby-source-wordpress  fetch root fields - 0.529s
success  gatsby-source-wordpress  ContentType - 1.224s - fetched 3
success  gatsby-source-wordpress  MenuItem - 1.332s - fetched 0
success  gatsby-source-wordpress  Menu - 1.564s - fetched 0
success  gatsby-source-wordpress  Taxonomy - 3.020s - fetched 3
success  gatsby-source-wordpress  UserRole - 3.126s - fetched 0
success  gatsby-source-wordpress  PostFormat - 4.230s - fetched 9
success  gatsby-source-wordpress  User - 5.428s - fetched 7
success  gatsby-source-wordpress  Page - 7.582s - fetched 16
success  gatsby-source-wordpress  Comment - 10.312s - fetched 31
success  gatsby-source-wordpress  Post - 12.589s - fetched 35
success  gatsby-source-wordpress  Tag - 34.625s - fetched 216
success  gatsby-source-wordpress  Category - 51.829s - fetched 342
success  gatsby-source-wordpress  MediaItem - 17.432s - fetched 40
success  gatsby-source-wordpress  creating nodes - 17.443s
success  gatsby-source-wordpress  fetching nodes - 69.278s - 702 total
success Downloading remote files - 16.320s - 40/40 2.45/s
success Checking for changed pages - 0.001s
success source and transform nodes - 69.398s
success building schema - 0.571s
success createPages - 0.001s
success createPagesStatefully - 0.052s
info Total nodes: 776, SitePage nodes: 4 (use --verbose for breakdown)
success Checking for changed pages - 0.000s
success write out redirect data - 0.002s
success Build manifest and related icons - 0.123s
success onPostBootstrap - 0.126s
info bootstrap finished - 81.686s
success onPreExtractQueries - 0.001s
success extract queries from components - 2.024s
success write out requires - 0.007s
success run page queries - 0.016s - 3/3 191.46/s

You can now view issue-2343 in the browser.

  http://localhost:8000/

View GraphiQL, an in-browser IDE, to explore your site's data and schema

  http://localhost:8000/___graphql

Note that the development build is not optimized.
To create a production build, use gatsby build

success Building development bundle - 13.117s
success Writing page-data.json files to public directory - 0.109s - 3/4 36.62/s

justlevine commented 2 years ago

@jasonbahl the only other things I can think to try are exporting and reimporting those terms with XML (not the CLI), and maybe generating some terms in a taxonomy named category on a CPT, to rule out conflicts between a taxonomy ID and a taxonomy term ID

TylerBarnes commented 2 years ago

In the past this error has usually shown up in gatsby-source-wordpress when a 3rd party WPGraphQL extension breaks pagination. I wonder if on 1.8.0 there's another plugin which needs to be updated to work with 1.8.0 and is breaking pagination for some types. A list of all installed WP plugins could be helpful but also if you can try disabling all plugins except WPGatsby and WPGraphQL to see if the error still shows up, that would be hugely helpful.

tsdexter commented 2 years ago

@jasonbahl I'll try to reproduce again. The problem I was having was that my staging server wasn't able to handle the sourcing (even with low config options). I may have fixed that, so I'll see if I can reproduce there and then get you access.

tsdexter commented 2 years ago

@jasonbahl I think it possibly has to do with overloaded servers queuing/retrying queries (which may explain why it's getting duplicate IDs)... Now that my server can handle the full build I can't seem to reproduce it. Previously, the build either failed due to the duplicate IDs or eventually failed with the "your server might be overloaded" message. Now the builds succeed.

jasonbahl commented 2 years ago

@tsdexter very interesting! Thanks for the follow up. If you run into the issue again, try and record as many steps as possible to reproduce. Would be helpful in determining if there's a legit regression with WPGraphQL, or something else happening, perhaps on the Gatsby side.

tsdexter commented 2 years ago

No problem @jasonbahl, I'm only working in dev for probably the next few months, so if I see it again, I'll post back and get you access

jasonbahl commented 2 years ago

I'm going to close this issue as I am unable to reproduce it. I've tried pretty hard to get into a reproducible state, but I haven't been able to.

If anyone is able to reproduce this, please open a new issue with steps to reproduce and link to this issue for context. 🙏🏻

MikeLawton commented 2 years ago

@pauljavascripting a list of plugins would be a great start. Talking with someone in Slack, we think there might be a conflict with some custom ordering plugins, like: https://wordpress.org/plugins/post-types-order/

I was also having this issue; lots of random missing content in Gatsby GraphQL that was present in WPGraphiQL and GraphQL Playground. Deactivating the Post Types Order plugin solved the issue for me, and now all content is available and the "Duplicate ID's" warnings have disappeared from the terminal logs. 🥳

justlevine commented 2 years ago

Seems here's another case which possibly points to #2294 and gatsby-source-wordpress as the culprit. Still no replication :/

justlevine commented 2 years ago

(from slack):

Well I think I found it :crossed_fingers:

The endCursor did change: the array key it grabs from the decoded base64 was accidentally being converted to an integer, so arrayconnection:{tax_name} became arrayconnection:1 for all tax cursors :man-facepalming:. It got past the tests, since taxonomies don't (currently) test pagination.
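To inspect a cursor like the ones discussed here, it can be base64-decoded. The sketch below uses Node's Buffer and the endCursor and last node id copied from the tags payload earlier in this thread; the helper name decodeBase64 is only for illustration.

```javascript
// Decode a base64 value into its readable form.
function decodeBase64(value) {
  return Buffer.from(value, "base64").toString("utf8");
}

// Values copied from the `tags` payload earlier in this thread.
const endCursor = "YXJyYXljb25uZWN0aW9uOjM0OA==";
const lastNodeId = "dGVybTozNDg=";

const decodedCursor = decodeBase64(endCursor);
const decodedId = decodeBase64(lastNodeId);

console.log(decodedCursor); // arrayconnection:348
console.log(decodedId); // term:348

// The part after the colon is the key the cursor points at.
const cursorKey = decodedCursor.split(":")[1];
console.log(cursorKey === decodedId.split(":")[1]); // true
```

In that earlier payload the cursor key matches the last returned term. If every cursor decoded to the same key, the server would have no way to tell one page boundary from another.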

Was even able to replicate in WPGraphiQL, note both the second and 3rd queries use different endCursors but get the same results:

image

Haven't tried with Gatsby, but TylerB, this would cause that fetch you linked to keep requesting the same set of taxonomies over and over, right? I might not have time to get to a PR tonight, but if no one beats me to it I'll get on it over the weekend. (Should also double check that no other connection resolvers got the same bug from #2294)
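A rough simulation of that failure mode is sketched below. It is not the WPGraphQL resolver or the gatsby-source-wordpress fetch logic: fetchPage is a hypothetical server, and the collapseCursors flag stands in for "every cursor resolves to the same position". The maxRequests guard exists only so the broken case terminates.

```javascript
// Hypothetical data set of 12 terms.
const terms = Array.from({ length: 12 }, (_, i) => ({ id: `term:${i + 1}` }));

// Hypothetical server. When collapseCursors is true, any `after` value
// is treated as offset 1, mimicking cursors that all decode the same.
function fetchPage({ first, after }, collapseCursors) {
  const offset = after == null ? 0 : collapseCursors ? 1 : Number(after);
  const nodes = terms.slice(offset, offset + first);
  const next = offset + nodes.length;
  return {
    nodes,
    pageInfo: { hasNextPage: next < terms.length, endCursor: String(next) },
  };
}

// Client loop: page through the connection and count repeated IDs.
function sync(collapseCursors, maxRequests = 10) {
  const seen = new Set();
  let duplicates = 0;
  let after = null;
  for (let i = 0; i < maxRequests; i++) {
    const page = fetchPage({ first: 5, after }, collapseCursors);
    for (const node of page.nodes) {
      if (seen.has(node.id)) duplicates += 1;
      else seen.add(node.id);
    }
    if (!page.pageInfo.hasNextPage) break;
    after = page.pageInfo.endCursor;
  }
  return { unique: seen.size, duplicates };
}

const healthy = sync(false);
const broken = sync(true);
console.log(healthy); // all 12 terms, no duplicates
console.log(broken); // fewer unique terms, many duplicates
```

In the broken run the client keeps receiving the same slice, which lines up with both symptoms reported in this thread: duplicate-ID warnings and fewer nodes in Gatsby than in WordPress.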

TylerBarnes commented 2 years ago

Yes it would! Good find! 🙌

bashidagha commented 1 year ago

Simple gatsby clean && gatsby develop worked for me!