gatsbyjs / gatsby

The best React-based framework with performance, scalability and security built in.
https://www.gatsbyjs.com
MIT License

Build slows down from 7 minutes to 100+ minutes in contentful + Gatsby implementation #27179

Closed lnilya closed 4 years ago

lnilya commented 4 years ago

Description

We are experiencing extremely slow builds with Gatsby + Contentful on Bitbucket Pipelines. The issue arises sporadically when content is changed in Contentful. Builds usually take 5-8 minutes on Pipelines, but for no apparent reason they suddenly start taking 90+ minutes, and it stays that way until I revert all the changes made up to that point. Reverting makes the build fast again. Republishing the reverted changes yields a quick build as well, so in the end nothing in particular seems to cause the bug. The code does not change between the two builds, and Bitbucket Pipelines runs a clean install every time, including all libraries, yet the build suddenly takes ages.

I have no idea how to debug it.

We also sporadically get warnings about slow queries:

warn Query takes too long:
File path: /Users/artifex/Work/Projects/ln-website/src/templates/ContactPage.tsx
URL path: /kontakt
....

Another oddity: a blank project that I created for testing, which simply pulls in the data from the same Contentful space but does not display any content and executes only one GraphQL query for one page, compiles fine and fairly quickly no matter what. So the problem seems to depend on the combination of code, data and build chain.

Some more info:

This is the output for a build that runs as it should in about 7 minutes:

success createPages - 0.811s
success createPagesStatefully - 0.089s
success updating schema - 0.442s
success onPreExtractQueries - 0.002s
success extract queries from components - 2.127s
success write out redirect data - 0.002s
info [gatsby-plugin-graphql-codegen] definition for queries of schema default-gatsby-schema has been updated at graphql-types.ts
success Build manifest and related icons - 0.287s
success onPostBootstrap - 44.590s
info bootstrap finished - 89.415s
success run static queries - 1.381s - 11/11 7.96/s
success run page queries - 1.664s - 61/61 36.66/s
success write out requires - 0.005s
info Using stage environment
success created cachedVariables.json
success making environment variables available as globals
error [BABEL] Note: The code generator has deoptimised the styling of /opt/atlassian/pipelines/agent/build/graphql-types.ts as it exceeds the max of 500KB.
success Building production JavaScript and CSS bundles - 262.154s
success Rewriting compilation hashes - 0.003s
info Using stage environment
success created cachedVariables.json
success making environment variables available as globals
error [BABEL] Note: The code generator has deoptimised the styling of /opt/atlassian/pipelines/agent/build/graphql-types.ts as it exceeds the max of 500KB.
.
success Building static HTML for pages - 173.369s - 61/61 0.35/s
success Generating image thumbnails - 439.972s - 8/8 0.02/s
success onPostBuild - 0.002s
info Done building in 529.824 sec

This is the build that slows down to 90+ minutes

success createPages - 1.876s
success createPagesStatefully - 0.061s
success updating schema - 0.732s
success onPreExtractQueries - 0.001s
success extract queries from components - 3.152s
success write out redirect data - 0.001s
info [gatsby-plugin-graphql-codegen] definition for queries of schema default-gatsby-schema has been updated at graphql-types.ts
success Build manifest and related icons - 0.401s
success onPostBootstrap - 104.440s
info bootstrap finished - 198.792s
success run static queries - 2.828s - 11/11 3.89/s
success run page queries - 2.908s - 61/61 20.98/s
success write out requires - 0.009s
info Using stage environment
success created cachedVariables.json
success making environment variables available as globals
error [BABEL] Note: The code generator has deoptimised the styling of /opt/atlassian/pipelines/agent/build/graphql-types.ts as it exceeds the max of 500KB.
success Building production JavaScript and CSS bundles - 3334.661s
success Rewriting compilation hashes - 0.007s
info Using stage environment
success created cachedVariables.json
success making environment variables available as globals
error [BABEL] Note: The code generator has deoptimised the styling of /opt/atlassian/pipelines/agent/build/graphql-types.ts as it exceeds the max of 500KB.
.
success Building static HTML for pages - 2236.973s - 61/61 0.03/s
success Generating image thumbnails - 5580.829s - 8/8 0.00/s
success onPostBuild - 0.007s
info Done building in 5780.588 sec

Are there any ways to figure out what might be causing this issue and how to debug the process?
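
One way to start narrowing this down from the two logs above is to compare the per-step timings mechanically. The following is a minimal TypeScript sketch and not part of Gatsby: `parseStepTimings` and `compareBuilds` are hypothetical helper names, and the regular expression assumes the `success <step> - <seconds>s` line format seen in the output above.

```typescript
// Maps a build step name to its duration in seconds.
type StepTimings = Record<string, number>;

// Assumed line format, taken from the logs in this thread, e.g.
// "success Building static HTML for pages - 173.369s - 61/61 0.35/s".
const STEP_LINE = /^success (.+?) - (\d+(?:\.\d+)?)s(?: - .*)?$/;

// Extracts "step name -> seconds" from raw build output.
// Lines without a duration (info, error, "success created ...") are skipped.
function parseStepTimings(log: string): StepTimings {
  const timings: StepTimings = {};
  log.split("\n").forEach((line) => {
    const match = STEP_LINE.exec(line.trim());
    if (match) {
      timings[match[1]] = parseFloat(match[2]);
    }
  });
  return timings;
}

interface Slowdown {
  step: string;
  fast: number;
  slow: number;
  factor: number; // slow / fast
}

// Lists steps present in both logs, sorted by slowdown factor (largest first).
// Steps shorter than minSeconds in the fast build are ignored as noise.
function compareBuilds(fastLog: string, slowLog: string, minSeconds = 1): Slowdown[] {
  const fast = parseStepTimings(fastLog);
  const slow = parseStepTimings(slowLog);
  return Object.keys(slow)
    .filter((step) => fast[step] !== undefined && fast[step] >= minSeconds)
    .map((step) => ({
      step,
      fast: fast[step],
      slow: slow[step],
      factor: slow[step] / fast[step],
    }))
    .sort((a, b) => b.factor - a.factor);
}
```

Applied to the two logs above, the three longest steps (JavaScript and CSS bundles, static HTML, image thumbnails) each come out roughly 12 to 13 times slower, while onPostBootstrap is only about 2.3 times slower. A near-uniform factor like that may point to a resource problem affecting the whole build rather than to one slow step, but that is only a hint, not a diagnosis.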

Steps to reproduce

This is incredibly hard to reproduce and I cannot give any access to our repositories, due to the commercial nature of the project. So hoping to get some help on debugging/figuring out what causes it.

Expected result

Build should run in a few minutes on pipelines.

Actual result

The build takes 90+ minutes and uses insane amounts of memory.

Environment



  System:
    OS: macOS 10.15.6
    CPU: (8) x64 Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz
    Shell: 3.2.57 - /bin/bash
  Binaries:
    Node: 14.5.0 - /usr/local/bin/node
    npm: 6.14.5 - /usr/local/bin/npm
  Languages:
    Python: 2.7.16 - /usr/bin/python
  Browsers:
    Chrome: 85.0.4183.121
    Edge: 85.0.564.63
    Firefox: 73.0.1
    Safari: 13.1.2
  npmPackages:
    gatsby: ^2.24.2 => 2.24.2
    gatsby-background-image: ^1.1.1 => 1.1.1
    gatsby-env-variables: ^2.0.0 => 2.0.0
    gatsby-image: ^2.4.9 => 2.4.13
    gatsby-plugin-anchor-links: ^1.1.1 => 1.1.1
    gatsby-plugin-graphql-codegen: ^2.7.1 => 2.7.1
    gatsby-plugin-manifest: ^2.4.14 => 2.4.18
    gatsby-plugin-offline: ^3.2.18 => 3.2.18
    gatsby-plugin-page-transitions: ^1.0.8 => 1.0.8
    gatsby-plugin-react-helmet: ^3.3.6 => 3.3.10
    gatsby-plugin-sass: ^2.3.12 => 2.3.12
    gatsby-plugin-sharp: ^2.6.21 => 2.6.22
    gatsby-source-contentful: ^2.3.46 => 2.3.46
    gatsby-source-filesystem: ^2.3.14 => 2.3.19
    gatsby-transformer-remark: ^2.8.25 => 2.8.25
    gatsby-transformer-sharp: ^2.5.7 => 2.5.11
  npmGlobalPackages:
    gatsby-cli: 2.12.60
lnilya commented 4 years ago

Sometimes, while building locally, I also get the following error. Maybe this helps somehow?

The "data" argument must be of type string or an instance of Buffer, TypedArray, or DataView. Received undefined

  Error: TypeError [ERR_INVALID_ARG_TYPE]: The "data" argument must be of type string or an instance of Buffer, TypedArray, or DataView. Received undefined

  - promises.js:542 Object.writeFile
    internal/fs/promises.js:542:5

  - extend-node-type.js:74 
    [love-nature-website]/[gatsby-source-contentful]/extend-node-type.js:74:19

  - base64-img.js:81 ClientRequest.<anonymous>
    [love-nature-website]/[base64-img]/base64-img.js:81:21

  - destroy.js:100 emitErrorNT
    internal/streams/destroy.js:100:8

  - destroy.js:68 emitErrorCloseNT
    internal/streams/destroy.js:68:3

  - task_queues.js:80 processTicksAndRejections
    internal/process/task_queues.js:80:21
lnilya commented 4 years ago

In search of a solution, I disabled all base64 encoding by adding the _noBase64 suffix to all GatsbyContentfulFluid image fragments in my queries. The above-mentioned error no longer occurs, but the build still takes about 2 hours.

wardpeet commented 4 years ago

@oneextra Without a repository, there is not much we can do. Make sure you install the latest gatsby. One thing you could try is running gatsby build --verbose and pasting the output here. We're mostly interested in:

info Total nodes: 33, SitePage nodes: 1 (use --verbose for breakdown)
verbose Number of node types: 7. Nodes per type: SitePage: 1, SitePlugin: 25, Site: 1, SiteBuildMetadata: 1, Directory: 1, File: 2,

gatsby-source-contentful@next has some improvements; you can try them by setting the env var EXPERIMENTAL_CONTENTFUL_SKIP_NORMALIZE_IDS to true.

It seems like something in your bundle is causing the slowdown. How big is the generated JS? Could you maybe send the JS and sourcemaps over? You can reach me at ward@gatsbyjs.com

lnilya commented 4 years ago

@wardpeet Thanks for your reply and the offer to help us! I will check with legal how much of the repo I can share. I would love to have this resolved.

> BUILD_ENV=dev gatsby build --verbose

verbose set gatsby_log_level: "verbose"
verbose set gatsby_executing_command: "build"
verbose loading local command from: /Users/artifex/Work/Projects/love-nature-website/node_modules/gatsby/dist/commands/build.js
verbose running command: build
success open and validate gatsby-configs - 0.039s

 ERROR #11329 

Your plugins must export known APIs from their gatsby-browser.js.

See https://www.gatsbyjs.org/docs/browser-apis/ for the list of Gatsby browser APIs.

- The plugin gatsby-plugin-page-transitions@1.0.8 is using the API "replaceHistory" which is not a known API.

success load plugins - 2.064s
success onPreInit - 0.039s
success delete html and css files from previous builds - 0.023s
success initialize cache - 0.005s
success copy gatsby files - 0.077s
success onPreBootstrap - 0.023s
success createSchemaCustomization - 0.010s
Starting to fetch data from Contentful
info Fetching default locale
info default locale is: en-US
info contentTypes fetched 28
info Updated entries 3
info Deleted entries 0
info Updated assets 0
info Deleted assets 0
Fetch Contentful data: 1.960s
verbose Checking for deleted pages
verbose Deleted 0 pages
verbose Found 0 changed pages
success Checking for changed pages - 0.010s
success source and transform nodes - 2.218s
success building schema - 6.316s
info Total nodes: 509, SitePage nodes: 55 (use --verbose for breakdown)
verbose Number of node types: 76. Nodes per type: SitePage: 55, SitePlugin: 35, Site: 1, SiteBuildMetadata: 1,
success createPages - 0.433s
verbose Checking for deleted pages
verbose Deleted 0 pages
verbose Found 54 changed pages
success Checking for changed pages - 0.003s
success createPagesStatefully - 0.061s
success Cleaning up stale page-data - 0.005s
success update schema - 0.154s
success onPreExtractQueries - 0.002s
success extract queries from components - 0.962s
success write out redirect data - 0.002s
info [gatsby-plugin-graphql-codegen] definition for queries of schema default-gatsby-schema has been updated at graphql-types.ts
success Build manifest and related icons - 0.185s
success onPostBootstrap - 25.042s
info bootstrap finished - 42.983s
success run static queries - 0.030s - 2/2 67.25/s
success run page queries - 0.111s - 16/16 144.30/s
success write out requires - 0.005s
info Using dev environment
success created cachedVariables.json
success making environment variables available as globals

 ERROR 

[BABEL] Note: The code generator has deoptimised the styling of /Users/artifex/Work/Projects/love-nature-website/graphql-types.ts as it exceeds the max of 500KB.

success Building production JavaScript and CSS bundles - 39.903s
success Rewriting compilation hashes - 0.003s
info Using dev environment
success created cachedVariables.json
success making environment variables available as globals

 ERROR 

[BABEL] Note: The code generator has deoptimised the styling of /Users/artifex/Work/Projects/love-nature-website/graphql-types.ts as it exceeds the max of 500KB.

[===                         ]   15.393 s 6/56 11% Building static HTML for pages
success Building static HTML for pages - 16.253s - 56/56 3.45/s
success onPostBuild - 0.002s
info Done building in 100.118487461 sec

Here is the output of the build; it works now, and with gatsby serve the page looks good.

After countless hours, I think I have narrowed the error down to this issue: https://github.com/gatsbyjs/gatsby/issues/11364. It seems to be some endless circular reference in gatsby-source-contentful. As far as I understand, there was no official fix for that issue, only the workaround suggested in the ticket?
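
To check whether entries actually reference each other in a circle, a depth-first search over the entry links is one option. This is a minimal sketch, assuming the links have already been extracted into a plain map from entry id to linked entry ids. The `LinkGraph` shape and the `findCycle` name are hypothetical and not part of gatsby-source-contentful or the Contentful SDK.

```typescript
// Hypothetical input shape: entry id -> ids of the entries it links to.
type LinkGraph = Record<string, string[]>;

// Returns one cycle as a list of ids (the first id is repeated at the end),
// or null if the link graph contains no cycle.
function findCycle(graph: LinkGraph): string[] | null {
  const state: Record<string, "visiting" | "done"> = {};
  const path: string[] = []; // ids on the current DFS path

  const visit = (id: string): string[] | null => {
    if (state[id] === "done") {
      return null; // already fully explored, no cycle through here
    }
    if (state[id] === "visiting") {
      // id is already on the current path, so the path from id onward loops.
      return path.slice(path.indexOf(id)).concat(id);
    }
    state[id] = "visiting";
    path.push(id);
    const targets = graph[id] || [];
    for (let i = 0; i < targets.length; i++) {
      const cycle = visit(targets[i]);
      if (cycle) {
        return cycle;
      }
    }
    path.pop();
    state[id] = "done";
    return null;
  };

  const ids = Object.keys(graph);
  for (let i = 0; i < ids.length; i++) {
    const cycle = visit(ids[i]);
    if (cycle) {
      return cycle;
    }
  }
  return null;
}
```

For example, `findCycle({ home: ["contact"], contact: ["home"] })` reports the loop between the two entries, which would then be the first links to remove or restructure.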

However, right now I have a new problem, and it only involves 'gatsby develop'. The development bundle is built and then continuously rebuilt over and over again without any changes to the code. I can't really access the local build or work on it.


...
⠀
  http://localhost:8000/
⠀
View GraphiQL, an in-browser IDE, to explore your site's data and schema
⠀
  http://localhost:8000/___graphql
⠀
Note that the development build is not optimized.
To create a production build, use gatsby build
⠀
success Building development bundle - 43.856s
success onPreExtractQueries - 0.004s
success extract queries from components - 0.165s
success write out requires - 0.004s
info [gatsby-plugin-graphql-codegen] definition for queries of schema default-gatsby-schema has been updated at graphql-types.ts
success Re-building development bundle - 37.893s
success onPreExtractQueries - 0.015s
success extract queries from components - 0.150s
success write out requires - 0.002s
success Re-building development bundle - 0.221s
info [gatsby-plugin-graphql-codegen] definition for queries of schema default-gatsby-schema has been updated at graphql-types.ts
success onPreExtractQueries - 0.015s
success extract queries from components - 0.108s
success write out requires - 0.002s
success Re-building development bundle - 0.200s
info [gatsby-plugin-graphql-codegen] definition for queries of schema default-gatsby-schema has been updated at graphql-types.ts
success onPreExtractQueries - 0.015s
success extract queries from components - 0.139s
success write out requires - 0.002s
success Re-building development bundle - 0.198s
...

I am using the newest gatsby and gatsby-source-contentful versions. I will post updates here and get in touch with you. Thank you again - this issue is really throwing us off track, with a release looming in a few days.

lnilya commented 4 years ago

After further trial and error, it now seems that gatsby-plugin-graphql-codegen is breaking my build and sending it into an endless loop. With the plugin removed, everything works fine and really fast. I keep using the previously generated graphql-types file, so the code still works without errors and should continue to do so unless I change the model in Contentful.

The documentation for gatsby-plugin-graphql-codegen says that putting the graphql-types file into the src folder creates an endless loop during development. That sounds exactly like what I am experiencing, but my graphql-types file is not in src; it is at the root level next to gatsby-config, gatsby-node etc. Is it conceivable that the endless loop also occurs under some other condition?

lnilya commented 4 years ago

Is there maybe a way to configure gatsby-plugin-graphql-codegen so that it only runs once when the build process starts, and not on every change?

wardpeet commented 4 years ago

gatsby-plugin-graphql-codegen is a third-party plugin that hooks into our redux store, which is a private API meant for internal use. There is no proper API for this, so the author had no other choice, but it means we can't disable it on updates from the Gatsby side. You could ask the author to add such a feature.

Any chance you can update gatsby? I think the rebuilding part is fixed by https://github.com/gatsbyjs/gatsby/pull/26940, which is in gatsby@2.24.63

lnilya commented 4 years ago

@wardpeet The endless recompilation occurs with 2.24.66, so unfortunately the update does not resolve it. I am pretty sure, though, that it is caused either by "gatsby-plugin-graphql-codegen" alone or by its combination with Gatsby. The workaround of commenting the plugin out and re-enabling it only when the queries change substantially works alright so far. I will check the gatsby-plugin-graphql-codegen repository! Thanks
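
Instead of commenting the plugin in and out by hand, the plugin list could be built from a flag. This is a minimal sketch under assumptions: `buildPlugins` and `PluginEntry` are hypothetical names, the listed plugins are just a subset of those in the environment output above, and plugins that need options (such as gatsby-source-contentful) are left out for brevity.

```typescript
// Minimal plugin entry type for this sketch: either a plugin name
// or a { resolve, options } object, the two forms gatsby-config accepts.
type PluginEntry = string | { resolve: string; options?: Record<string, unknown> };

// Returns the plugin list, adding gatsby-plugin-graphql-codegen only
// when type generation is explicitly requested.
function buildPlugins(enableCodegen: boolean): PluginEntry[] {
  const plugins: PluginEntry[] = [
    "gatsby-plugin-react-helmet",
    "gatsby-plugin-sass",
    "gatsby-transformer-sharp",
    "gatsby-plugin-sharp",
  ];
  if (enableCodegen) {
    // Only needed when queries or the Contentful model changed and
    // graphql-types.ts has to be regenerated.
    plugins.push("gatsby-plugin-graphql-codegen");
  }
  return plugins;
}
```

In gatsby-config this could be wired up as `plugins: buildPlugins(process.env.GENERATE_GRAPHQL_TYPES === "true")`, where GENERATE_GRAPHQL_TYPES is a made-up variable name. Regular builds would then skip the plugin and reuse the committed graphql-types file.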

lnilya commented 4 years ago

TL;DR if anyone ever comes across this:

  1. Remove links in your rich text fields.
  2. Add _noBase64 to all your image fragments.
  3. If you are using "@contentful/gatsby-transformer-contentful-richtext", don't. Just use "documentToReactComponents" instead and only pull the JSON from your rich text fields. => At least a 30% build performance increase for us.
  4. Comment out "gatsby-plugin-graphql-codegen" if you are using it, and re-enable it only when your queries change. This also makes the build about 30% faster.

Once the build runs again, add your images back and slowly start adding the links back in. I am not 100% sure, but the link issue might have been related to gatsby-plugin-graphql-codegen as well.

=> Make frequent backups of Contentful via their CLI:

# to pull out your data
contentful space export --space-id xxxxxxxxxx

# to push it back in
contentful space import --space-id xxxxxxxxxx --environment-id xxx --content-file xxxxxx.json

Thanks for all the support here!