gatsbyjs / gatsby

The best React-based framework with performance, scalability and security built in.
https://www.gatsbyjs.com
MIT License

Fetch remote files sometimes has errors removing partial downloads #34297

Closed nrandell closed 2 years ago

nrandell commented 2 years ago

Description

Sometimes, when downloading lots of remote files over a slower link, I get some weird connection errors. All the retry logic in fetch remote files appears to work correctly, but sometimes on Windows there is an issue removing the temporary file (removeSync). I think this is because the connection failed before the file was created.

When this happens, the build fails with errors like those in issue #16353.

I've worked around it by wrapping removeSync in a try/catch in my local copy of node_modules/gatsby-core-utils/dist/fetch-remote-file.js.
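A minimal sketch of that kind of workaround, for illustration only. It is not the actual gatsby-core-utils code: the helper name `removeQuietly` is hypothetical, and it uses Node's built-in `fs.rmSync` rather than the `removeSync` call mentioned above.

```javascript
const fs = require("node:fs");

// Hypothetical helper (not part of gatsby-core-utils): best-effort removal
// of a temporary download that may never have been created.
function removeQuietly(filePath) {
  try {
    // force: true turns a missing file into a no-op instead of an error.
    fs.rmSync(filePath, { force: true });
    return true;
  } catch (err) {
    // For example EPERM on Windows: log it and carry on so that the
    // surrounding retry logic can continue instead of failing the build.
    console.warn(`Could not remove ${filePath}: ${err.code || err.message}`);
    return false;
  }
}
```

The point of returning a boolean instead of throwing is that a failed cleanup of a partial download becomes a warning rather than a fatal error.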

Reproduction Link

https://github.com/bond-london

Steps to Reproduce

It's hard to reproduce, as I believe it's down to network errors happening so quickly that the temporary files have not been created yet.

Expected Result

The build should run ok.

Actual Result

The build fails with errors like

warn Error removing file Error: EPERM: operation not permitted, lstat '\\?\D:\source\customers\xxxx\.cache\caches\gatsby-source-filesystem\tmp-4f2784670a7cbeecba4f08b17195d13f.jpg'

Environment

System:
    OS: Windows 10 10.0.22000
    CPU: (16) x64 AMD Ryzen 7 PRO 4750G with Radeon Graphics
  Binaries:
    Node: 16.11.1 - ~\AppData\Local\Temp\yarn--1639999741494-0.4989217942192161\node.CMD
    Yarn: 1.22.11 - ~\AppData\Local\Temp\yarn--1639999741494-0.4989217942192161\yarn.CMD
    npm: 8.0.0 - C:\Program Files\nodejs\npm.CMD
  Languages:
    Python: 3.7.9
  Browsers:
    Edge: Spartan (44.22000.120.0), Chromium (96.0.1054.53), ChromiumDev (User home = 'C:\Users\nickr\AppData\Local\IISExpress'
IIS USER HOME configured)
  npmPackages:
    gatsby: ^4.4.0 => 4.4.0 
    gatsby-plugin-eslint: ^4.0.2 => 4.0.2 
    gatsby-plugin-gatsby-cloud: ^4.4.0 => 4.4.0 
    gatsby-plugin-image: ^2.4.0 => 2.4.0 
    gatsby-plugin-loadable-components-ssr: 4.1.1 => 4.1.1 
    gatsby-plugin-manifest: ^4.4.0 => 4.4.0 
    gatsby-plugin-page-creator: ^4.4.0 => 4.4.0 
    gatsby-plugin-perf-budgets: ^0.0.18 => 0.0.18 
    gatsby-plugin-postcss: ^5.4.0 => 5.4.0 
    gatsby-plugin-react-helmet: ^5.4.0 => 5.4.0 
    gatsby-plugin-robots-txt: ^1.6.14 => 1.6.14 
    gatsby-plugin-sharp: ^4.4.0 => 4.4.0 
    gatsby-plugin-sitemap: ^5.4.0 => 5.4.0 
    gatsby-plugin-ts-checker: ^1.1.0 => 1.1.0 
    gatsby-plugin-webpack-bundle-analyser-v2: ^1.1.26 => 1.1.26 
    gatsby-source-filesystem: ^4.4.0 => 4.4.0 
    gatsby-transformer-json: ^4.4.0 => 4.4.0 
    gatsby-transformer-sharp: ^4.4.0 => 4.4.0 
  npmGlobalPackages:
    @bond-london/gatsby-plugin-generate-typings: 2.0.0
    gatsby-source-graphcms: 2.0.0
    @bond-london/gatsby-transformer-extracted-svg: 2.0.1

Config Flags

FAST_DEV=true PRESERVE_FILE_DOWNLOAD_CACHE=true

LekoArts commented 2 years ago

Hi!

Sorry to hear you're running into an issue. To help us best begin debugging the underlying cause, it is incredibly helpful if you're able to create a minimal reproduction. This is a simplified example of the issue that makes it clear and obvious what the issue is and how we can begin to debug it.

If you're up for it, we'd very much appreciate if you could provide a minimal reproduction and we'll be able to take another look.

Thanks for using Gatsby! 💜

github-actions[bot] commented 2 years ago

Hiya!

This issue has gone quiet. Spooky quiet. 👻

We get a lot of issues, so we currently close issues after 60 days of inactivity. It's been at least 20 days since the last update here. If we missed this issue or if you want to keep it open, please reply here. As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!

Thanks for being a part of the Gatsby community! 💪💜

axe312ger commented 2 years ago

I have two projects that can reproduce this issue. I cannot give public access, but I can give access to individual members of the Gatsby core team. Please contact me directly.

Both projects run on Gatsby v4, with the new gatsby-source-contentful using the remote file fetching from gatsby-core-utils.

They are very flaky: builds succeed only sometimes. It happens on my dev machine and on CI. It seems to happen more often on CI than on my powerful dev machine, but it can still happen locally. This means it is also happening on macOS and Linux.

It always goes this way: some image was downloaded but not moved from the temporary location to its final location. Even on successful builds I end up with cache folders that still contain temporary files, when there should be none left. On every build a different image is the problem. Content and code did not change between the builds.

This is the directory content of .cache/caches/gatsby-source-contentful where the source plugin downloads the images.

Let's build the project.

Run #1

...
├── fd8b137b96d4e488d847076a385ed69c-7
├── fda321a7157422000baa30ad56592f7e-4
├── fdf3e12432adfe189a587660e326622a-4
├── fe6440cced17f446b1566a91f571d2a8-7
├── fe6b87e0a9ae8be2331dff41766a62b5-1
├── feaa557ff4f18e36ad09e956b7a38b3e-4
├── feb0c4775eee40cf1d45f0c14f7d5c77-4
├── ff01af5bd9f7eefef6174e05c7bec242-4
├── ff2a90190344d9e919d334f2bc1a3bed-1
├── ff7d2c213c68d4c2ac18dfca584b3020-4
├── ff9d325b1e75090300aefdac2d0b53c8-7
├── ffb333cf1a355798bc9c4b4a91e4cbcb-7
├── ffc29f4e8c34d2f8b3bf1954cac23a46-7
├── tmp-02eb007e3fe507f4d3470bdfa1f0457c-2.jpg
├── tmp-02eb007e3fe507f4d3470bdfa1f0457c-7.jpg
├── tmp-03b6268501cb6d87223585647c33a1d8-2.jpg
├── tmp-08240a8f6a24b19f0b76ab6841b9deda-2.jpg
├── tmp-08240a8f6a24b19f0b76ab6841b9deda-7.jpg
├── tmp-0e7aad5825f9300c9fd24dbffb561b67-7.jpg
...
├── tmp-fae3b61784e001f20f37e1ad8c6f1f61-2.jpg
├── tmp-fae3b61784e001f20f37e1ad8c6f1f61-7.jpg
├── tmp-fbca47a485071b1eb20af6eca2ce9477-2.jpg
└── tmp-fbca47a485071b1eb20af6eca2ce9477-7.jpg

1193 directories, 129 files

The build failed. 129 files never got to their destination folder. The error message was something like this, which I copied from another CI run that has the same issue:

error There was an error in your GraphQL query:

ENOENT: no such file or directory, open '/home/docker/actions-runner/_work/xxxx/xxxx/.cache/caches/gatsby-source-contentful/00a7cffc992832904543c5196417b227-5/xxxxxxx.jpg'

When I look for the image in the directory, I can find it as tmp-xxxxx.jpg in the plugin cache root dir. For whatever reason it was not moved, even though the file was downloaded properly. The download function must even have returned already, otherwise the query would not have been executed yet, AFAIK.
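For anyone who wants to check their own cache for the same symptom, here is a small sketch that lists leftover temporary files. The function name `findLeftoverTempFiles` is made up for this example, and it only assumes what the listings here show: leftovers sit directly in the plugin cache root and their names start with `tmp-`.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// List files directly inside `cacheDir` whose names start with "tmp-".
// Subdirectories (the per-asset folders) are ignored.
function findLeftoverTempFiles(cacheDir) {
  return fs
    .readdirSync(cacheDir, { withFileTypes: true })
    .filter((entry) => entry.isFile() && entry.name.startsWith("tmp-"))
    .map((entry) => path.join(cacheDir, entry.name))
    .sort();
}

// Example usage, with the directory mentioned in this comment:
// const leftovers = findLeftoverTempFiles(".cache/caches/gatsby-source-contentful");
// console.log(`${leftovers.length} temporary files were never moved`);
```

Running it after each build gives a quick count to compare between runs without reading through the whole directory tree.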

Alright, run #2

....
├── tmp-d86ce348b9bde0d2d997fdffeb13c701-4.jpg
├── tmp-db46a381427f1fff736aca5de71d6bb9-4.jpg
├── tmp-dc6b65a58a97e33d93edb1163b2d0f4c-2.jpg
├── tmp-e29efe13233b4c2a746d9d5ce614a24e-4.jpg
├── tmp-ed8cfa0a5ac0131454a8acdae08dc328-4.jpg
├── tmp-f96c5a124c52a617f45479aa0ab06497-4.jpg
├── tmp-fb2c79ab396c33ef30476e3b089539fd-4.jpg
├── tmp-fbffd530d409e4951252a0808272c8bc-4.jpg
├── tmp-fe20c5115f534f2ed70113a4d6eb31f0-4.jpg
└── tmp-fe20c5115f534f2ed70113a4d6eb31f0-6.jpg

1635 directories, 72 files

YAY! It downloaded more files properly... but not enough. The warm build is stuck forever. So let's run gatsby clean and...

Try #3

....
├── tmp-c974372342819b38aee0c37ba4450f47-4.jpg
├── tmp-ca0e9b035ec37b76d17e6cb12c1cc8cb-6.jpg
├── tmp-cd852e43a9bdb3ef6b6d53156cc43425-6.jpg
├── tmp-d0d0801e086a5844ed5060bef9f6a7a4-6.jpg
├── tmp-db13684f2c6b3936503f8831d42dfbe7-6.jpg
├── tmp-dc6b65a58a97e33d93edb1163b2d0f4c-3.jpg
├── tmp-def971e3c257d12f503cbae2d9df056d-4.jpg
├── tmp-e95fa3ec4b954862f2633d0d69420880-4.jpg
├── tmp-eb4bea91df05f9b6f8b422c6e5fbecea-4.jpg
├── tmp-ed39077ac88668c34b3a0638590216ad-6.jpg
├── tmp-ef4dd5fef9e9e767d2384ea487723140-4.jpg
├── tmp-efa3f7001745000f163df90873f5af3a-4.jpg
├── tmp-f73a4514b309b4f714b1ce2bb9e5b5fc-4.jpg
└── tmp-fa346eb449ade26e8618b33ba67fd2f6-6.jpg

1129 directories, 86 files

Different problematic images, a different number of files. Same problem in the end: the build fails because an image does not exist.

Try #4. We finally get a successful build:

....
├── fefe30edd35cff3e3463c6e6e918baa9-1
├── ff05c2140f5c0367522c67ef82b4d0b9-7
├── ff5d5cc8cd85aa34b723b14ef9490b54-4
├── ffce7f08a705e67d62c69da84678cfd1-2
├── ffe267bbf904af3d2ff68190c6fbc59d-3
├── ffe9ffd76dd38d04e68d4d90acc4a314-2
├── fff333cf1bd8e37cb6659fe97288de94-2
├── fff433bb85e3ff0114122377a3eae578-2
├── tmp-1878a94c283adbeb1d479f34f6fb102c-6.jpg
├── tmp-34bf09d7f2b593a3c02f1625ab145c2b-2.jpg
└── tmp-b68e22237ee31ee0fa49024454e905e4-2.jpg

1980 directories, 3 files

This was a successful build. The built website looks fine at first glance.

I can only hope that nobody will notice that 3 images lack their placeholders... Warm builds now also work, for some reason.


As you can see, we have a very unpredictable situation here. I am having trouble convincing my clients that the build problems will be gone soon; that is what I have been saying for months, ever since gatsby-source-contentful switched to gatsby-core-utils.

Please help, I really don't want to move gatsby-source-contentful back to axios with axios-retry. That was ugly to do... but it was reliable :sadface:

axe312ger commented 2 years ago

My problems got solved with gatsby@4.10. What about you?

nrandell commented 2 years ago

I updated our graphcms source plugin to cache all the remote assets locally in another directory. This removed the majority of my problems plus it had the added bonus of massively speeding up builds!

axe312ger commented 2 years ago

@nrandell alright, that's great. How did you do that?

I'd suggest you check which version(s) of gatsby-core-utils you have installed. There have been a lot of updates to it recently and you might have dependency duplication.
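One way to check for duplicated copies is to walk node_modules and collect every installed version of the package. The sketch below is illustrative: `findInstalledCopies` is a hypothetical helper, and it assumes a plain nested node_modules layout (symlinked packages, as used by some workspace setups, are skipped).

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Recursively collect every installed copy of `pkgName` under `projectDir`.
function findInstalledCopies(projectDir, pkgName) {
  const copies = [];

  const visitPackage = (pkgDir) => {
    const manifest = path.join(pkgDir, "package.json");
    if (fs.existsSync(manifest)) {
      const { name, version } = JSON.parse(fs.readFileSync(manifest, "utf8"));
      if (name === pkgName) copies.push({ version, dir: pkgDir });
    }
    // A package may carry its own nested node_modules with other versions.
    visitNodeModules(path.join(pkgDir, "node_modules"));
  };

  const visitNodeModules = (nodeModulesDir) => {
    if (!fs.existsSync(nodeModulesDir)) return;
    for (const entry of fs.readdirSync(nodeModulesDir, { withFileTypes: true })) {
      if (!entry.isDirectory() || entry.name.startsWith(".")) continue;
      const dir = path.join(nodeModulesDir, entry.name);
      if (entry.name.startsWith("@")) {
        // Scoped packages live one level deeper: node_modules/@scope/name
        for (const scoped of fs.readdirSync(dir, { withFileTypes: true })) {
          if (scoped.isDirectory()) visitPackage(path.join(dir, scoped.name));
        }
      } else {
        visitPackage(dir);
      }
    }
  };

  visitNodeModules(path.join(projectDir, "node_modules"));
  return copies;
}

// Example usage:
// console.log(findInstalledCopies(process.cwd(), "gatsby-core-utils"));
```

More than one entry in the result means several copies are installed. The package managers can report the same thing directly with `npm ls gatsby-core-utils` or `yarn why gatsby-core-utils`.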

nrandell commented 2 years ago

@axe312ger - that was pretty simple. It's all in https://github.com/bond-london/simple-gatsby-source-graphcms/blob/main/src/cacheGraphCmsAsset.ts - but basically I call createRemoteFileNode and then copy the file into my local cache folder when that completes. Next time I look for the file, I just copy from my cached version if it's there. It's similar to how the wordpress plugin works.
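The copy-after-download idea can be sketched roughly as follows. This is not the code from the linked file: `fetchWithLocalCache` and its parameters are hypothetical, and `download` stands in for whatever actually fetches the asset (for example a wrapper around createRemoteFileNode) and writes it to the given path.

```javascript
const fs = require("node:fs/promises");
const path = require("node:path");
const crypto = require("node:crypto");

// Hypothetical local asset cache keyed by a hash of the URL.
// `download(url, destPath)` must write the asset to `destPath`.
async function fetchWithLocalCache(url, targetPath, cacheDir, download) {
  const key = crypto.createHash("sha256").update(url).digest("hex");
  const cachedPath = path.join(cacheDir, key + path.extname(targetPath));
  await fs.mkdir(cacheDir, { recursive: true });
  await fs.mkdir(path.dirname(targetPath), { recursive: true });

  try {
    // Cache hit: copy the stored file and skip the network entirely.
    await fs.copyFile(cachedPath, targetPath);
    return "cache";
  } catch (err) {
    if (err.code !== "ENOENT") throw err; // only "not cached yet" is expected
  }

  await download(url, targetPath);
  // Copy to a partial name first, then rename, so an interrupted copy
  // never leaves a truncated file under the final cache key.
  const partialPath = `${cachedPath}.partial`;
  await fs.copyFile(targetPath, partialPath);
  await fs.rename(partialPath, cachedPath);
  return "network";
}
```

Keeping the cache in a directory outside .cache means it is not wiped together with the Gatsby cache, which is presumably where the build speed-up comes from.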

I keep this as up to date as possible as we use this plugin for most of our projects!

wardpeet commented 2 years ago

This should be resolved :) Closing