gatsbyjs / gatsby

The best React-based framework with performance, scalability and security built in.
https://www.gatsbyjs.com
MIT License

GitLab CI: which directories can safely be cached? #20285

Closed willem-aart closed 4 years ago

willem-aart commented 4 years ago

Some background

Right now, our GitLab CI pipeline takes ~10 minutes to complete. The generation of thumbnails seems to be a major factor:

[..]
success Building production JavaScript and CSS bundles - 58.614s
success Rewriting compilation hashes - 0.001s
success run queries - 77.167s - 42/42 0.54/s
success Generating image thumbnails - 232.075s - 760/760 3.27/s
success Building static HTML for pages - 15.758s - 36/36 2.28/s
info Done building in 272.523996685 sec
[..]

Build times drop drastically (from roughly 272 s to roughly 56 s) when .cache and public are cached in .gitlab-ci.yml:

[..]
success Building production JavaScript and CSS bundles - 29.286s
success run queries - 29.868s - 2/2 0.07/s
success Building static HTML for pages - 7.725s - 36/36 4.66/s
info Done building in 56.332283976 sec
[..]

Question

In all Gatsby-related .gitlab-ci.yml examples I've seen so far, only node_modules/ is cached. This leaves me wondering whether it's safe to cache .cache and public in CI. If it's safe to do so, why doesn't this example list those folders?

Long story short: can both .cache and public safely be cached in a GitLab CI environment?

jonniebigodes commented 4 years ago

@willem-aart In normal cases those two folders shouldn't be added to either a repo or a CI environment. But there are some special cases, like the one you're experiencing, where adding them will improve build times: basically, an image-heavy site that uses gatsby-image and the accompanying plugins. Without knowing more about the content you're using, I'd say you could do it. But be on the lookout for any stale data popping up.

willem-aart commented 4 years ago

Thanks @jonniebigodes!

"In normal cases those two folders shouldn't be added to either a repo or a CI environment. [..] But be on the lookout for any stale data popping up."

Are there any known circumstances under which problems can be expected?

jonniebigodes commented 4 years ago

A while ago I encountered an issue with some stale images and one page that shouldn't have existed because its source content was no longer present. I don't know if that's still the case; I honestly haven't had the time to test it thoroughly. Someone more knowledgeable can probably give further insight on this.
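
If .cache and public are kept between pipeline runs, one way to recover from stale output like this is an opt-in clean build. A minimal sketch, assuming an npm-based project: the job name and the CLEAN_GATSBY_CACHE variable are hypothetical, while gatsby clean is the Gatsby CLI command that deletes the .cache and public directories.

```yaml
# Sketch only: job name and CLEAN_GATSBY_CACHE are hypothetical.
build:
  script:
    # npm install (not npm ci) so an existing node_modules/ is reused.
    - npm install
    # Opt-in escape hatch: wipe .cache and public before building
    # when the variable is set for this pipeline run.
    - |
      if [ "$CLEAN_GATSBY_CACHE" = "true" ]; then
        npx gatsby clean
      fi
    - npx gatsby build
```

Setting CLEAN_GATSBY_CACHE to "true" when starting a pipeline manually would then force a full rebuild for that run only.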

sidharthachatterjee commented 4 years ago

"Long story short: can both .cache and public safely be cached in a GitLab CI environment?"

@willem-aart You should be okay with caching both. As @jonniebigodes correctly mentioned, though, keeping public around might leave some stale files behind, for example a generated image thumbnail for a deleted image.

.cache, however, is perfectly okay to cache without worry and is in fact preferred.

"If it's safe to do so, why doesn't this example list those folders?"

Excellent question. Happy to receive a PR for this!
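
As a starting point, the advice above could translate into a .gitlab-ci.yml along these lines. This is only a sketch, not the official example: the image tag, job name, and npm commands are assumptions. Both directories are listed in the same cache entry because Gatsby discards .cache when it finds that public is missing.

```yaml
# Sketch only: image tag, job name, and npm commands are assumptions.
image: node:lts

build:
  stage: build
  cache:
    # One cache per branch, so branches do not share build output.
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - node_modules/
      # Cache .cache and public together; Gatsby clears .cache
      # if public is not present alongside it.
      - .cache/
      - public/
  script:
    # npm ci would delete the cached node_modules/, so use npm install.
    - npm install
    - npx gatsby build
```

The per-branch key is a design choice rather than a requirement; a single shared key would warm the cache faster, at the cost of branches reusing each other's build output.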

jooola commented 3 years ago

@sidharthachatterjee Hello, there seems to be an inconsistency about whether to cache the public and/or .cache folders. In my GitLab CI build logs I see the following:

[...]
info We've detected that the Gatsby cache is incomplete (the .cache directory exists
but the public directory does not). As a precaution, we're deleting your site's
cache to ensure there's no stale data.
success initialize cache - 0.063s
success copy gatsby files - 0.026s
success onPreBootstrap - 0.010s
success  gatsby-source-wordpress  ensuring plugin requirements are met - 0.730s
⠀
info  gatsby-source-wordpress  
    This is either your first build or the cache was cleared.
    Please wait while your WordPress data is synced to your Gatsby cache.
    Maybe now's a good time to get up and stretch? :D
[...]

So either the previous PR #22501 is missing a folder, or the deletion of the .cache folder is not correct.

What should we do?

EDIT: Never mind, I didn't look at the full PR changes; the last commit did add the public folder.