parcel-bundler / parcel

The zero configuration build tool for the web. 📦🚀
https://parceljs.org
MIT License

Improving Caching on GitHub Actions for Unchanged Assets between Commits #8781

Open navidemad opened 1 year ago

navidemad commented 1 year ago

Thank you for your open-source build tool. It is very fast and efficient.

I am currently trying to optimize my GitHub action by implementing caching.

🐞 Problem

Even though my stylesheet assets remain unchanged between two commits, the Parcel build step takes around 41210ms on CI. However, when I run yarn build:css multiple times on my local computer with the stylesheet assets unchanged, it takes only 415ms each time.

Source: https://parceljs.org/features/development/#caching

Parcel caches everything it builds to disk. If you restart the dev server, Parcel will only rebuild files that have changed since the last time it ran. Parcel automatically tracks all of the files, configuration, plugins, and dev dependencies that are involved in your build, and granularly invalidates the cache when something changes. For example, if you change a configuration file, all of the source files that rely on that configuration will be rebuilt. By default, the cache is stored in the .parcel-cache folder inside your project.

These threads seem to describe a similar issue, in case they are helpful: https://github.com/parcel-bundler/parcel/discussions/5068 https://github.com/parcel-bundler/parcel/issues/5927#issuecomment-1250395079

Please let me know if you require any additional information.

🔦 Context

Ruby on Rails 7.0.4.1 with cssbundling-rails and @parcel/core 2.8.3

import { Parcel } from "@parcel/core";
import glob from "glob";
import path from "path";

const __dirname = path.resolve();

let bundler = new Parcel({
  // Build every top-level stylesheet as its own entry.
  entries: glob.sync(path.resolve(__dirname, `app/assets/stylesheets/*.scss`)),
  defaultConfig: "@parcel/config-default",
  defaultTargetOptions: {
    distDir: path.resolve(__dirname, "app/assets/builds"),
    // The Parcel 2 API names these target options `sourceMaps` and `shouldOptimize`.
    sourceMaps: true,
    shouldOptimize: true,
  },
  mode: "production",
  env: {
    NODE_ENV: "production",
  },
});

try {
  let { buildTime } = await bundler.run();
  console.log(`compilation css: ${buildTime}ms`);
} catch (err) {
  console.error(`compilation css:`, err.diagnostics);
}

- name: 📂 Cache /.parcel-cache
  uses: actions/cache@v3
  with:
    path: |
      .parcel-cache
    key: ${{ runner.os }}-parcel-cache-${{ github.event.pull_request.head.ref || github.ref }}-${{ github.event.pull_request.head.sha || github.sha }}
    restore-keys: |
      ${{ runner.os }}-parcel-cache-${{ github.event.pull_request.head.ref || github.ref }}-
      ${{ runner.os }}-parcel-cache-

- name: 📂 Cache /app/assets/builds
  uses: actions/cache@v3
  with:
    path: |
      app/assets/builds
    key: ${{ runner.os }}-app-assets-builds-${{ github.event.pull_request.head.ref || github.ref }}-${{ github.event.pull_request.head.sha || github.sha }}
    restore-keys: |
      ${{ runner.os }}-app-assets-builds-${{ github.event.pull_request.head.ref || github.ref }}-
      ${{ runner.os }}-app-assets-builds-

- name: Precompile assets with parcel
  shell: bash
  run: bin/yarn build:css

Which outputs on the CI:

  bin/yarn build:css
  shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
  env:
    RAILS_ENV: test
    NODE_ENV: test
    RAKE_ENV: test
    CI: true  
compilation css
compilation css: 41210ms

The cache is correctly created and restored:

Post 📂 Cache /app/assets/builds
Post job cleanup.
/usr/bin/tar --posix -cf cache.tzst --exclude cache.tzst -P -C /home/runner/work/foo-bar/foo-bar --files-from manifest.txt --use-compress-program zstdmt
Cache Size: ~10 MB (9976485 B)
Cache saved successfully
Cache saved with key: Linux-app-assets-builds-enhancements-system-tests-assets-40cf64044436c0198afd4a4c121a17e0d514fb55

Post 📂 Cache /.parcel-cache
Post job cleanup.
/usr/bin/tar --posix -cf cache.tzst --exclude cache.tzst -P -C /home/runner/work/foo-bar/foo-bar --files-from manifest.txt --use-compress-program zstdmt
Cache Size: ~28 MB (29308801 B)
Cache saved successfully
Cache saved with key: Linux-parcel-cache-enhancements-system-tests-assets-40cf64044436c0198afd4a4c121a17e0d514fb55

Let me know if you need more information.

mischnic commented 1 year ago

The cache doesn't work effectively on CI yet, because from Parcel's perspective, every file has a modification time of the last checkout (i.e. a few seconds ago) which doesn't match the mtime that's listed in the cache. So every asset is treated as changed.
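To illustrate the mismatch described above, here is a minimal Node.js sketch. It is not Parcel's cache code: the `fingerprint` helper and the file name are hypothetical, and the "checkout" is simulated by rewriting identical bytes with a newer timestamp. It is written as CommonJS so it runs with plain `node`.

```javascript
// Sketch only: NOT Parcel's implementation, just an illustration of why an
// mtime comparison reports "changed" after a fresh checkout.
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");
const crypto = require("node:crypto");

// Hypothetical helper: two fingerprints a cache could store for a file.
function fingerprint(file) {
  return {
    mtimeMs: fs.statSync(file).mtimeMs,
    sha1: crypto.createHash("sha1").update(fs.readFileSync(file)).digest("hex"),
  };
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "mtime-demo-"));
const file = path.join(dir, "app.scss");

// "Previous CI run": the file exists and its fingerprint goes into the cache.
fs.writeFileSync(file, "body { color: red; }\n");
const previousRun = new Date("2023-01-01T00:00:00Z");
fs.utimesSync(file, previousRun, previousRun);
const cached = fingerprint(file);

// "Next CI run": a fresh checkout writes the same bytes with a new mtime.
fs.writeFileSync(file, "body { color: red; }\n");
const freshCheckout = new Date("2023-01-02T00:00:00Z");
fs.utimesSync(file, freshCheckout, freshCheckout);
const current = fingerprint(file);

const changedByMtime = cached.mtimeMs !== current.mtimeMs;
const changedByContent = cached.sha1 !== current.sha1;
console.log({ changedByMtime, changedByContent });
```

The mtime comparison flags the file as changed while the content comparison does not, which matches the behaviour reported on CI.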

navidemad commented 1 year ago

Thank you for your reply.

We could use third-party tools to restore each file's mtime to the time of the most recent commit that changed it. Is that something that could make the cache more effective on CI?

Or perhaps Parcel could store an MD5 signature of each file instead of looking at the mtime? But I guess that would cause some performance issues.
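A minimal sketch of the mtime-restoring idea, assuming `git` is on the PATH and the script runs inside the checked-out repository. The helper names `gitCommitTime` and `restoreMtimes` are hypothetical, and nothing in this thread confirms that doing this is enough to make Parcel's cache hit on CI.

```javascript
const fs = require("node:fs");
const { execFileSync } = require("node:child_process");

// Returns the Unix time (seconds) of the last commit that touched `file`,
// or null when git has no history for it (for example, an untracked file).
function gitCommitTime(file) {
  const out = execFileSync("git", ["log", "-1", "--format=%ct", "--", file], {
    encoding: "utf8",
  }).trim();
  return out === "" ? null : Number(out);
}

// Sets atime and mtime of each file to its last commit time.
// `getTime` is injectable so the logic can be exercised without a git repo.
function restoreMtimes(files, getTime = gitCommitTime) {
  let restored = 0;
  for (const file of files) {
    const seconds = getTime(file);
    if (seconds === null) continue; // leave files without history untouched
    fs.utimesSync(file, seconds, seconds); // numbers are epoch seconds
    restored++;
  }
  return restored;
}

// Usage (hypothetical file name):
// restoreMtimes(["app/assets/stylesheets/application.scss"]);
```

On a shallow clone every file appears to have been last changed by the single fetched commit, so this would need the full history to be useful (with `actions/checkout`, that means `fetch-depth: 0`).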

mischnic commented 1 year ago

We could use third-party tools to restore each file's mtime to the time of the most recent commit that changed it. Is that something that could make the cache more effective on CI?

Yes, that is how I would imagine it working. Though Parcel itself should handle this, so that you don't have to actually set the mtime for each file: where Parcel currently reads the mtime, it should instead use the commit time for unchanged files and the mtime for files with uncommitted changes. This is also something that Facebook's watchman supports (and thus probably more of their tooling does too).

The other problem is node_modules, where you can't just take the mtime from the last commit.

So we definitely want to support this, but there are some hurdles and nobody had enough time yet to build it.
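The rule described above (commit time for unchanged files, mtime for uncommitted changes) could be sketched as a small pure function. This is only an illustration of the idea: `effectiveTimestamp` and its parameter names are hypothetical and do not exist in Parcel.

```javascript
// Hypothetical decision rule: which timestamp should a cache compare against?
function effectiveTimestamp({ mtimeMs, commitTimeMs, hasUncommittedChanges }) {
  // Files with local edits, or files git knows nothing about:
  // the filesystem mtime is the only available signal.
  if (hasUncommittedChanges || commitTimeMs == null) return mtimeMs;
  // Clean, tracked files: the commit time stays the same across checkouts.
  return commitTimeMs;
}

// A clean file freshly checked out on CI keeps its commit time...
const cleanFile = effectiveTimestamp({
  mtimeMs: 2000,
  commitTimeMs: 1000,
  hasUncommittedChanges: false,
});

// ...while a locally edited file falls back to its mtime.
const editedFile = effectiveTimestamp({
  mtimeMs: 2000,
  commitTimeMs: 1000,
  hasUncommittedChanges: true,
});

console.log({ cleanFile, editedFile });
```

Files under node_modules are not covered by this rule, for the reason given above: they have no commit time to fall back on.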

navidemad commented 1 year ago

Alright 👍 Thanks for the update, @mischnic. I'm not qualified enough to make this PR, but let me know if you need anything tested.

jahudka commented 1 year ago

This is something I'm interested in as well, but not just in a CI context. It would be amazing if this could be configurable to cover multiple use cases. For example, I have a use case where I don't have a local Git repo, there are a lot of images which get processed by Sharp, and the project is sometimes moved to a different root path, so mtimes and full resolved absolute paths are problematic for caching. It'd be great if I could opt in to caching based on local asset paths (if that's not already the current behaviour, which I don't know) and to modification checks based on content hash. The performance penalty is negligible compared to several minutes of resizing images.
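As a rough sketch of what such an opt-in key could look like, the snippet below derives a cache key from the project-relative path plus a content hash, so the key survives both a changed mtime and a moved project root. `portableCacheKey` is a hypothetical helper, not a Parcel API, and SHA-256 from Node's standard library is used only because it needs no extra dependency.

```javascript
const crypto = require("node:crypto");
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: key = "<path relative to project root>:<content hash>".
function portableCacheKey(projectRoot, file) {
  // Normalise separators so the key is the same on every platform.
  const relative = path.relative(projectRoot, file).split(path.sep).join("/");
  const hash = crypto.createHash("sha256").update(fs.readFileSync(file)).digest("hex");
  return `${relative}:${hash}`;
}

// Simulate the same project living under two different root paths.
function makeProject(prefix) {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), prefix));
  fs.mkdirSync(path.join(root, "img"));
  fs.writeFileSync(path.join(root, "img", "logo.png"), "fake image bytes");
  return root;
}

const rootA = makeProject("project-a-");
const rootB = makeProject("project-b-");
const keyA = portableCacheKey(rootA, path.join(rootA, "img", "logo.png"));
const keyB = portableCacheKey(rootB, path.join(rootB, "img", "logo.png"));

console.log(keyA === keyB); // same relative path and same bytes give the same key
```

Hashing reads every file once per build, which is the performance cost mentioned above; for inputs that take minutes to process, that cost is likely small by comparison.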

navidemad commented 1 year ago

Has something changed? My CI is now correctly using the cache, and I'm consistently getting a CSS compilation time of 2 seconds instead of the previous 41 seconds.

mischnic commented 1 year ago

No, unless you were on a very old Parcel version that wasn't using Lightning CSS yet (in which case the build is now simply faster even uncached).

touzoku commented 1 year ago

Instead of relying on the mtime, it would be better to use a fast file hash algorithm like xxHash.