[Dicussion] require.resolve caching - Githubissues

linkedin / css-blocks

High performance, maintainable stylesheets.

http://css-blocks.com/

BSD 2-Clause "Simplified" License

6.33k stars 152 forks source link

[Dicussion] require.resolve caching #216

Open stefanpenner opened 5 years ago

stefanpenner commented 5 years ago

We noticed in eyeglass and ember-cli that repeated calls to require.resolve can be costly as it can introduce unexpected IO. I also noticed, that NodeJsImporter introduced recently, may have a similar issue.

My goal here is to have the discussion to ensure css-blocks does not share this problem and that I have not done any testing, so I have no evidence of an issue.

Given that, I figured we should have the discussion here to decide if:

the problem transfers/is at all relevant to css-blocks
if so, what constraints are we ok with
a path forward
is there a different problem worth solving that address the motivation (wasteful cycles spent during build)

Some Potential Constraints One may be able to rely on

css-blocks, right now operates in the node ecosystem, using require.resolve to ensure valid node resolution semantics. Node also uses require.resolve under the hood to resolve require, with the exception that it caches the result of require. This accomplishes two things:

ensures the result of require is frozen given its inputs
ensures the cost per unique require only occurs once
does provide an "Escape hatch" for power users to evict the cache.

For css-blocks, it would likely need the ability to evict a cache on a per build basis, ensuring no stale reads occur when someone expects new artifacts to be found.

Thoughts?

amiller-gh commented 5 years ago

I want to hold off on cutting a new release with the node_modules resolved until we have this... resolved 🙂

IMO caching the results from NodeJSImporter is the correct approach. It improves perf, mirrors the native Node.js behavior, and avoids a number of build integration hurdles for us down the road.

There are two methods we will need to cache in the NodeJSImporter: NodeJSImporter.identifier and NodeJSImporter.import. One resolves path on disk, the other resolves file contents.

As for cache purging, to enable incremental / watched builds we should implement a more granular cache purge than what Node.js offers. This will allow the watcher integrations to purge individual files that it knows have changed since last time. This isn't as important now since we don't have incremental builds working – you get a clean instance of the compiler each rebuild I believe – but will be required down the line.

We can add abstract purgeFile and purgeAll methods to the Importer base interface and add the appropriate implementations to NodeJSImporter to enable this.

amiller-gh commented 5 years ago

Or, we promote Importer interface to an abstract class, move the identifier() and import() methods to Importer and add out-of-the-box memoization. If there is a cache miss, we fall back to the sub-classer's protected getIdentifier() or getImport() methods, caching the result before returning. purgeFile() and purgeAll() methods can then be supplied by the base Importer class as well. This ensures consistent caching behavior across all possible future Importers.

amiller-gh commented 5 years ago

Anyway, tl;dr, I'll play with it and get a PR up 👍

chriseppstein commented 5 years ago

@amiller-gh I'm not convinced this isn't premature optimization because of block caching via the factory.

amiller-gh commented 5 years ago

@chriseppstein BlockFactory does cache and short circuit with the same Block before the importer called, but are there situations in the future where BlockFactories may share the same importer instance?

Ex: A custom webpack build where a consumer provides the same importer instance to two CSS Blocks builds (its passed in through the configuration hash).

amiller-gh commented 5 years ago

Also ex: as written, the ember-cli integration kicks off an independent build for every addon, app, or engine. Conceivably, with this change, we can have all independent builds share the same NodeJSImporter instance and reap some perf gains.

amiller-gh commented 5 years ago

Actually, yeah, as I look at this more and more I'm convinced that the Importer and BlockFactory need to have separate caches. For instance, consider:

/* a.block.css */
:scope { block-name: a; }

/* b.block.css */
@block a from "a.block.css";
:scope { extends: a; block-name: b; }

/* c.block.css */
@block b from "b.block.css";
:scope { extends: b; block-name: c; }

On a rebuild, if the file a.block.css is modified, we want to purge the in-memory Block caches for Blocks a, b, and c. However, the only Importer cache we need to purge is for a.block.css. There is no need to go re-resolving file locations for b.block.css and c.block.css as well, as it will currently do.

amiller-gh commented 5 years ago

When a file is modified, we want to purge the Importer cache for the specific file identifier, and then purge all in-memory Block caches for that file, and its down stream dependents. We can provide some assistance to build system integrations to simplify this type of cache purge task since we already have the dependency tree built out.

stefanpenner commented 5 years ago

@chriseppstein I think i see what your saying, heres what I think, correct me if I am wrong: the block based caching is unique to css-blocks and given that, it provides its own form of caching already. This then guards against the most egregious eyeglass related performance issues. Specifically that in eyeglass "reuse" would result in repeated work, but in css-blocks "reuse" actually reuses the existing work?

That would be a nice improvement over eyeglass (at-least prior to @eoneill's recent work).

So the open Q is related to cost/benefit of inter-block require.resolve caching correct?