WICG / import-maps

How to control the behavior of JavaScript imports
https://html.spec.whatwg.org/multipage/webappapis.html#import-maps

Proposal: Solving the waterfall problem with depcache #209

Closed guybedford closed 2 years ago

guybedford commented 4 years ago

I'd like to propose a new "depcache" field in the import map, as an optimal solution to the waterfall problem for production optimization.

The core idea is to provide the ability for tools that generate import maps to also generate the metadata of the module graph as a sort of graph cache (dependency cache) at the same time - a map from URLs to their list of dependency specifiers.

With a populated depcache, as soon as a module is fetched, the depcache can be consulted, and the corresponding cached list of module dependencies preloaded in parallel immediately.

For example:

{
  "imports": {
    "main": "/main.js",
    "lazy-feature": "/lazy-feature.js",
    "dep1": "/lib/dep1.js",
    "dep2": "/lib/dep2.js"
  },
  "depcache": {
    "/main.js": ["dep1"],
    "/lazy-feature.js": ["dep1", "dep2"]
  }
}

Say the main app loads "main" on startup. This is resolved to /main.js, and the depcache lets us see that we also need to load /lib/dep1.js immediately. This trace applies recursively to all requests. These known dependency requests are then made in parallel with the main request (possibly served from cache), so the app loads with zero latency waterfalls.

Later on a dynamic import('lazy-feature') is executed. At this point, the depcache can again be consulted, to see that /lazy-feature.js will import both dep1 and dep2. Since /lib/dep1.js is already in the module map we do not need to send a new request, so we then immediately send out just two requests - one for /lazy-feature.js and one for /lib/dep2.js, again getting lazy loading with full immediate parallel requests, without any duplication and supporting far-future caching. No matter how deep the dependency tree, there is never a waterfall so long as there is a depcache entry.

Note that the unresolved dependency specifier is included in the depcache array. This allows the import map to remain the full source of truth for dependency resolution, so that the depcache entry for e.g. a cached module doesn't go stale when resolutions change.
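The traversal described above can be sketched as follows. This is a minimal illustration, not spec text: the function names are hypothetical, a real loader would issue the fetches itself, and the resolver is simplified to a bare lookup in the "imports" map.

```javascript
// The example import map from above.
const importMap = {
  imports: {
    "main": "/main.js",
    "lazy-feature": "/lazy-feature.js",
    "dep1": "/lib/dep1.js",
    "dep2": "/lib/dep2.js"
  },
  depcache: {
    "/main.js": ["dep1"],
    "/lazy-feature.js": ["dep1", "dep2"]
  }
};

// Resolve a bare specifier through the "imports" map. Depcache entries store
// unresolved specifiers, so the import map stays the source of truth.
function resolve(specifier) {
  return importMap.imports[specifier] ?? specifier;
}

// Walk the depcache transitively from an entry URL, skipping anything already
// in the module map, and return every URL that can be fetched in parallel.
function collectPreloads(entryUrl, moduleMap = new Set()) {
  const toFetch = [];
  const queue = [entryUrl];
  while (queue.length) {
    const url = queue.shift();
    if (moduleMap.has(url)) continue; // already fetched: no new request
    moduleMap.add(url);
    toFetch.push(url);
    for (const dep of importMap.depcache[url] ?? []) {
      queue.push(resolve(dep));
    }
  }
  return toFetch; // all of these requests go out immediately, in parallel
}
```

For the lazy-load scenario above, `collectPreloads('/lazy-feature.js', new Set(['/lib/dep1.js']))` returns `['/lazy-feature.js', '/lib/dep2.js']`: two parallel requests, no duplication, no waterfall.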

The alternative, as mentioned in the readme of this spec, is a more general preloading spec. I have discussed a more general preload spec with various spec authors and implementers over the past year or two, and have found a lack of interest - mostly, I think, because for most web resources 2-3 round trips is the maximum, due to the nature of HTML / CSS development. I would argue that modules are actually unique in having dependency trees that are potentially N levels deep, where N can be over 10; the latency problem at these depths is truly unique to module graphs.

This depcache proposal seems to me the simplest path forward right now to solve this latency waterfall problem in a fully optimal way, given that a more general preload manifest is not getting traction, but I'm hoping by posting both this proposal and https://github.com/WICG/import-maps/issues/208, we can drive these conversations forward to ensure that this production optimization solution is tackled, as it really needs to be now for modules.

I'd like to also be able to move forward with this or a similar proposal in both SystemJS and ES Module Shims. Depcache as specified here has worked very well in previous versions of SystemJS for many years, and I'd like to start shipping this feature or similar in both projects again soon as it is a critical performance necessity right now for users of these projects today. Both projects aim to track these standards so hopefully we can continue these discussions and continue to solve these problems optimally.

daKmoR commented 4 years ago

as far as I can tell this sounds like quite a good idea 🤗

there is a small typo in the example - but it got me confused for a sec 🙈

"/main.js": ["dep"],
// should be
"/main.js": ["dep1"],
evmar commented 4 years ago

We used a technique like this at Google, and have recently moved away from it.

If you visit https://store.google.com/us/, view source, and search for _ModuleManager.initialize, you'll see it called with a big array of text. This is an encoding of the module dependency map (the 'depcache' part of your example data structure). Without going into the whole format, you can see the modules have short names like '3y'.

I used this store as an example here because it's an old enough site that it's still using the old technique. What we discovered is that as your set of modules gets larger (apps like Gmail have hundreds of modules after compilation, which bundles related code together from the thousands of modules the code authors work with), the module map itself grows large enough to become a significant part of your initial page load. That is, because you can't load the module map itself lazily, the larger it gets the larger your upfront cost gets.

We moved to a system that requires a bit more server-side logic, where (following your initial example) the client makes a request like "I want lazy-feature, and I have already downloaded main and dep1". The server has a module map and can use that to send both lazy-feature and the dependency dep2 to the client in one response. Critically, note that in this exchange we never needed to send to the client any info about dep2 up front -- the client tells the server what it knows, and the server holds the module map. A dumber client requires less up-front data and logic, both of which help page load time.
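That exchange can be sketched roughly as below. The names and data shapes are hypothetical (this is not Google's actual implementation); the point is that the full dependency map lives only on the server.

```javascript
// Server-side module map: module name -> direct dependencies.
const serverDepMap = {
  "main": ["dep1"],
  "lazy-feature": ["dep1", "dep2"],
  "dep1": [],
  "dep2": []
};

// The client says "I want these modules, and I already have those".
// The server computes the transitive closure of the wanted modules,
// minus everything the client already holds, and ships it in one response.
function modulesToServe(want, got) {
  const have = new Set(got);
  const serve = [];
  const queue = [...want];
  while (queue.length) {
    const mod = queue.shift();
    if (have.has(mod)) continue; // client already has it: skip
    have.add(mod);
    serve.push(mod);
    queue.push(...(serverDepMap[mod] ?? []));
  }
  return serve; // bundled into a single response
}
```

Following the initial example, `modulesToServe(['lazy-feature'], ['main', 'dep1'])` returns `['lazy-feature', 'dep2']` - the client never needed any up-front knowledge of dep2.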

I don't say this to kill your idea. I agree that the waterfall problem for fetching is a real problem, and your solution is similar to one that worked at Google for a long time. I just thought you might find the additional info interesting, and maybe it can help spark some more ideas.

iteriani commented 4 years ago

Over time, as the user downloads more modules, we do end up downloading the module graph at some point, once we exceed the URL limit for the modules downloaded.

hiroshige-g commented 4 years ago

Interesting! I'd like to clarify a couple of points:

  1. What are the benefits of including this depcache in import maps (e.g. compared to introducing a separate manifest file/mechanism)? I suppose including it in import maps might be faster and conceptually clearer, while on the other hand it might be suboptimal for it to be subject to import map restrictions (e.g. import maps can't be modified after the first module load has started, whereas if the depcache is a kind of preload mechanism, where we don't have to worry so much about correctness and consistency, we might want to add entries to it at any time).

  2. Is it conceptually good to place other similar things (e.g. other non-script-related preloading/dependency information) here? Or do you have criteria for what kind of things should be placed here together with depcache?

guybedford commented 4 years ago

@evmar thanks for the interesting feedback!

after compilation, which bundles related code together from the thousands of modules the code authors work with) the module map itself becomes large enough that it becomes a big part of your initial page load

I understand internal Google code development workflows are of course a little more intricate, but I think it is important to point out how module merging optimizations, now that they are well defined and well established, change the calculation. The production module graph should have far fewer module nodes (at least an order of magnitude fewer) than the development module graph: effectively the natural, typical module combinations formed under your got-want-need scheme below, played out across all clients.

That is, because you can't load the module map itself lazily, the larger it gets the larger your upfront cost gets.

I do think it will be worth supporting lazy loading of import maps at some point. If we split up the page load into:

  1. The modules that load on page load.
  2. The modules that we know may or may not be lazy loaded, depending on user interactions.
  3. The modules that we don't know will be lazy loaded.

An optimization, if lazy loading of import maps were supported, would be to separate the import map for (1) from the import map(s) for (2), such that (1) is the initial import map, and after page load a new import map for (2) (or a few different ones) gets lazily loaded in line with its priority. This way, the initial page load is not slowed down by the growth of the mappings for lazy loads on the page.
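As a purely hypothetical sketch of that split (lazy loading and merging of import maps is not currently specified, so the semantics here are invented for illustration), the initial import map, inlined in the page, covers only (1):

{
  "imports": {
    "main": "/main.js",
    "dep1": "/lib/dep1.js"
  },
  "depcache": {
    "/main.js": ["dep1"]
  }
}

while a second import map, fetched after page load, covers (2), its depcache entries still using unresolved specifiers:

{
  "imports": {
    "lazy-feature": "/lazy-feature.js",
    "dep2": "/lib/dep2.js"
  },
  "depcache": {
    "/lazy-feature.js": ["dep1", "dep2"]
  }
}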

This is starting to get out of scope for this thread, but handling how waiting for import maps interacts with lazy import map loading brings up important questions, which I have tried to start considering in https://github.com/WICG/import-maps/issues/92#issuecomment-453578821. It may even be useful to define a platform callback for lazy loading of import maps when a mapping is not found, enabling (3).

We moved to a system that requires a bit more server-side logic, where (following your initial example) the client makes a request like "I want lazy-feature, and I have already downloaded main and dep1".

One of the major benefits of depcache is that it can work for static server use cases. It would be a shame to have to tell users they can only get optimized delivery of their apps by using very specific server software that must have full knowledge of the module graph, tying server software internals to application delivery. Server logic only really comes into its own for advanced cases of (2) and (3), and even then such work could build on lazy import map loading as described above.

@iteriani can you clarify what you mean by the url limit here:

we do end up downloading the module graph at some point once we exceed url limit for modules downloaded

Are you referring to reaching the limits of the module map size itself?

guybedford commented 4 years ago

@hiroshige-g

What are the benefits to include this depcache in import maps (e.g. compared to introducing a separate manifest file/mechanism)?

Your points are largely the main ones I think.

I could still get behind either approach, and this has been my opinion for a while. But this depcache proposal was posted out of my frustrations in getting nowhere after many continued preloading discussions with many different people.

Then my recent realization was that solving the waterfall problem for all web assets is simply not a primary concern for the web today due to most web assets having a limited tree depth, which is not a large enough factor in load time perception like it is for module graphs.

@evmar's first comment was also about how a feature like depcache can itself bloat the page load. It would be a shame if the preload manifest were to become so full-featured and verbose as to slow down page loads. Trying to flesh it out in https://github.com/WICG/import-maps/issues/208 also gives some idea of the verbosity to expect. What's nice about depcache is that it solves the direct problem without walking down the path of a "manifest to rule all manifests", a scope a preload spec risks and could easily get lost in the weeds pursuing.

Is it conceptually good to place other similar things (e.g. other non-script-related preloading/dependency information) here? Or do you have criteria for what kind of things should be placed here together with depcache?

This is a difficult one indeed and an important question. I really don't know how to set those criteria other than on a very careful case by case basis.

I suppose the general guide should be that it could extend, but only to features that relate specifically to modules. Under this logic, module attributes could be a possibility, but integrity, fetch options and credentials would not be suitable for import maps, and whether import maps could include execution / capability / module security options seems unclear.

iteriani commented 4 years ago

Once the length of the URL for requesting modules (in addition to sending up the modules already loaded) exceeds 2048 characters, we download the whole module map.
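That fallback might be sketched as follows. The endpoint, parameter names, and return shape are hypothetical; the 2048-character figure is the limit mentioned above.

```javascript
// Encode the got/want sets in the request URL; once that URL would exceed
// the practical length limit, stop enumerating and fetch the full module map.
function requestStrategy(want, got) {
  const url = `/modules?want=${want.join(',')}&got=${got.join(',')}`;
  return url.length > 2048
    ? { fetchFullModuleMap: true } // too many known modules to enumerate
    : { url };                     // small enough: ask for the delta
}
```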

iteriani commented 4 years ago

I do think it's worth having both solutions (normal module loading vs negative module loading) for a wide variety of reasons. It's just interesting to note that high-requirement customers will need a little bit more than the currently proposed solution. At Google we use a JavaScript framework that basically late-loads event handlers, so having hundreds or even thousands of modules even after graph optimizations is possible.

I'm not sure how these two pieces of information will be figured out at request time:

The modules that load on page load. The modules that we know may or may not be lazy loaded, depending on user interactions.

Does this involve some sort of server-side rendering or statically built map?

fenghaolw commented 4 years ago

You might find this useful: https://github.com/azukaru/progressive-fetching/blob/master/docs/dynamic-bundling/index.md

It explains why large Google applications end up with thousands of modules even for production bundles. That does not mean thousands of requests; the client framework has the ability to pick a subset of modules and load them together. As @iteriani said, the design principle is to "load the minimal amount of code that is needed", and this means we need to load different combinations based on different user interactions.

If you have thousands of production modules, then the mapping soon becomes a burden (>60KB for some applications), which is why we moved to a different system to avoid the mapping.

jkrems commented 4 years ago

I think there are three different scenarios involved here:

  1. A static file server serving ES modules directly. This is where depcache shines and - afaik - no viable alternative exists.
  2. A smart file server serving dynamically generated resources. This allows skipping the depcache on the client in many cases (see @iteriani's restriction on exclusion list bloat) but there's some fundamental incompatibilities with how modules work.
  3. A smart file server providing dynamically pruned web packages. This still requires either a complete exclusion list, a cache/load state manifest, or a depcache on the client to work (see Subresource bundling). It would also mean some mechanism to redirect (potentially dynamic) module imports to load a web package instead of the referenced resource.

Each comes with its own assumptions about how much implementation effort adoption can require and how many other in-progress specs/unknowns are involved. I don't think there's a practical proposal yet for how to incrementally load a module graph using web packages.

guybedford commented 4 years ago

At Google we use a JavaScript framework that basically late loads event handlers, so having hundreds of even thousands of modules even after graph optimizations is possible.

You might find this useful: https://github.com/azukaru/progressive-fetching/blob/master/docs/dynamic-bundling/index.md

@iteriani @fenghaolw on a brief read, if I'm understanding correctly, this sounds like these modules are effectively AMD-like wrappers on the development modules. It's important that production modules are optimized based on module merging, such that the graph is much smaller, rather than individually wrapping each dev module with a function wrapper into the output chunk.

Does this involve some sort of server-side rendering or statically built map?

The same process that constructs the import map needs to know which dependencies to include in it. There is no reason to include dependencies that aren't loaded, thus the optimal import map is one traced to contain just what is necessary for the page load. This same tracing is the tracing needed to populate the depcache. It could be done dynamically or statically.
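That build-time tracing might look roughly like this (hypothetical tool code; `resolveSpecifier` and `depsOf` stand in for a real resolver and a real module parser):

```javascript
// Trace the graph from the entry points and emit both the "imports"
// mappings and the "depcache" in a single pass.
function buildImportMap(entries, resolveSpecifier, depsOf) {
  const imports = {};
  const depcache = {};
  const queue = [...entries];
  while (queue.length) {
    const specifier = queue.shift();
    if (specifier in imports) continue;      // already traced
    const url = resolveSpecifier(specifier);
    imports[specifier] = url;
    const deps = depsOf(url);                // parse the module, list its specifiers
    if (deps.length) depcache[url] = deps;   // store *unresolved* specifiers
    queue.push(...deps);
  }
  return { imports, depcache };
}
```

Run over the entry points "main" and "lazy-feature" with a resolver and parser matching the original example, this reproduces the import map from the first post, imports and depcache together.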

jkrems commented 4 years ago

if I'm understanding correctly, this sounds like these modules are effectively AMD-like wrappers on the development modules.

It's complicated™ but generally they aren't AMD-like wrappers. It's doing true module merging. The difference is that it's doing it globally across the entire module graph (renaming identifiers to remove conflicts etc.) and then dynamically returns fragments of the entire "maximally merged module" on load. Each of the fragments is roughly equivalent to a merged module in a more conservative module merging approach (e.g. rollup). Paraphrasing a crafted example URL:

/*_M:FirstFragment*/
debug_track_module_execution_started('FirstFragment');
/* raw module body here with identifiers adjusted to remove conflicts */
debug_track_module_execution_done();

/*_M:SecondFragment*/
debug_track_module_execution_started('SecondFragment');
/* raw module body here with identifiers adjusted to remove conflicts */
debug_track_module_execution_done();

Worth noting that this mostly works so nicely because it's compiling to a script in the end. That way it doesn't have to worry about re-linking across the concatenated fragments over time (it's all simply globals). That's why I said "there's some fundamental incompatibilities with how modules work" in this approach.

viT-1 commented 3 years ago

Why are paths used rather than keys/names? Paths may be very long, e.g. "vue-property-decorator": "https://cdn.jsdelivr.net/npm/vue-property-decorator@9.0.2/lib/vue-property-decorator.umd.min.js"

Maybe this variant?

{
  "imports": {
    "main": "/main.js",
    "lazy-feature": "/lazy-feature.js",
    "dep1": "/lib/dep1.js",
    "dep2": "/lib/dep2.js"
  },
  "depcache": {
    "main": ["dep1"],
    "lazy-feature": ["dep1", "dep2"]
  }
}
denghongcai commented 3 years ago

Like preload? Alibaba has used a format called 'seed' for a long time, which seems to solve the problem: http://g.alicdn.com/code/npm/@ali/pcom-dynamic/2.0.6/seed.json . Any inspiration?

domenic commented 2 years ago

Let's keep this discussion in https://github.com/guybedford/import-maps-extensions.