gregberge / loadable-components

The recommended Code Splitting library for React āœ‚ļøāœØ
https://loadable-components.com
MIT License
7.64k stars 381 forks source link

loadable not working in SSR with webpack module federation - #640

Closed ajayjaggi closed 2 years ago

ajayjaggi commented 3 years ago

šŸ› Bug Report

loadable-components: failed to synchronously load component, which expected to be available { fileName: './src/shared/dedicated/index.js', chunkName: 'Dedicated', error: 'Cannot read property \'call\' of undefined' } (node:7562) UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'call' of undefined at webpack_require (/Users/ajay/Desktop/WebStrom/MicroFrontend/Edge/dist/server/main.js:394:42) at Module../src/shared/components/Footer/index.js (/Users/ajay/Desktop/WebStrom/MicroFrontend/Edge/dist/server/Dedicated.server.js:21:71) at webpack_require (/Users/ajay/Desktop/WebStrom/MicroFrontend/Edge/dist/server/main.js:394:42) at Module../src/shared/dedicated/index.js (/Users/ajay/Desktop/WebStrom/MicroFrontend/Edge/dist/server/Dedicated.server.js:60:76) at __webpack_require__ (/Users/ajay/Desktop/WebStrom/MicroFrontend/Edge/dist/server/main.js:394:42) at Object.requireSync (/Users/ajay/Desktop/WebStrom/MicroFrontend/Edge/dist/server/src_server_render_js.server.js:225:14) at InnerLoadable.loadSync (/Users/ajay/Desktop/WebStrom/MicroFrontend/Edge/dist/server/vendors-node_modules_loadable_server_lib_index_js-node_modules_express_index_js-nodemodules-ee7ccd.js:420:35) at new InnerLoadable (/Users/ajay/Desktop/WebStrom/MicroFrontend/Edge/dist/server/vendors-node_modules_loadable_server_lib_index_js-node_modules_express_index_js-nodemodules-ee7ccd.js:315:17) at processChild (/Users/ajay/Desktop/WebStrom/MicroFrontend/Edge/dist/server/vendors-node_modules_loadable_server_lib_index_js-node_modules_express_index_js-nodemodules-ee7ccd.js:56603:14) at resolve (/Users/ajay/Desktop/WebStrom/MicroFrontend/Edge/dist/server/vendors-node_modules_loadable_server_lib_index_js-node_modules_express_index_js-nodemodules-ee7ccd.js:56568:5) (node:7562) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1) (node:7562) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

To Reproduce

routes.js -

import React from 'react' import loadable from '@loadable/component' // import Home from './home' // import Dedicated from './dedicated'

const Home = loadable(() => import(/ webpackChunkName: "Home" / './home')) const Dedicated = loadable( () => import(/ webpackChunkName: "Dedicated" / './dedicated'))

const homeRoute = (path) => ({ path, exact: true, component: Home })

const dedicatedRoute = (path) => ({ path, exact: true, component: Dedicated })

export default () => [ homeRoute('/'), dedicatedRoute('/:player(messi)') ]

Expected behavior

A clear and concise description of what you expected to happen.

Link to repl or repo (highly encouraged)

https://github.com/ajayjaggi/MicroForntEnd-Basic-Structure

Issues without a reproduction link are likely to stall.

ScriptedAlchemy commented 2 years ago

Okay so i juuuust finished implementing chunk flushing and federated SSR + nested remote import chunk flushing for Next.js - this might give you some ideas.

This is a copy/paste of what I've got so far. Note: this required a hack to take over react loadable since next.js provides no useful ways to break into the chunk flushing mechanism. However, it should give you some idea on how to flush chunks out for federated modules.

Nested federated modules (like a->b->c) require doing a "reverse reverse" lookup, at least in next.js since I need to locate the moduleID of the remote containers remote-module type and cast it back to whatever mechanism holds the array of module ids to wait for.


const extractChunkCorrelation = (remoteContainer, lookup, request) => {
  if (
    remoteContainer &&
    remoteContainer.chunkMap &&
    remoteContainer.chunkMap.federatedModules
  ) {
    const path = remoteContainer.path.split("@")[1];
    const [baseurl] = path.split("static/ssr");
    remoteContainer.chunkMap.federatedModules.map((federatedRemote) => {
      federatedRemote.exposes[request].forEach((remoteChunks) => {
        remoteChunks.chunks.map((chunk) => {
          if (!lookup.files.includes(new URL(chunk, baseurl).href)) {
            lookup.files.push(new URL(chunk, baseurl).href);
          }
        });
      });
    });
  } else {
    console.warn(
      "Module Federation:",
      "no federated modules in chunk map OR experiments.flushChunks is disabled"
    );
  }
};
const requireMethod =
  typeof __non_webpack_require__ !== "undefined"
    ? __non_webpack_require__
    : require;
const requestPath = path.join(
  process.cwd(),
  ".next",
  "server/pages",
  "../../react-loadable-manifest.json"
);
let remotes = {};
const loadableManifest = requireMethod(requestPath);
requireMethod.cache[requestPath].exports = new Proxy(loadableManifest, {
  get(target, prop, receiver) {
    if (!target[prop]) {
      let remoteImport = prop.split("->")[1]

      if (remoteImport) {
        remoteImport = remoteImport.trim();
        const [remote, module] = remoteImport.split("/");

        if (!remotes[remote]) {
          Object.assign(remotes,generateDynamicRemoteScript(global.loadedRemotes[remote]))
        }

        const dynamicLoadableManifestItem = {
          id: prop,
          files: [],
        };
        // TODO: figure out who is requesting module
        let remoteModuleContainerId
          Object.values(global.loadedRemotes).find((remote)=> {
          if(remote.chunkMap && remote.chunkMap.federatedModules[0] && remote.chunkMap.federatedModules[0].remoteModules) {
            if(remote.chunkMap.federatedModules[0].remoteModules[remoteImport]) {
              remoteModuleContainerId = remote.chunkMap.federatedModules[0].remoteModules[remoteImport]
              return true
            }
          }
        })
        if(remoteModuleContainerId) {
          dynamicLoadableManifestItem.id = remoteModuleContainerId
        }
        extractChunkCorrelation(
          global.loadedRemotes[remote],
          dynamicLoadableManifestItem,
          `./${module}`
        );
        return dynamicLoadableManifestItem;
      }
    }
    return target[prop];
  },
});

const flushChunks = async (remoteEnvVar = process.env.REMOTES) => {
  remotes = {}
  const remoteKeys = Object.keys(remoteEnvVar);
  const preload = [];

  try {
    for (const key in loadableManifest) {
      const [where, what] = key.split("->");
      const request = what.trim();
      const foundFederatedImport = remoteKeys.find((remoteKey) => {
        return request.startsWith(`${remoteKey}/`);
      });
      if (foundFederatedImport) {
        const remotePreload = remoteEnvVar[foundFederatedImport]().then(
          (remoteContainer) => {
            Object.assign(remotes,generateDynamicRemoteScript(remoteContainer))

            const inferRequest = what.split(`${foundFederatedImport}/`)[1];
            const request = `./${inferRequest}`;
            extractChunkCorrelation(
              remoteContainer,
              loadableManifest[key],
              request
            );
          }
        );
        preload.push(remotePreload);
      }
    }
    await Promise.all(preload);
    return remotes
  } catch (e) {
    console.error("Module Federation: Could not flush chunks", e);
  }
  return [];
};
theKashey commented 2 years ago

šŸ™„ not sure this solution is portable. Let's tackle it step by step. There are good (and small) examples of federation, so let's eat bear in parts.

The only operation loadable need - is to map "filename" into remote script. Currently it's made via naming all chunks manually and at the build time, what is definitely not gonna work with remotes.

So let's break everything into pieces and eat bear in parts.

ScriptedAlchemy commented 2 years ago

@theKashey lets find some time to dig into an example and work on a solution. From my end I feel this is simple to resolve with the help of strong dev like yourself

IMalyugin commented 2 years ago

@ScriptedAlchemy @theKashey

Hey guys! Spent last few weeks hacking into wmf+loadable, and after playing around with what's available at the moment, there are still a few unsolved issues. Might require some more heavy plugin loading to fix them:

  1. The first issue is explained in the comments to PR by DanielAmenou: https://github.com/module-federation/module-federation-examples/pull/1082#discussion_r727130871 . There is a tiny hack to preload wmf modules, so that loadable gets to see them, I believe it can be solved either by writing a dynamic mf resolver, or by giving loadable a basic idea on how to trigger mwf preload.
  2. The second issue is a bit deeper, as nested loadable import does not work. Module federation can resolve them just fine, but when Importing mf1/comp -> mf2/comp -> mf3/comp, the last one renders empty.

Alternatively both these issues can be addressed by allowing WMF to run synchronously, currently the closest we can get is declaring filesystem path in ModuleFederationPlugin. But that still creates a require/promise chain of resolution, making the process asynchronous.

Any ideas on these? And it goes without telling that the combination of the two (WMF+loadable) seems like a number one candidate to becoming the future of enterprise frontend architecture :) So amazing job on supporting them.

--- edit

If somebody else hits issue number 2, it is caused by using different instances of @loadable/component. Easiest way to fix it, is just adding @loadable/component to webpack.externals

ScriptedAlchemy commented 2 years ago

So nested imports I've solved on nextjs by having the whole reverse reverse lookup to remotes also report their remotes and every remote container has its chunk maps registered

IMalyugin commented 2 years ago

As for the nested import, it's actually working server-side. As for client - reverse lookup seems legit.

The other issue, although requires some attention. What I could dig up to this moment:

When one uses synchronous import import, the chunk gets bound to the parent chunk, that imports it (the right-side of the diff), this way @loadable/component works. But when we remove synchronous import (left-side of the diff), chunk is moved to another owner, aka "mf2/Component", and loadable is unable to resolve it. image

Code wise, it looks like this (left-side broken, right-side working): image

Left-side results in: image

-- edit Going further, chunkMapping results in WMF loading chunks in __webpack_modules__ during initialization: image

Removing synchronous import, causes the module to disappear, so it's expected to load asynchronously later on: image

ScriptedAlchemy commented 2 years ago

Sync imports also require extra tracking. So you have to know what chunks were user from a foreign build and apply those (reverse reverse lookup)

IMalyugin commented 2 years ago

I'm thinking more about compilation-based solution. Since loadable itself turns async imports sync ones, and WMF is able to properly chain external packages as internal. Then there are two viable alternatives:

  1. (straight solution) modify @loadable/component requireSync, so that it is able to resolve WMF modules. If for some reason entirely sync format is impossible, we can create server side loadableReady - that would await all the WMF remote chunks.
  2. (hacky solution) modify @loadable/babel-plugin, so that it adds unused synchronous import next to @loadable call. This way during prebuild, it would tell WMF to bundle chunks closely for SSR.
ScriptedAlchemy commented 2 years ago

What happens if remote imports more remotes?

Static import is one thing that has to be mapped, tho usually easy as long as something tells you want chunk ID was used to perform a reverse lookup. Remotes themselves would need to contain or report their nested async imports as well so loadable can find them and resolve them during the flush

IMalyugin commented 2 years ago

What happens if remote imports more remotes?

Static import is one thing that has to be mapped, tho usually easy as long as something tells you want chunk ID was used to perform a reverse lookup. Remotes themselves would need to contain or report their nested async imports as well so loadable can find them and resolve them during the flush

I've tested remotes importing more remotes and it works fine server-side as long as static imports are used alongside loadable call within each of those remotes.

Uploaded a WIP PR with working 2nd solution: https://github.com/gregberge/loadable-components/pull/883

As for the first solution. I currently don't have enough understanding of how loadable performs requireSync. From what I do know tho, is that the component chunk (that crashes with call of undefined), actually gets executed in nodejs without synchronous call (which is super weird). So it seems to me that issue is not in the way WMF handles remote chunk resolution, but in how loadable accesses remote chunks, that are, in fact already loaded in webpack, but not in webpack_modules. I tried manually putting components in webpack_modules using dynamic WMF configuration, but for some reason components turned into a pumpking (empty JS object). My guess is: some dark loadable magic involved.


P.S. Guess we need @theKashey from this point on :)

theKashey commented 2 years ago

SSR part of Loadable really expects all referenced modules to support requireSync (because that's nodejs require works) Adding static imports aids SSR, but makes one big monolith from the clientside application. This solution works, but does not provide the desired "splitting" functionality.

In order to support WMF we need to change API a little bit, as well as reconfigure babel plugin. However all this time the main friction comes from webpack integration (loadable hooks into jsonpcallback).

I am afraid that probably the only way forward is to keep that actual "loadable-component", but totally rewrite everything else.

IMalyugin commented 2 years ago

SSR part of Loadable really expects all referenced modules to support requireSync (because that's nodejs require works) Adding static imports aids SSR, but makes one big monolith from the clientside application. This solution works, but does not provide the desired "splitting" functionality.

In order to support WMF we need to change API a little bit, as well as reconfigure babel plugin. However all this time the main friction comes from webpack integration (loadable hooks into jsonpcallback).

I am afraid that probably the only way forward is to keep that actual "loadable-component", but totally rewrite everything else.

@theKashey That is not entirely right. With introduction to eager consumption, module federation static import is the new dynamic. It does not break chunk splitting, but it might actually modify when the chunks load. Not to mention that static import is the only way the current WMF example works in the first place.

I couldnā€™t make the application work on client side yet by disabling the plugin, but we can detect if babel is bundling for the server and inject static import then. That'll probably work for both SSR and CSR.

--- edit Tested CSR with static import disabled. It fails on the same "failed to synchronously load component, which expected to be available". If I call RemoteComponent.load() or RemoteComponent.preload() - and add timeout, it works again. The weird part is that amount of chunks loaded if i keep static import or remove it is the same.

So to sum it up: loadable does not do its magic for components loaded from remotes, even if the chunks are available (That is true for both client and server side). Is that related to chunkLoadingGlobal callback?

ScriptedAlchemy commented 2 years ago

This is solvable. Unlikely we need a big rewrite. after the enormous effort it took to get nextjs to work end to end with WFM altering a library will be much easier.

I can also change any aspect of module Federation. AOT manifest stitching wonā€™t work. Like loadable, this must be done during execution.

Nothing stops me from replacing the entire webpack runtime, itā€™s like 3 plugins.

Someone PR my SSR example to repro the problem. In MF examples. From there we can organize some action.

I fail to see the roadblock based on my knowledge of loadable-like libraries and webpack mechanics.

Not easy, but not hard either

ScriptedAlchemy commented 2 years ago

I donā€™t believe we need access to jsonp, but if we did.. itā€™s a plugin. I already use custom targets instead of target:node - at that point Iā€™ve got complete control over every runtime requirement. Hooks donā€™t matter at that point. You are the webpack runtime in totality. I doubt we need to go that far though.

Not sure how you all are loading MF on the server, but on my end - I control the filesystem functions directly from webpack runtime. MF middleware could also likely solve this too

IMalyugin commented 2 years ago

I'm sure it's totally solvable. I just got it running on the client with static import only on server-side. The thing that crashed me was actually none other than "loadableReady". So removing loadableReady, react renders client side properly.

So for me, uncommenting line 9 breaks the client side hydration process:

image

@ScriptedAlchemy The problem is reproduced by removing this one line with static import on the server: https://github.com/module-federation/module-federation-examples/pull/1082#discussion_r727130871

I've got a feeling that this issue is resolved with one liner somewhere in @loadable codebase. Because WMF actually does everything right. But loadable simply can't access wmf loaded remotes.

-- edit 1&2 Okay so loadableReady crashes client build, because remote chunks are marked in LOADABLE_SHARED.initialChunks. That results in:

  1. InnerLoadable calling loadSync (instead of loadAsync) during rendering;
  2. loadSync calls requireSync for the remote component.
  3. requireSync calls __webpack_require('webpack/container/remote/{mfName}/{compoName}')
  4. webpack_require throws call of undefined error, because webpack_modules does not have remote module, instead this module has local name aka "./src/components/{compoName}/index.jsx".

Sad thing is, WMF already loaded all the modules. So its the question to @ScriptedAlchemy. Is it possible to synchronously resolve loaded external modules somehow?

I tried manually putting the local component name. It doesn't crash in that require, but then attempts to require shared modules and crashes again: __webpack_require__(/! react / "webpack/sharing/consume/default/react/react");

IMalyugin commented 2 years ago

If it is impossible to make WMF modules available synchronously for requireSync consumption, we can modify CSR in loadable to use requireAsync for all the remotes. Then modify loadableReady to wait for chunks to be both pushed in LOADABLE_LOADED_CHUNKS object and actually resolved within requireAsync.

I already tried replacing loadSync with loadAsync for remotes. This way they load up properly, but CSR first renders them empty, then resolves and rerenders.

We also probably need to do the same for SSR.

Although, I believe we would benefit from fully synchronous federated module resolution. For example, a lot of systems use bare v8 for SSR. There is no event loop in that thing, so even a single instantly resolved promise would render SSR useless.

ScriptedAlchemy commented 2 years ago

Bare v8 like cloudflare or some other technology. One key note on those. Depending how you load code, code generation is not supported on edge vendors šŸ˜‘ the nearest thing to sync I can manage is replace webpack filesystem. Rebuilt target async node it's still async but at the Fs level, not the framework level

IMalyugin commented 2 years ago

Canā€™t we have a callback replacement for chuck loading? loadable is actually using such. Chunk is added during chunk code evaluation, which is calling loadable callback. So the wmf init/get can use simillar mechanic. Once you call ā€˜getā€™ if chunk is already loaded, it is returned synchronously

ScriptedAlchemy commented 2 years ago

I probably wouldnā€™t deviate from the way we have WMF setup internally - since it already does return a sync function back when using require or static imports. But we can make a proxy module on the get,init call - essentially MF middleware. Though webpack internally expects to receive a thennable, as it calls factory and get for us - but we can proxy call it, and upon returning the real get(), tell loadable something FAFC8691-D59A-40F6-8706-5ED3791A422E

ScriptedAlchemy commented 2 years ago

You can also replace the chunk load global inside webpack. Its a global variable like webpack_chunk_load Similar to how you can change public path at runtime

IMalyugin commented 2 years ago

I probably wouldnā€™t deviate from the way we have WMF setup internally - since it already does return a sync function back when using require or static imports. But we can make a proxy module on the get,init call - essentially MF middleware. Though webpack internally expects to receive a thennable, as it calls factory and get for us - but we can proxy call it, and upon returning the real get(), tell loadable something

Well, initially we had two paths to fix the issue:

  1. WMF side: allow remotes to run synchronously if their chunks are loaded externally;
  2. Loadable side: tweak loadable to make up with asynchronous nature of mwf;

And now this leaves us with 2nd choice, we can make a deeper dig into the issue from loadable perspective.

So, how do I understand it, from the source code analysis: The issue is taking root in the way loadable expects modules to be loaded once chunk loads. Thus loadableReady assumes that all those chunks are loaded and simply forces their require. What we should do instead, is make loadableReady not only wait for chunks to load, but also to perform async initialisation and wait for the promise, returned by the remote.get's. Basically we need to make loadableReady do the same as webpack's bootstrap. Doesn't seem all that hard.

I'll see if I can hack it

IMalyugin commented 2 years ago

You can also replace the chunk load global inside webpack. Its a global variable like webpack_chunk_load Similar to how you can change public path at runtime

Don't really understand what you mean by that, or where that "chunk load global" is in the big picture :)

Unfortunately, the issue is not about determining "when are modules ready", as If i set 10 seconds timeout and try to call requireSync - it will crash as well, because modules aren't getting loaded, till the rendering starts.

IMalyugin commented 2 years ago

I believe I found a working approach, going to compile a solution soon. Not sure if it's a right place for it, but during loadable constructor calls (aka loadable(...)), it's possible to collect a global object with [chunkName]: importAsync calls. Then loadableReady will wait for Promise.all([...]), for each external chunk listed in __LOADABLE_REQUIRED_CHUNKS___ext.namedChunks

IMalyugin commented 2 years ago

Okay, got it working both server- and client-side.

Got a question on which strategy do we take server-side? Choice A: Pass extra option to babel-loader to inject static import (ssr: true), so that entire loading is handled by webpack bootstrap Choice B: Make loadableReady call required on the server. And then use same resolution strategy as the web, except it also requires a reverse-reverse lookup. For some reason I need to preload same components from different remotes.

IMalyugin commented 2 years ago

@theKashey @ScriptedAlchemy Uploaded a PR with loadableReady turned into an async bootstrap.

A lot of things still need to be tested, but it works in my setup. Also I did not actually "build" loadable yet. Just modified it inside node_modules. Then accurately copied the solution to the repository. Will soon check if the built version is working properly.

I also didn't manage to collect all the needed chunks for wmf bootstrap + shared + loadable. Aka if I wrap ./bootstrap with loadable it throws on shared not available for eager consumption. If I don't wrap it, bootstrap chunk is not registered with loadable. That has been reported before https://github.com/gregberge/loadable-components/issues/690

theKashey commented 2 years ago

There is a little unknown moment around the way loadable names chunks - it has to be deterministic and the names has to be unique.

885 seems to be capable resolving the root cause, but:

IMalyugin commented 2 years ago

There is a little unknown moment around the way loadable names chunks - it has to be deterministic and the names has to be unique. https://github.com/gregberge/loadable-components/pull/885 seems to be capable resolving the root cause, but:

That can be listed in caution for wmf. chunks should probably include [remoteName] section in webpack. We can also try binding resolution to manifest files generated by webpack, if the problem arises. Iā€™ll try production builds, to see if the mapping is done right. Added to todo list.

  • federated chunks will not be flushed during SSR, and I am not sure if it's even possible as due to WMF nature those names are not "static"

Don't see a problem here, we render the them all together, so they should be properly flushed.

  • all federated modules will be loaded, regardless are used or not, leading to undesired results.

I only request chunks gathered during SSR and passed to named chunks, so it shouldn't be a problem. Worth checking if that is true thou. Added to todo list.

IMalyugin commented 2 years ago

Lol, found the problem with shared modules, its kinda laughable. I declared React as shared in WMF. And loadable is using react internally. So It jamms into chicken/egg paradox.

WMF requires bootstrap to load eager dependencies, such as React. And loadable must wrap bootstrap, to gather chunks, which requires react. On top of that, loadable is only meant to be used inside components, so webpack bootstrap pretty much breaks the concept.

--- edit

So far, I don't see how loadable can consume WMF`s import('./bootstrap'). That could become a show-stopper. If that's what you mean under not working federated chunks flushing, then I pretty much understand it now

IMalyugin commented 2 years ago

We currently have two blockers, each with it's own proposed solution:

Problem 1. WMF eager static imports are generating chunks that aren't flushed by loadable.

Solution: Webpack uses async import boundaries to take its time and preload all the eager static imports. Meaning the dependencies are kept somewhere. WMF turns them into Promise.all like this:

image

We can extract them with webpack-plugin somehow and attach as dependencies to their async boundaries chunks. @ScriptedAlchemy can probably point us in the right direction - how to properly pinpoint eager static imports binding.

Maybe we could patch webpack to add eager import dependencies to compilation.stats? If it is not already possible by mapping chunk children to origin[].moduleId starting with "webpack/sharing/consume", like:

image

Problem 2. Top level async boundary - bootstrap is not flushed. Loadable is meant to process react components, there were no need to support regular imports. So we can't just loadable(() => import('./bootstrap')) if there is no react tree rendering, we can't extract the chunk.

Solution: basically "bootstrap" or top level async boundary is a synchronous pattern, there should be no "conditional" rendering to it compared to loadable components, thus we can just tell loadable to treat that chunk as one of those always required dependencies. To do so, I suggest introducing loadable.bootstrap(() => import('./bootstrap'))


So far all of this seems doable.

--- edit 1

Spent whole day, trying to build some dependency graph for shared eager module from compilation.stats.chunk.origins, then from raw compilation.stats.modules, but I believe that is impossible, unless, well parsing module "webpack/sharing/consume/**" nested reason->children fields to find dependencies.

ScriptedAlchemy commented 2 years ago

I parse the MF plugin options, so i know everything about that plugin and if its shared or not.

I track what's loaded per request, so i dont load anything additional other than the execution tree. Preloading only 3-4 files that were actually used.

For vendors, in webpack they have a shared module type, so i can look at the stats of graph and find all the ids, chunks they live in, how its referenced.

This is a large portion of the magic that makes it work in next.js - might be some applications here, at least conceptually. The module id is a placeholder id, so you have to look past that key and know what's the chunk associated with it.

Async import bootstrap also solved many issues, but you'd still want to flush chunks out of SSR for maximum performance

https://gist.github.com/ScriptedAlchemy/7c1c7b25665524fbb0dfb4f06db7ebff

ScriptedAlchemy commented 2 years ago

These shared modules, you'd want to know what came from where. I do this by tracking the initialization scopes which include what, and from who.

IMalyugin commented 2 years ago

That's actually great news, meaning I'm on the right track. Or even on the right paved road!

I parse the MF plugin options, so i know everything about that plugin and if its shared or not.

Got that part from webpack stats as well, it has all the info about consumes and provides

I track what's loaded per request, so i dont load anything additional other than the execution tree. Preloading only 3-4 files that were actually used.

That's the magic done by loadable, don't really plan to take that bread, loadable tracks rendered components very well, only need to figure out how to let it track async initialization.

For vendors, in webpack they have a shared module type, so i can look at the stats of graph and find all the ids, chunks they live in, how its referenced.

That's what I spent whole yesterday doing, and only by the end of the day discovered there is a "reason" section inside webpack/sharing/consume "modules", that tells who uses what.

This is a large portion of the magic that makes it work in next.js - might be some applications here, at least conceptually. The module id is a placeholder id, so you have to look past that key and know what's the chunk associated with it.

Yep, that's the info I also got from the webpack.stats

ScriptedAlchemy commented 2 years ago

Parsing reasons is pretty common.

Another thing to watch for is if it's shared code, you want to put it in the right parents in the map. So like react, I'd push its chunk into the array of files needed at entry point level. Otherwise things could double load.

IMalyugin commented 2 years ago

Parsing reasons is pretty common.

Another thing to watch for is if it's shared code, you want to put it in the right parents in the map. So like react, I'd push its chunk into the array of files needed at entry point level. Otherwise things could double load.

Did not reach that part yet.

I planned to extract shared provides and shared consumes by chunks, and put them into loadable-stats. After that find a way to require shared without duplication. That approach will also simplify mf orchestration, since I can join multiple loadable-stats and dedupe shared even further.

IMalyugin commented 2 years ago

Okay, been off the radar for a week doing other things, but now I'm back.

Finished working on Problem 1.: extracting WMF shared consumes and attaching them to chunks. Now chunks section is extended with sharedConsumes:

image

That in turn are listed in sharedModules section. All the data is extracted purely from stats.json via loadable/webpack-plugin

image

(Updated PR: https://github.com/gregberge/loadable-components/pull/885)

The next step is Problem 2: Telling loadable to extract async modules between entrypoint and App. As soon as I started thinking about that problem I've hit a multiple choice decision.

Prerequisites:

  1. During chunk flushing, we tell wmf to use an entrypoint;
  2. Chunks connected to react components wrapped by loadable are gathered during Server-Side Rendering;
  3. Between 1 and 2 there is an async boundary: export default import('./bootstrap'), needed for WMF to collect eager depencendies;

So, there are at least 3 possible ways to proceed:


Choice 1: Using webpack stats, I can find all the async chunks between entrypoint and every loadable. Assuming the application only has one path from entrypoint, that would allow gathering these async chunks automatically with no extra boiler code; Problems: what happens if there are more than one conditional path between? And what to consider the application root? Possible Solutions: We could hypothetically throw a warning that there is more than one possible import path, so extra chunks are flushed and fix. The first chunk registered by ChunkExtractor can be considered the root chunk.


Choice 2: I could add loadable.bootstrap(() => import('./bootstrap')), this way we know exactly which path to take, or do we? Problems: client and server usually have different entry points. So the bootstrap call flushed on the server will not match the call on the client. Possible Solutions: It's ugly, but we could attach a key argument for bootstrap call, in case they don`t match (and if there is more than one bootstrap in the entrypoint).


Choice 3: We could have a loadable wrapper at application root, this way we would definitely know which chunk contains the root of application. This solution does not involve much changing to loadable mechanics, since ChunkExtractor can already work well with react components Problems: This approach will not work if there are multiple async boundaries between entrypoint and application root. Possible Solutions: Restrict making multiple boundaries? Combine with solutions for 1/2?


I donā€™t like choice 1 as it becomes uncontrollable and unpredictable. Choice 3 has a rather bad limitation.

So I favor the 2nd choice. We could also combine 2 and 1. Aka use bootstrap to tie down entrypoint with application root. But build the path from entrypoint to the root via webpack stats, following available client-bundle bootstraps.

IMalyugin commented 2 years ago

Did some analyzing of @loadable/babel-plugin contents and to be honest, I'm lost. From the way it is currently implemented, I am not sure what needs to be done next.

If we agree choice 2 is is the optimal way, we just need to add new loadable.bootstrap function. That function would have to be transformed towards the same path loadable components operate. Thing is, I don't need to actually change the async chunk with a component, not even sure I need to meddle with the chunk name. But I do need to somehow mark that import or bind it to the the react app part. And that binding has to work even if there are no loadable components on the other side.

Tinkering with babel-plugin AST is complicated enough as it is, not to go around experimenting, trying to figure out what has to be done.

One more thing: I don't want to turn loadable-components into imported-components, so these async boundary chunks must be applied during render, not import. Or as other choices state, we can use static chunk tree analysis instead.

IMalyugin commented 2 years ago

So, there are at least 3 possible ways to proceed:

Choice 1: Using webpack stats, I can find all the async chunks between entrypoint and every loadable. Assuming the application only has one path from entrypoint, that would allow gathering these async chunks automatically with no extra boiler code; Problems: what happens if there are more than one conditional path between? And what to consider the application root? Possible Solutions: We could hypothetically throw a warning that there is more than one possible import path, so extra chunks are flushed and fix. The first chunk registered by ChunkExtractor can be considered the root chunk.

Choice 2: I could add loadable.bootstrap(() => import('./bootstrap')), this way we know exactly which path to take, or do we? Problems: client and server usually have different entry points. So the bootstrap call flushed on the server will not match the call on the client. Possible Solutions: It's ugly, but we could attach a key argument for bootstrap call, in case they don`t match (and if there is more than one bootstrap in the entrypoint).

Choice 3: We could have a loadable wrapper at application root, this way we would definitely know which chunk contains the root of application. This solution does not involve much changing to loadable mechanics, since ChunkExtractor can already work well with react components Problems: This approach will not work if there are multiple async boundaries between entrypoint and application root. Possible Solutions: Restrict making multiple boundaries? Combine with solutions for 1/2?

I donā€™t like choice 1 as it becomes uncontrollable and unpredictable. Choice 3 has a rather bad limitation.

So I favor the 2nd choice. We could also combine 2 and 1. Aka use bootstrap to tie down entrypoint with application root. But build the path from entrypoint to the root via webpack stats, following available client-bundle bootstraps.

Covering edge cases, I got the following results:

  1. Choice 1. Will NOT work. If we have an app that does not consume any loadable ssr components, but still use wmf shared and async boundary, we will not be able to determine find all the async boundaries between App and entry, because we would have no loadable calls within App;
  2. The above is handled by choice 3 properly, but there is still the limitations of multiple async points. For now the only solution I see is choice 2. Naming async boundaries would both let us support complex cases such as different bootstrap path for client and server and give toolset for manual manipulation.

Going deeper into the current goals for choice 2 solution:

  1. Our main goal is to capture the imported chunk names for all the bootstrap calls;
  2. To capture it during runtime and not compile time, we can proxy import's thenable. When 'then' is called, we add the chunk to the ChunkExtractor;
  3. Proxied then call happens between ChunkExtractor initialization and collectChunks call;
  4. We might require multiple bootstrap calls for client for every bootstrap call on the server and vice versa;
  5. Loadable keeps deterministic naming for all the chunks, in our case, we should give bootstraps special names based on both given key and path. If key exists, we'll be matching part of the chunkName representing the key, otherwise, we check against the path;

So from what I see, we can do something like: LoadableBootstrap__src_components_index_jsx for path-related keys and "LoadableBootstrap_some_key__src_components_index_jsx" for keys. Alternatively, I'm not sure if we need to rename chunks at all. Why can't we just gather them during thenable call with their given names/keys. It's not like we actually need to do the loadable magic for sync/async call later on, they are only needed for the flushing.

I'll spend some more time clearing out why's and how's chunkName replacement works in loadable/babel-plugin and then try to implement that bootstrap utility.

IMalyugin commented 2 years ago

Got some great news:

  1. I was able to implement the loadable.bootstrap function. Funny enough the function contains but a warning that it should not be called without babel plugin. Babel plugin replaces the call entirely with an iife that returns thenable. Once that thenable is resolved it both pushes the bootstrap chunk to global object stack and calls the async import; Thenable allows us to create onDemand bootstrap. It will only be registered if it is resolved.
  2. Having loadable.bootstrap completely replaced fixes the "chicken-and-egg" problem, initial chunk no longer depends on loadable nor react, so we can separate loadable and react into WMF shared dependencies entirely;
  3. I did some analysis on babel-plugin chunk name generation and bumped into the issue which apparently was already raised before: https://github.com/gregberge/loadable-components/issues/549. Posted a simple solution that will fix the collision, also with some work the very same solution may fix collision between different mfs;

What's left to do:

  1. The working loadable.bootstrap tells that a bootstrap chunk is loaded, but there are still no ways to merge them, since there can be multiple bootstrap chunks SSR and CSR. Furthermore, we can't call these bootstraps passing same variable and expect them to match, unless we just pass string literal with bootstrap name;
  2. ChunkExtractor will get an update supporting promise of react-component. This will allow us to flush and rewind bootstrap chunks between renders. There is also a large work related to user-error solving, aka if instead of passing promise to chunkExtractor, developer resolves it prematurely and then passes down the component, registered bootstrap chunks should throw an error;
  3. ChunkExtractor should also learn working with sharedModules that loadable config now provides, replicating WMF shared;
  4. The naming issue above requires resolving;
  5. Need to enhance loadable SSR toolkit to allow joining multiple loadable-stats into a single united stats, so that chunk mechanics works out of the box without external scripts and plugins;
  6. Documentation, optimization, tests, examples, etc.

--- edit 1 Connecting bootstrap from SSR with CSR is not as simple as I thought. For now, I only see the following solutions: a. Set predefined name e.g.: "bootstrap-loadable" for bootstrap chunk through babel-plugin. For more complicated cases, allow manually adding webpackChunkName, so that multiple bootstraps see each other. b. By default assume that both server and client use the same bootstrap call, if that is not true, force webpackChunkName usage; c. This is rather complicated, but instead of using webpackChunkName, we could add a parameter to the import, there is no problem with parsing string literal, but if that parameter is imported from another module, we could try using static analysis;

Solution b. seems most obvious, as it does not obstruct the normal chunk generation flow and uses same mechanics applied by other parts of loadable. Guess I'll start from here.

--- edit 2 The scope keeps getting bigger every time I do something. Figured out that If I gather bootstrap chunks by calling global function, that would create conflict if multiple SSR are running in the same scope, they could potentially interfere with one another. To fix that, I'd have to use node async context, and if it is not supported by the environment, create a mutex (only allow the async execution to complete one at a time.

IMalyugin commented 2 years ago

Finally finished the async bootstrap chunk registration, It was harder than I thought and required me to do a little hack with unused export const __LOADABLE_MARKER_BOOTSTRAP__${chunkName} = true; marker. I decided to keep things simple: no dynamic bootstrap chunk, if you call loadable.bootstrap(() => import('./path/to/bootstrap')), that chunk will always be bundled with the chunk that imported it.

Now switched to enhancing ChunkExtractor, to make it properly use loadable stats for registering both bootstrap- and wmf shared chunks.


@theKashey ever thought about rewriting loadable-stats, so that it only contained fields required by loadable? We could reduce the load on ChunkExtractor and remove a lot of unnecessary data in the stats file without breaking backwards compatibility (except for those, who process stats manually outside ChunkExtractor). If somebody needs some extra data in loadable-stats, they should opt-in for them via configuration.

Also quite literally holding myself back from rewriting ChunkExtractor towards respect for SingleResponsibility.

theKashey commented 2 years ago

ever thought about rewriting loadable-stats, so that it only contained fields required by loadable?

I also hold the same idea for a long time - https://github.com/gregberge/loadable-components/pull/778#issuecomment-849165455

It was "reduced" multiple times:

And increased as well:

But we are subject for hurim law and should just create another version of Chunk Extractor with the another "API"/"Goal" set initially ( well, https://github.com/theKashey/webpack-imported )


finished the async bootstrap chunk registration, It was harder than I thought

And probably the same can and should be applied to the loadable-component themselves. I don't feel that the current model is actually clicking with Federation and one might need to find another model. For example create that new ChunkExtractor and make it a part of ModuleFederation - ie let servers share this information, not to make it somehow discoverable on the frontend.

No changes to the MF is required. "Loadable" should just discover federated modules(you did it) and then discover "Loadable" inside them. Repeat until success.

IMalyugin commented 2 years ago

But we are subject for hurim law and should just create another version of Chunk Extractor with the another "API"/"Goal" set initially ( well, https://github.com/theKashey/webpack-imported )

It feels overwhelmingly wrong to create another plugin, just to allow loadable to simplify stats. It also feels wrong to create 2 stats files by loadable. So what I had in mind, was major release, with a new flag that allows for minimum stats, eventually switching that flag defaulting to minimal stats.

For now, It feels I'm violating pioneers rule, by adding 2 new methods to already overbloated ChunkExtractor class, I'm leaving something in worse state then it was. Guess that's what open source is like, your every step feels like walking on a mine field :)

There is no problem with current loadable-stats config in terms of data sufficiency. But we could optimize a lot of things on the way. Welp, guess not this iteration.

theKashey commented 2 years ago

It feels overwhelmingly wrong to create another plugin

Unfortunately, stats generated by loadable as over and misused already. Releasing a major version of plugin and introducing the breaking change in the API might break some integrations.

So this can be done only with recommended replacement, a clear migration guide, and the minimal API surface from day 1, as switching that flag defaulting to minimal stats will introduce another breaking change.


Thus I want to ask you - what exactly you need to change? What if in any circumstances the best way forward is creating a special MF-friendly version of ChunkExtractor and Loadable.bootstrap code? Do anyone not using MF will benefit from the change or it's actually better to keep things detangled?

IMalyugin commented 2 years ago

Let's move that discussion to the moment when initial implementation is done in pull request. Need to focus on the task first, then we'll see if some suggestions are worth doing.

As for your questions:

  1. bootstrap will be usable for people not using mf, as it allows binding multiple asynchronous chunks together, some task may require this;
  2. Cleaning up loadable stats size would allow us to reduce size by 2 to 4 times, could be useful to others too;
IMalyugin commented 2 years ago

Did not abandon the project :) Just got my hands busy elsewhere. Got a bit of a progress before I had to switch. Both positive and negative.

First for the bad parts:

  1. Tried out production build, as all the moduleIds got replaced with numeric values, some of the parts stopped working, especially all the checks against federated id's; rewrote a couple of checks to another field (aka identifier), but one of the checks, "isFederated" could not be rewritten. It happens during runtime, but we do not have any info on other ids at that time. So unfortunately, to detect federated chunks, the schema has to be even more complicated: got to collect all the federated chunks during webpack build and output them into the client;
  2. Working with shared wmf dependencies also proved more difficult, than I thought. Instead of blindly collecting all the related chunks, I found out that inside webpack stats chunks: [...] arrays within modules represent all the chunks that include a copy of the module. So now we have to thoroughly analyze which chunks are already accumulated and ignore them. To add another nail, the common chunks must actually match the ones added during server build;
  3. Since we need to match some "unknown" chunk inclusion algorithm, we need to render single entrypoint on the client (without extra preloads), see what WMF includes on it's own. Save that order and then compare it against the order we have with LOADABLE_REQUIRED scripts.
  4. Having this in mind, I decided to go for TDD: added jest-puppeteer, configured it out a bit, ran into a trouble with conflicting configurations, started resolving against jsdom that was already used, till I figured out that jsdom is a better fitting solution for the task. And it actually works well with WMF client-side, I got the scripts that WMF loads, just need to clean up my prototype and turn it into an example inside loadable repository;
  5. And the final thumb down goes with remoteEntry.js, analyzing webpack stats, shared dependencies and files WMF loads, It seems remoteEntry.js is out of all those pictures, so we probably have to require it manually just before requiring any of the remote chunks;

To sumarize, the goal yet again got pushed away by a few weeks of development :)

bannik commented 2 years ago

Any progress on this? Seems to break razzle builds (hash mismatch on production build)

Unfortunately, I just spent half of my day realizing that this happens when you use webpack 5 or above. Once you downgrade back to webpack 4 it functions normally.

fivethreeo commented 2 years ago

Razzle is quite unstable with wp5, needs help with rewrite btw ;)

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

ScriptedAlchemy commented 2 years ago

How I got this to work for next.js was to embed the loadable stats into the remote entryā€™s runtime. So I have get,init,chunks - then at runtime Iā€™ll collect the chunks from each remote that itā€™s exporting, then merge them with the known loadable manifest.

For module ids. I didnā€™t have to worry about that, they were registered as the ids from the other container and merging the manifests at runtime provided the lookups I needed.

Ontop of that, Iā€™ve also changed how remote module is loaded, so when a remote modules get or init apis are called, Iā€™ve got hooks into that and can see ā€œwho loaded whatā€ so as itā€™s executing webpack is pushing requests into a queue that I can flush after render by getting the remotes chunk maps with global.remote.chunkMap

IMalyugin commented 2 years ago

The hook "who loaded what" seems like some new webpack event? Gave this whole thing a little thought and what's really holding the implementaion back - is WMF awesome async boundary, that just loads whatever it needs (hardcoded as Promise.all in resulting bundle). If that thing could be exposed, it would make things much easier.

Also I don't see how that could be done during SSR (executing webpack), as the eager chunks are not the same for csr and ssr.