overlookmotel / livepack

Serialize live running code to Javascript
MIT License

Allow deferring import and evaluation to runtime #131

Open overlookmotel opened 3 years ago

overlookmotel commented 3 years ago

The problem

Currently Livepack is not really usable for client-side code.

The main issue is client-side libraries' use of globals.

This comes in 2 varieties:

  1. Browser-only globals e.g. window, document
  2. Javascript globals which may or may not be supported in the browser e.g. Symbol

For example, some libraries contain top-level code like:

const isBrowser = typeof window !== 'undefined';

React contains the following top-level code:

if ( typeof Symbol === 'function' && Symbol.for ) { /* ... */ }

And React DOM includes top-level code:

var canUseDOM = !!(
  typeof window !== 'undefined'
  && typeof window.document !== 'undefined'
  && typeof window.document.createElement !== 'undefined'
);

var skipSelectionChangeEvent = canUseDOM && 'documentMode' in document
  && document.documentMode <= 11;
var PossiblyWeakMap$1 = typeof WeakMap === 'function' ? WeakMap : Map;

If this code is run on the server before serializing:

  1. window and document do not exist so canUseDOM is false.
  2. WeakMap and Symbol do exist on Node, but may not in the browser.

When bundled by Livepack, based on the evaluation of code as it was on the server, the app may not run correctly in the browser.

An additional problem is that Livepack's output from serializing ReactDOM is about 50% larger than ReactDOM's original code.

Ideal solution

Livepack could identify globals in code. Such values would be considered "uncertain". Any other values which were calculated based on uncertain values would also be considered uncertain. Any "uncertain" values would not be serialized as is, but instead the code which created them would be included in the output to run at runtime. All other values would be serialized as usual.

This would be very complicated to implement. Tracing "uncertainty" throughout program flow would be complex. There's also the problem of dealing with the case where the code run to produce "uncertain" values has side-effects.
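The propagation rule described above can be sketched as a fixed-point computation over a dependency graph. This is only an illustration: the `findUncertain` name, the `global:` prefix convention, and the assumption that Livepack could obtain such a graph at all are hypothetical, and the sketch ignores side effects entirely.

```javascript
// Hypothetical sketch: `deps` maps each value name to the names it was computed from.
// Names starting with 'global:' stand for globals such as `window` or `WeakMap`.
function findUncertain(deps) {
  const uncertain = new Set();
  let changed = true;
  // Repeat until a full pass finds no new uncertain values (fixed point).
  while (changed) {
    changed = false;
    for (const [name, inputs] of Object.entries(deps)) {
      if (uncertain.has(name)) continue;
      const tainted = inputs.some(
        (input) => input.startsWith('global:') || uncertain.has(input)
      );
      if (tainted) {
        uncertain.add(name);
        changed = true;
      }
    }
  }
  return uncertain;
}

// Dependency graph loosely modelled on the ReactDOM snippet earlier in this issue.
const deps = {
  canUseDOM: ['global:window'],
  skipSelectionChangeEvent: ['canUseDOM', 'global:document'],
  PossiblyWeakMap: ['global:WeakMap', 'global:Map'],
  greeting: []
};
const uncertain = findUncertain(deps);
// `greeting` depends on no globals, so it would be serialized as usual.
```

Computing the set is the easy part. Building the graph from live values is where the complexity lies.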

I'm not sure this problem is entirely solvable. Prepack got tripped up on this kind of problem (see this comment).

If it is possible, this would be the best solution as it would require no intervention from the user, and would maintain the advantages of Livepack's code splitting and tree-shaking.

Simpler solution

A solution which is less ergonomic but more easily implemented in the short term is to allow the user to flag some values to be imported or evaluated at run time.

Three potential APIs:

deferImport()

import React from 'react';
import { deferImport } from 'livepack';
deferImport( React, 'react' );

export default function App() {
  return React.createElement( 'div', {}, 'Hello!' );
}

deferImport( React, 'react' ) would cause Livepack to serialize React as an import statement rather than serializing it as a value in the usual way.

// Output
import React from 'react';
export default function App() {
  return React.createElement( 'div', {}, 'Hello!' );
}

A 3rd argument would specify the import name:

// Input
import { useState } from 'react';
import { deferImport } from 'livepack';
deferImport( useState, 'react', 'useState' );

export default function useName() {
  const [name] = useState('Burt');
  return name;
}
// Output
import { useState } from 'react';
export default function useName() {
  const [name] = useState('Burt');
  return name;
}

defer()

defer() runs a function in Node at build time and returns its result, but Livepack serializes that result as an immediately-invoked call of the provided function, so the function runs again at runtime.

// Input
import { defer } from 'livepack';
const obj = defer( () => ( { x: 1 } ) );
// obj === { x: 1 }
export default obj;
// Output
export default ( () => ( { x: 1 } ) )();
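One way defer() might work internally, as a rough sketch: record which function produced each value, and have the serializer emit a call to that function instead of the value. The `deferredSources` registry and the toy `serializeValue()` are assumptions for illustration, not Livepack's actual internals.

```javascript
// Hypothetical registry: maps a deferred value to the function that created it.
const deferredSources = new WeakMap();

function defer(fn) {
  const value = fn(); // Runs at build time, so app code gets a real value
  // WeakMap keys must be objects or functions, so primitives are not tracked here.
  const isObjectLike = (typeof value === 'object' && value !== null)
    || typeof value === 'function';
  if (isObjectLike) deferredSources.set(value, fn);
  return value;
}

// Toy stand-in for the serializer: emits an IIFE for deferred values.
// `fn.toString()` only captures source text; a real serializer would also
// have to handle variables the function closes over.
function serializeValue(value) {
  const fn = deferredSources.get(value);
  if (fn) return `(${fn.toString()})()`;
  return JSON.stringify(value);
}

const obj = defer(() => ({ x: 1 }));
const output = serializeValue(obj);
```

Because the registry is keyed by object identity, this sketch cannot track primitives returned from defer().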

defer() could be used for React's static methods which are typically used at top level e.g. React.createContext().

React.createContext( { count: 0 } ) returns an object:

{
  $$typeof: Symbol(react.context),
  _calculateChangedBits: null,
  _currentValue: { count: 0 },
  _currentValue2: { count: 0 },
  _threadCount: 0,
  Provider: { '$$typeof': Symbol(react.provider), _context: [Circular *1] },
  Consumer: [Circular *1]
}

If React is being imported at runtime, this object will not work with the imported React. The Symbols in the object may not equal the Symbols used in the imported React.

defer() would solve this by running createContext() at runtime:

// Input
import React from 'react';
import { defer, deferImport } from 'livepack';
deferImport( React, 'react' );

const context = defer( () => React.createContext( { count: 0 } ) );
// context === { $$typeof: Symbol(react.context), ... }
export default context;
// Output
import React from 'react';
export default ( () => React.createContext( { count: 0 } ) )();

You could create a re-usable React wrapper which is evaluated at runtime:

import React from 'react';
import { defer, deferImport } from 'livepack';

deferImport( React, 'react' );

const createContext = (...args) => defer( () => React.createContext(...args) );
const lazy = (...args) => defer( () => React.lazy(...args) );
// ... etc ...

const { useState, useEffect } = React;
deferImport( useState, 'react', 'useState' );
deferImport( useEffect, 'react', 'useEffect' );
// ... etc ...

export { createContext, lazy, useState, useEffect, /* ... etc ... */ };

NB useState and useEffect do not need to be deferred as they're only used inside functions, not top level.

deferred()

Sugar for defer() where value being deferred is a function. These two are equivalent:

const createElement = (...args) => defer( () => React.createElement(...args) );
const createElement = deferred( React.createElement );

deferred( fn ) would defer evaluation of values returned by the function. Implementation:

function deferred( fn ) {
  return (...args) => defer( () => fn(...args) );
}

Specify deferred imports at serialization time

serialize( val, { deferModules: ['react'] } )

Livepack would need to hook require() to record the values returned for all calls to require(). When serialize() is called with deferModules option, Livepack would look up the value that require('react') returned, and serialize it as import React from 'react'.
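A minimal sketch of the recording step, assuming a wrapper around any require-like function. `createRecordingRequire` and `getDeferredValues` are hypothetical names, and a fake module table stands in for Node's loader so the example is self-contained.

```javascript
// Hypothetical helper: wraps a require-like function and records what it returns.
function createRecordingRequire(baseRequire) {
  const exportsByName = new Map();
  function recordingRequire(name) {
    const exported = baseRequire(name);
    exportsByName.set(name, exported);
    return exported;
  }
  return { recordingRequire, exportsByName };
}

// Stand-in for Node's module loader (assumption for this sketch).
const fakeModules = { react: { createElement() {} } };
const { recordingRequire, exportsByName } = createRecordingRequire(
  (name) => fakeModules[name]
);

const React = recordingRequire('react');

// At serialize time, `deferModules: ['react']` could be turned into a
// lookup from value to module name.
function getDeferredValues(deferModules) {
  return new Map(deferModules.map((name) => [exportsByName.get(name), name]));
}
const deferredValues = getDeferredValues(['react']);
```

When the serializer encounters a value present in `deferredValues`, it would emit an import for the recorded module name rather than serializing the value.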

In most cases, you'd also want the whole tree of React's object properties crawled and also serialized as imports. So const c = require('react').createElement is serialized as import react from 'react'; const c = react.createElement.
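The crawl could look roughly like the following breadth-first walk, which maps every object or function reachable from the module's exports to a property path. `crawlExports` is a hypothetical name, and the sketch only follows enumerable string-keyed properties.

```javascript
// Hypothetical sketch: map each reachable object/function to an access path.
function crawlExports(root, rootName) {
  const paths = new Map([[root, rootName]]);
  const queue = [root];
  while (queue.length > 0) {
    const parent = queue.shift();
    for (const key of Object.keys(parent)) {
      const child = parent[key]; // Note: this would trigger getters
      const isObjectLike = (typeof child === 'object' && child !== null)
        || typeof child === 'function';
      // Skip primitives, and anything already seen (handles cycles).
      if (!isObjectLike || paths.has(child)) continue;
      paths.set(child, `${paths.get(parent)}.${key}`);
      queue.push(child);
    }
  }
  return paths;
}

// Fake module object standing in for `require('react')`.
const fakeReact = { createElement() {}, Children: { map() {} } };
fakeReact.default = fakeReact; // Circular reference, as some packages have
const paths = crawlExports(fakeReact, 'react');
```

Breadth-first order means each value is recorded with its shortest path from the module root.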

This is a bit trickier to implement.

Has advantage of allowing user to call serialize() twice with different options to create (1) client-side build with deferred import/evaluation and (2) self-contained server-side build (for server-side rendering) without deferred evaluation. Deferred evaluation is only required for client-side builds, as runtime and build-time environments should be the same for server-side code.

Bundling externals

As described above, the Livepack build would include "externals". The bundle would no longer include all code from the app - there are import statements referring to node_modules. So Livepack's output would need bundling with e.g. Snowpack / Webpack / ESBuild to produce a final self-contained bundle.

That's probably OK for starters, but it'd be better if Livepack traced the dependencies of any deferred modules and added them to the bundle too. This is into the realm of "normal" bundlers though, so no doubt quite a bit of effort.

overlookmotel commented 3 years ago

Actually, bundling externals would not be too hard to implement.

For deferImport( foo, 'foo' ), during serialization of foo:

The runtime mentioned above would execute the function created from the file's code and return module.exports. This is ESBuild's runtime (see here):

const __commonJS = (callback, module) => () => {
  if (!module) {
    module = {exports: {}};
    callback(module.exports, module);
  }
  return module.exports;
};

Then:

// `wrapped` is the function created from the file's code
const requireFoo = __commonJS( wrapped );

By using function scopes, Livepack's existing mechanisms for dealing with circularity can be utilized to deal with circular requires. The only tricky part is that any assignments which inject circular values into scope would need to be executed before the require...() function is called. Usually assignments would go at the bottom of the output, but that wouldn't work in this case.

An option to serialize() could specify a resolver function, so that e.g. the browser field in package.json can be used for resolution in a client-side build (Axios, for example, uses this field).
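As a sketch of what such a resolver might do, the function below applies the object form of a package.json browser field, which maps individual files to browser replacements. The function name, its signature, and the example package are all assumptions. How Livepack would actually invoke a resolver is not specified here, and the string form of the field is not handled.

```javascript
// Hypothetical resolver step: swap a file for its browser replacement, if any.
// `pkg` is the parsed package.json of the package that contains `file`.
function applyBrowserField(pkg, file, target) {
  const mapping = pkg.browser;
  if (target !== 'browser' || typeof mapping !== 'object' || mapping === null) {
    return file;
  }
  // A replacement of `false` conventionally means "omit this module";
  // this sketch just passes it through for the caller to handle.
  return Object.prototype.hasOwnProperty.call(mapping, file) ? mapping[file] : file;
}

// Made-up package.json for illustration.
const examplePkg = {
  name: 'example-http-client',
  main: 'index.js',
  browser: { './lib/adapters/http.js': './lib/adapters/xhr.js' }
};

const forBrowser = applyBrowserField(examplePkg, './lib/adapters/http.js', 'browser');
const forNode = applyBrowserField(examplePkg, './lib/adapters/http.js', 'node');
```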

overlookmotel commented 3 years ago

Alternative solution - replace at serialize time

// src/index.js
import { createElement } from 'react';
export default function App() {
  return createElement( 'div', {}, 'Hello!' );
}

// build.js
import { serialize, deferredImport } from 'livepack';
import React from 'react';
import App from './src/index.js';

const deferredReact = deferredImport('react');
const js = serialize( App, {
  replace: [
    [ React, deferredReact ],
    [ React.createElement, deferredReact.createElement ]
  ]
} );

The first replacement tells Livepack to serialize React as import React from 'react' (i.e. import at runtime). The 2nd would serialize createElement as import React from 'react'; React.createElement.

replace tells Livepack to serialize the first value as its replacement.

const o = { x: 1 };
const js = serialize( o, {
  replace: [
    [ o, { y: 2 } ]
  ]
} );
// js === '{y:2}'

deferredImport() records its return value inside Livepack. When Livepack is asked to serialize that value, it recognises it as a deferred import, and serializes it as an import statement.

deferredImport() would return a Proxy which returns further Proxies for property accesses / function calls. So deferredImport('react').createElement produces another Proxy which records how to access this object from the deferred import.
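A minimal sketch of the Proxy idea, assuming the access path is recorded as a code string. The `deferredExpressions` registry, the `imported` variable name, and `describeDeferred()` are hypothetical, and call arguments are rendered with JSON.stringify() purely to keep the example short.

```javascript
// Hypothetical registry: maps each Proxy to the code that reaches it at runtime.
const deferredExpressions = new WeakMap();

function createDeferred(moduleName, expression) {
  // An arrow function target makes the Proxy callable, so `apply` can fire.
  const proxy = new Proxy(() => {}, {
    get(target, key) {
      if (typeof key !== 'string') return undefined; // Ignore Symbol lookups here
      return createDeferred(moduleName, `${expression}.${key}`);
    },
    apply(target, thisArg, args) {
      // Assumes JSON-serializable arguments; a real serializer would handle any value.
      const argsCode = args.map((arg) => JSON.stringify(arg)).join(', ');
      return createDeferred(moduleName, `${expression}(${argsCode})`);
    }
  });
  deferredExpressions.set(proxy, { moduleName, expression });
  return proxy;
}

function deferredImport(moduleName) {
  return createDeferred(moduleName, 'imported');
}

// Toy stand-in for what the serializer would emit for a deferred value.
function describeDeferred(value) {
  const record = deferredExpressions.get(value);
  if (!record) return undefined;
  return `import imported from ${JSON.stringify(record.moduleName)}; ${record.expression}`;
}

const deferredReact = deferredImport('react');
const contextCode = describeDeferred(deferredReact.createContext({ count: 0 }));
```

In this sketch every property access creates a fresh Proxy, so `deferredReact.createElement !== deferredReact.createElement`. That is fine for use as a replacement value, but a fuller version might cache Proxies per path.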

The createContext() / lazy() case can be handled with a shim for React which is imported in app code instead of 'react'.

// src/react-shim.js
import React from 'react';
import { deferredImport } from 'livepack';
const { createElement, createContext: _createContext } = React;

const contexts = [];
const createContext = (value) => {
  const context = _createContext(value);
  contexts.push( { context, value } );
  return context;
};

// Deal with `import * as React from './react-shim.js'`
import * as reactShim from './react-shim.js';

function _getReplacements() {
  const deferredReact = deferredImport('react');
  return [
    [ reactShim, deferredReact ],
    [ createElement, deferredReact.createElement ],
    ...contexts.map(
      ( { context, value } ) => [ context, deferredReact.createContext(value) ]
    )
  ];
}

export { createElement, createContext, _getReplacements };

Then serialization:

// build.js
import { serialize, deferredImport } from 'livepack';
import { _getReplacements } from './src/react-shim.js';
import App from './src/index.js';

const js = serialize( App, {
  replace: _getReplacements()
} );

Disadvantages of this approach:

Advantages:

The latter is the really big gain.

defer()

defer() method would no longer be necessary for this use case, but could still be useful for user code where the code to create an object/array is shorter than the serialized version of the object.

e.g. const arrayOfLongStrings = [1, 2, 3].map( n => 'x'.repeat(n * 1000) ) would be better written as const arrayOfLongStrings = defer( () => [1, 2, 3].map( n => 'x'.repeat(n * 1000) ) ).

Question: Should defer() return a Proxy, same as deferredImport? Then it could handle primitives too:

const isBrowser = defer( () => typeof window !== 'undefined' );

replace option alternative forms

replace option could also have a functional form:

const o = { x: 1 };
const js = serialize( o, {
  replace(v) {
    if (v === o) return {y: 2};
    return v;
  }
} );
// js === '{y:2}'

or be provided as a Map or WeakMap (which is what Livepack will turn it into internally anyway):

const o = { x: 1 };
const replace = new Map();
replace.set( o, {y: 2} );
serialize( o, { replace } );

The WeakMap form might make react-shim more efficient as any contexts which are created but aren't used could be garbage-collected.
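The three forms of the replace option described above could be normalized into a single lookup function along these lines. `normalizeReplace` is a hypothetical name, and this is a sketch of one possible approach rather than how Livepack would necessarily do it.

```javascript
// Hypothetical sketch: accept array, function, Map or WeakMap forms of `replace`
// and return a function mapping a value to its replacement (or itself).
function normalizeReplace(replace) {
  if (replace === undefined) return (value) => value;
  if (typeof replace === 'function') return replace;
  // An array of [ value, replacement ] pairs becomes a Map.
  const map = Array.isArray(replace) ? new Map(replace) : replace;
  // Map and WeakMap both provide `has()` and `get()`.
  // `WeakMap.prototype.has()` returns false for primitives rather than throwing.
  return (value) => (map.has(value) ? map.get(value) : value);
}

const original = { x: 1 };
const replacement = { y: 2 };

const fromArray = normalizeReplace([[original, replacement]]);
const fromWeakMap = normalizeReplace(new WeakMap([[original, replacement]]));
const fromFunction = normalizeReplace((v) => (v === original ? replacement : v));
```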

Open questions

overlookmotel commented 3 years ago

Question: When bundling a deferred import, should code from require()-ed files within the deferred import be bundled too? Or serialized as usual?

React contains require('object-assign') and ReactDOM also contains require('scheduler'). As it happens, both of these need to be deferred imports too, as they access browser globals. But that might not have been the case.

How about a module like this:

import { last } from 'lodash';
const isBrowser = typeof window !== 'undefined';
export default [ isBrowser, last ];

This module needs to be a deferred import due to use of window. However, lodash is quite possibly used in the rest of the app, and lodash.last could be safely shared between normal app code and the deferred import.

Two options:

Option 1

There are two "realms":

  1. Values that Livepack serializes normally
  2. Deferred imports

The two realms would have no crossover. In the example above, lodash.last would appear in the output twice if it's also used in the main app code.

Option 2

Resolve require() calls in a deferred import to their values, and serialize those values normally. In the example above, lodash.last would only appear in output once - a gain.

This does assume that objects shared between realms are static/stateless. Otherwise, changes made to the object in one realm could clash with the other.