dgp1130 / rules_prerender

A Bazel rule set for prerending HTML pages.
13 stars 0 forks source link

Support / require ES modules #33

Closed dgp1130 closed 1 year ago

dgp1130 commented 3 years ago

ES modules are the future and we should support and/or require their use. The main challenge here is getting this to work in the NodeJS toolchain and well enough that it isn't a burden on application developers.

I starting look at this and spent the last few hours hacking on it but have failed miserably to get anything to work. Currently, rules_nodejs runs Node v12.13.0 by default:

$ bazel run @build_bazel_rules_nodejs//toolchains/node:node_bin -- --version
v12.13.0

This does not support ES modules natively, however it can be enabled with the --experimental-modules option (passed as --node_options=--experimental-modules as templated_args to the nodejs_binary() targets). This is enough to enable the feature, but the tricky part is actually compiling everything into compatible JavaScript.

I tried updating "module": "ES2020" in the tsconfig.json, however it seems that ts_library() just ignores this field and does its own thing. Instead, ts_library() emits an es5_sources and es6_sources output group. es5_sources is the default, however I can force it to use es6_sources via filegroup(output_group = "es6_sources") (as documented here). This grabs the *.mjs files compiled with ES modules. The problem with this is that it only collects the ES6 outputs of the given target, it does not include transitive dependencies, which is basically useless.

I found an es6_consumer example which works around this by manually collecting all the transitive sources. With this, all the files are present and use the .mjs extension necessary for Node to treat them as modules.

We then have to drop --bazel_patch_module_resolver so it uses the real ESM loader, but any imports still fail with a message like:

(node:3415) ExperimentalWarning: The ESM module loader is experimental.
file:///home/dparker/.cache/bazel/_bazel_dparker/3b4ba8c4eec5951318c36cb07368e75b/execroot/rules_prerender/bazel-out/k8-fastbuild/bin/packages/renderer/renderer.mjs:20
import { main } from '../../common/binary.js';
         ^^^^
SyntaxError: The requested module '../../common/binary.js' does not provide an export named 'main'
    at ModuleJob._instantiate (internal/modules/esm/module_job.js:91:21)
    at async ModuleJob.run (internal/modules/esm/module_job.js:106:20)
    at async Loader.import (internal/modules/esm/loader.js:132:24)

My understanding of this is that I'm importing via a .js extension is still interpreted as a CommonJS module and attempts to be converted to ESM. Since it is not a CommonJS module, it doesn't actually export anything and fails. The error message is quite misleading.

Importing without an extension gives a "Cannot find module" error, which makes me think NodeJS isn't resolving the extension for me. The desirable solution would to import with an .mjs extension, but both VSCode and TypeScript reject this.

ERROR: /home/dparker/Source/rules_prerender/packages/renderer/BUILD.bazel:15:1: Compiling TypeScript (prodmode) //packages/renderer:renderer failed (Exit 1)
packages/renderer/renderer.ts:4:22 - error TS2307: Cannot find module '../../common/binary.mjs' or its corresponding type declarations.

4 import { main } from '../../common/binary.mjs';
                       ~~~~~~~~~~~~~~~~~~~~~~~~~

It seems that TypeScript does not have any meaningful support of .mjs as either an input or output file extension atm. This means I'm stuck between Node which wants .mjs and TypeScript which wants no extension or .js. I'm not sure how to resolve this (no pun intended).

I tried taking a step back and going at this via ts_project(), which is a bit simpler build system and closer to the underlying tsc command. If we convert everything to use ts_project(), could that work better? Short answer, not really. To my knowledge there is no way to make tsc output an .mjs extension (and TypeScript doesn't support that anyways as mentioned earlier). This means we're stuck with a .js extension, and the only other way to make Node recognize a file as a Node module is to specific "type": "module" in the package.json. That sounds easy, but setting it in the root package.json has no effect. I get the following error:

(node:14711) ExperimentalWarning: The ESM module loader is experimental.
file:///home/dparker/.cache/bazel/_bazel_dparker/3b4ba8c4eec5951318c36cb07368e75b/execroot/rules_prerender/bazel-out/k8-fastbuild/bin/packages/annotation_extractor/annotation_extractor.js:4
import { main } from '../../common/binary.js';
         ^^^^
SyntaxError: The requested module '../../common/binary.js' does not provide an export named 'main'
    at ModuleJob._instantiate (internal/modules/esm/module_job.js:91:21)
    at async ModuleJob.run (internal/modules/esm/module_job.js:106:20)
    at async Loader.import (internal/modules/esm/loader.js:132:24)

Looking at the generated tree, package.json is included. And even if I explicitly depend on package.json in nodejs_binary() or a copy_to_bin() version, this error message does not change. Using absolute imports with link_workspace_root = True also has no effect.

Looking at the common/binary.js file which throws the error, it is contained inside the package.json file at the artifact root. Based on my reading of the module resolution logic, I believe everything is correct, though I'm not totally sure what realpath resolves to in the context of this particular Bazel execution, but based on the debug paths spit out by Node, the package.json appears to be in the right spot and should be used for all nested .js files.

I'm not sure what's going on here atm. I think the next step would be to try this same directory structure outside of Bazel to make sure it works the way I expect it to.

Prototype: https://github.com/dgp1130/rules_prerender/commits/ref/esm

At this point though, I'm not convinced this is worth the effort. If it's this hard to get my code compatible, every application using rules_prerender would run into the same issues. There's also no clear benefit to doing this right now. I think for now I'm just going to accept defeat and hope the toolchain solves some of these problems for me later on.

dgp1130 commented 1 year ago

After a lot of work I was able to switch @rules_prerender to use native ESM. In fact we actually now require ESM because I don't want to support CommonJS if I don't have to (we'll see how well that holds up if this project ever actually takes off). This was definitely a tricky change to land and encountered a few interesting challenges.

First, I basically have to name all the files *.mts. I didn't actually want to do that, but I couldn't find a way to get tsc to emit *.mjs files without also naming the inputs *.mts. *.mjs files are required because Node only supports that or type: "module" in the package.json. I did include type: "module" in all the relevant package.json files, but this is not sufficient since Bazel will typically execute standalone files without a dependency on any package.json, so we can't rely on that.

Second, I needed to remove all the places where I used Jasmine spies on imports. It is not possible to mutate an imported object, so spies don't work in that context. I had to update the code to use proper dependency injection.

Third, @aspect_rules_jasmine doesn't actually ESM right now (https://github.com/aspect-build/rules_jasmine/issues/33). I sent out https://github.com/aspect-build/rules_jasmine/pull/41 to fix it and made a local patch with the change for the time being. Once that gets merged and released, we should drop this patch.

Fourth, I tried to using .git-blame-ignore-revs to make history a little cleaner given how expansive the ESM migration is, touching basically every TypeScript and BUILD.bazel file. GitHub supposedly supports this, though it doesn't seem to be working right now. Not sure what's going on there?

Fifth, one particular challenge I'm not sure I got right is moduleResolution. I tried moduleResolution: "Node16" which seems to be the mode to make working with *.mts and *.mjs files easier. However, for the life of me I could not get it work. It would break intellisense of rules_prerender in the local workspace and even failed css_bundler when built in an exec configuration. I could bazel build //tools/binaries/css_bundler just fine, but building a target which invoked the bundler as a tool would fail to resolve lightningcss despite the package being present in both cases. --traceResolution seemed to indicate that tsc was finding index.mjs and looking for index.d.mts when only index.d.ts existed. However I don't understand why this worked when built normally? ts_project() uses workers, so introspection becomes very difficult and turning off the worker broke the build in the same manner, even when the bundler was built directly. I really have no idea what's going on here, but it seems to be some overlap of TypeScript + ESM + lightningcss declarations + Bazel + workers. The simplest solution was to just use moduleResolution: "node", but I'm not sure if that's really the right long term solution here.

This is published in 0.0.23.