dgp1130 / rules_prerender

A Bazel rule set for prerendering HTML pages.

Output multiple resources #1

Closed dgp1130 closed 3 years ago

dgp1130 commented 3 years ago

Currently, prerender_page() and prerender_page_bundled() output a single HTML page (and other included resources). However, users may want to prerender multiple pages from a single prerender file. A common example might be a blogging site, which has a number of blog posts which use mostly equivalent HTML/JS/CSS, but are built from different markdown. Rather than having a prerender_page_bundled() for each blog post, there should be a way to generate all of them from a single rule.

This could work in Node by simply having the default export return an Iterable or AsyncIterable of PrerenderResource objects, where PrerenderResource simply correlates a URL path to its contents. It could look something like:

import { promises as fs } from 'fs';
import * as path from 'path';
import MarkdownIt from 'markdown-it';
import { PrerenderResource } from 'rules_prerender';

const md = new MarkdownIt();

export default async function* (): AsyncIterable<PrerenderResource> {
    const dir = 'some/directory/';
    // Render each markdown file in the directory to its own page.
    for (const file of await fs.readdir(dir)) {
        const markdown = await fs.readFile(path.join(dir, file), { encoding: 'utf8' });
        yield PrerenderResource.of(`/posts/${file}`, md.render(markdown));
    }
}
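For illustration only, here is a minimal sketch of what a path-to-contents resource type could look like. `SketchResource` and `renderPosts` are hypothetical names invented for this example; the real `PrerenderResource` class may be shaped differently. Only the `of(path, contents)` call mirrors the example above.

```typescript
// Hypothetical stand-in for `PrerenderResource`; the real class may differ.
class SketchResource {
    private constructor(
        readonly path: string,
        readonly contents: string,
    ) {}

    // Mirrors the `PrerenderResource.of(path, contents)` call used above.
    static of(path: string, contents: string): SketchResource {
        return new SketchResource(path, contents);
    }
}

// A synchronous generator is enough to show the shape of the output:
// one resource per post, each at its own URL path.
function* renderPosts(
    posts: Array<{ slug: string; html: string }>,
): Iterable<SketchResource> {
    for (const post of posts) {
        yield SketchResource.of(`/posts/${post.slug}.html`, post.html);
    }
}
```

A caller would simply iterate the result and write each `contents` to the file at `path`.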

In Starlark, we would probably need another macro, since the output is completely different. A counterpart to prerender_page() is pretty straightforward, since we don't have to worry about bundling.

load("@npm//rules_prerender:index.bzl", "prerender_multi_page")

prerender_multi_page(
    name = "posts",
    src = "posts.ts",
    lib_deps = [
        "@npm//markdown-it",
        "@npm//rules_prerender",
    ],
)

# Outputs a `web_resources()` rule with all the prerendered files.

prerender_page_bundled() is a little trickier but would look the same to the user. We would need to get Rollup and PostCSS to bundle each page individually and then inject the resources into each one. Alternatively, we could treat all pages as sharing the same resources (which is likely mostly true in practice, but there would certainly be edge cases) and simply do a single bundle step to generate one JavaScript and another CSS file, then inject just those two into every page. That would likely be easier from a tooling perspective, but would also be less optimal, since one page including a large dep would cause that dep to be included in all pages.
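To make the trade-off concrete, here is a rough sketch of the two strategies. It assumes a hypothetical `PageScripts` map from each page to the scripts it includes; none of these names come from rules_prerender, and the real bundling would go through Rollup and PostCSS rather than plain lists.

```typescript
// Hypothetical input: each page path mapped to the scripts it includes.
type PageScripts = Map<string, string[]>;

// Strategy 1: one bundle per page, containing only that page's scripts.
function perPageBundles(pages: PageScripts): Map<string, string[]> {
    const bundles = new Map<string, string[]>();
    pages.forEach((scripts, page) => {
        bundles.set(page, Array.from(new Set(scripts)));
    });
    return bundles;
}

// Strategy 2: one shared bundle with the union of every page's scripts.
// Simpler to build, but every page pays for every other page's deps.
function sharedBundle(pages: PageScripts): string[] {
    const all = new Set<string>();
    pages.forEach((scripts) => scripts.forEach((script) => all.add(script)));
    return Array.from(all);
}
```

With a page `/a.html` that includes a large dependency and a page `/b.html` that does not, `perPageBundles()` keeps the dependency out of `/b.html`, while `sharedBundle()` ships it to both.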

dgp1130 commented 3 years ago

Quick update on progress here: We're currently at the point where prerender_multi_page() is roughly feature complete. Users can return multiple PrerenderResource() objects and each one will be rendered at the corresponding path inside a directory. All these files are processed and their script and style annotations are extracted into a metadata.json file. JS and CSS entry points are generated from the metadata.json file and exported as %{name}_scripts and %{name}_styles. I just published rules_prerender@0.0.3 to NPM, so check it out (changelog)!
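As a rough idea of how an entry point could be generated from the metadata, consider the sketch below. The `Metadata` shape and the `generateScriptEntryPoint()` helper are assumptions made for this example, not the actual schema or tool used by rules_prerender.

```typescript
// Assumed shape of `metadata.json`; the real schema may differ.
interface Metadata {
    scripts: Array<{ path: string }>;
}

// Emits one side-effect import per unique script path, in first-seen order.
function generateScriptEntryPoint(metadata: Metadata): string {
    const seen = new Set<string>();
    const lines: string[] = [];
    for (const { path } of metadata.scripts) {
        if (seen.has(path)) continue; // Skip scripts already imported.
        seen.add(path);
        lines.push(`import ${JSON.stringify(path)};`);
    }
    return lines.join('\n');
}
```

The generated file can then be handed to a bundler as its single input, which is what makes the downstream bundling step straightforward.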

For now, all script and style resources in all HTML files generated by prerender_multi_page() are combined together as mentioned in the original comment. This will make bundling much simpler, but means we are not tree shaking as effectively as we should.

prerender_multi_page() only generates unbundled resources, so the directory of HTML files, client-side JS, CSS, and directory of resources are all output independently of one another. This allows users to bundle these resources in their desired manner. prerender_multi_page_bundled() will provide a simpler interface and bundle all these resources together for users who don't care about the precise strategy. That's the next step toward a working MVP for this issue.

While working on this, I accidentally generated two PrerenderResource objects with the same URL path. Currently the tool happily writes the file twice and the last one wins, which could be quite a foot-gun. A simple typo could lead to non-hermetic builds if the two versions of the file are generated independently and properly parallelized. We should check each returned path to make sure it has not already been written, and fail if it has. Filed #23 to add this safety check.
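A minimal sketch of such a check, assuming each resource exposes a `path` string: `rejectDuplicatePaths()` is a hypothetical helper written for this comment, not the fix tracked in #23.

```typescript
// Passes resources through unchanged, but throws as soon as a URL path
// is seen a second time instead of letting the last write win.
function* rejectDuplicatePaths<T extends { path: string }>(
    resources: Iterable<T>,
): Iterable<T> {
    const seen = new Set<string>();
    for (const resource of resources) {
        if (seen.has(resource.path)) {
            throw new Error(`Duplicate resource path: "${resource.path}"`);
        }
        seen.add(resource.path);
        yield resource;
    }
}
```

Wrapping the user's iterable this way fails at the first duplicate, so the error points at the offending path rather than surfacing later as a flaky output.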

dgp1130 commented 3 years ago

Was able to make a bit more progress today than expected. I got prerender_multi_page_bundled() mostly working (with the caveat that JS and CSS files are not tree shaken between different generated files). We're able to generate multiple files at once, bundle the JavaScript and CSS, and inject them into the generated pages. //examples/multi_page/... shows off a working example.

I just published rules_prerender@0.0.4 (changelog) and updated the project README with some basic documentation for this. The doc does lie a bit as the data field is not supported yet, so blog use cases aren't totally supported.

A few things still TODO:

I'd also like to try simplifying some of the infrastructure. As of now there are a few tools with multiple versions: a basic version which processes a single HTML file, and a "multi" version which processes multiple. I'm curious if I can drop the basic version and update the BUILD rules to use the "multi" versions with just a single input. This is a bit awkward for a few reasons, but it may be worth doing if we can drop a lot of the extra complexity.

I'll file separate issues for those points, but I think this is good enough of an MVP to consider this issue closed. 🥳

dgp1130 commented 3 years ago

Some follow up for today: