withastro / roadmap

Ideas, suggestions, and formal RFC proposals for the Astro project.

Incremental Builds #698

Closed by natemoo-re 1 year ago

natemoo-re commented 1 year ago

Summary

Incremental Build support in Astro aims to significantly speed up repeated runs of astro build. Initial builds will populate some kind of cache, which will allow subsequent builds to bypass Rollup for unchanged trees of the module graph.

Background & Motivation

The original incremental build support proposal is one of our oldest and most highly upvoted issues. Unfortunately, users have not shared many concrete examples in that thread. However, from the extensive performance profiling we've done, we know that Rollup is the main bottleneck in Astro's build process.

Our rendering, Markdown, and MDX performance has been optimized about as far as we can take it, so it is time to explore new options. The best option we have is to move as much of our build process out of Rollup as possible, likely with some sort of build cache. That is the goal of introducing incremental builds.

Why now? The reason is straightforward—caching is notoriously difficult to get right. We did not want to take on the additional complexity until our API surface was stable and our internal build process was easier to reason about. Thanks to significant effort from @bluwy and @ematipico in the past few months, we're now in a good place to tackle this.

Goals

Non-Goals

natemoo-re commented 1 year ago

Ideally, this is something that could be solved generically at the Vite / Rollup level so that every framework could benefit from it. I'm really not sure if that's on the table, though, since the ultimate goal is to bypass Vite / Rollup as much as possible. If incremental builds were easy to solve in a generic way, it would have been done already.

My current sketch for an API is very straightforward from the user's perspective:

// astro.config.mjs
import { defineConfig } from 'astro/config'

export default defineConfig({
  build: { incremental: true },
  experimental: { incremental: true } // until this is stable
})

Unfortunately that's where the simplicity ends. To implement this, we'll likely need to:

natemoo-re commented 1 year ago

Exciting news! I've spent the last month investigating quite a few approaches to this problem and we're ready to move forward with the first phase of our plan.

Almost immediately, we hit a major problem with the way Content Collections are currently architected. Invalidating a single article has a waterfall effect: it invalidates the entire collection the article belongs to, so every page that references that collection needs to be rebuilt. We were also able to verify that the size of the module graph was the single biggest contributor to extremely slow builds. This is not particularly surprising, as module graphs have long been identified as the main bottleneck for JS build tools, but it's nice to have confirmation that this holds true for Astro.
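To make the waterfall concrete, here is a minimal sketch of transitive invalidation over a toy module graph. The file paths, the graph shape, and the `invalidate` helper are all hypothetical and chosen for illustration; this is not Astro's or Rollup's actual invalidation code.

```javascript
// Hypothetical module graph: each key imports the modules in its array.
// Every page that uses collections goes through the single astro:content module.
const imports = new Map([
  ['src/pages/index.astro', ['astro:content']],
  ['src/pages/blog/[slug].astro', ['astro:content']],
  ['src/pages/about.astro', []],
  ['astro:content', ['src/content/blog/a.md', 'src/content/blog/b.md']],
]);

// Build the reverse mapping: module -> set of modules that import it.
function invertGraph(graph) {
  const importers = new Map();
  for (const [importer, deps] of graph) {
    for (const dep of deps) {
      if (!importers.has(dep)) importers.set(dep, new Set());
      importers.get(dep).add(importer);
    }
  }
  return importers;
}

// Return the changed module plus every module that transitively imports it.
function invalidate(graph, changedId) {
  const importers = invertGraph(graph);
  const dirty = new Set([changedId]);
  const queue = [changedId];
  while (queue.length > 0) {
    const id = queue.shift();
    for (const parent of importers.get(id) ?? []) {
      if (!dirty.has(parent)) {
        dirty.add(parent);
        queue.push(parent);
      }
    }
  }
  return dirty;
}

const dirty = invalidate(imports, 'src/content/blog/a.md');
console.log([...dirty]);
```

In this toy graph, editing one article marks `astro:content` as dirty, and with it both pages that import `astro:content`, even though only one of them might actually render that article. Only `about.astro`, which never touches collections, is spared.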

Our first step towards incremental builds will be an internal refactor to the way that Content Collections are generated. Instead of treating Content Collection entries as part of the larger module graph, Astro will treat them as individual entrypoints for a separate build process. Not only does this drastically reduce the size of the main module graph, it should allow us to detect and rebuild only the Content Collection entries that change between builds.

Note: To begin, this refactor will only benefit users who make heavy use of Content Collections. We hope to use this effort to develop internal patterns and primitives that will inform later incremental build improvements. Stay tuned!

natemoo-re commented 1 year ago

Also wanted to share some diagrams that describe how we expect to break this project down.

The current build in Astro 3.x is a single bundle step with a large module graph. Referencing astro:content pulls in every module that exists inside of every collection, leading to a huge module graph that Rollup struggles to process.

[Diagram: current]

Phase One of this incremental build project will focus on refactoring Content Collections out to a self-contained build step. Instead of treating astro:content as the entrypoint, the collection items themselves are treated as the entrypoints and astro:content is regenerated after. This keeps the module graph small and efficient, while opening up an opportunity to cache the outputs for unchanged collection items. The rest of the build remains the same.
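As a rough illustration of caching outputs for unchanged collection items, the sketch below builds each entry as its own unit and reuses the previous output when the source digest matches. Everything here is an assumption made for the example: the `buildEntries` and `digest` helpers, the cache shape, and the stand-in `compile` function are hypothetical, and the hash is a simple FNV-1a-style function chosen only to keep the snippet dependency-free.

```javascript
// Dependency-free FNV-1a-style hash over UTF-16 code units. Used only to keep
// this sketch runnable anywhere; a real cache would more likely use a
// cryptographic digest of the file contents.
function digest(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Build every entry as its own unit, reusing the cached output when the
// source digest matches the one recorded by the previous build.
function buildEntries(entries, cache, compile) {
  const nextCache = new Map();
  const rebuilt = [];
  for (const [id, source] of entries) {
    const key = digest(source);
    const hit = cache.get(id);
    if (hit && hit.key === key) {
      nextCache.set(id, hit); // unchanged: skip compilation
    } else {
      nextCache.set(id, { key, output: compile(source) });
      rebuilt.push(id);
    }
  }
  // Entries deleted since the last build are dropped, because only the
  // current entries are copied into nextCache.
  return { cache: nextCache, rebuilt };
}

// Hypothetical stand-in for the real Markdown/MDX compile step.
const compile = (source) => `<article>${source}</article>`;

const first = buildEntries(
  new Map([['blog/a.md', 'Hello'], ['blog/b.md', 'World']]),
  new Map(),
  compile,
);
const second = buildEntries(
  new Map([['blog/a.md', 'Hello, again'], ['blog/b.md', 'World']]),
  first.cache,
  compile,
);
console.log(first.rebuilt, second.rebuilt);
```

The first run compiles both entries because the cache is empty. The second run recompiles only the edited entry and carries the other one over from the cache. Regenerating `astro:content` afterwards would then be a cheap step over the cached outputs rather than a full rebuild.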

[Diagram: incremental-one]

Phase Two of the incremental build project will build on top of the learnings and patterns established during Phase One. This step will focus on making the main server build more efficient and tracing exactly which pages need to be rebuilt. This will extend the benefits of incremental builds beyond the previously established Content Collections use case. Treating this as a separate phase will allow us to hone our approach before tackling the more generalized solution.
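One way to picture "tracing exactly which pages need to be rebuilt" is to record, per page, which inputs it read during the previous build, then select only the pages whose recorded inputs overlap the set of changed files. The sketch below assumes such a record exists; the `pageInputs` data and the `pagesToRebuild` helper are hypothetical and do not reflect any design the thread commits to.

```javascript
// Hypothetical record from a previous build: the inputs each page read.
const pageInputs = new Map([
  ['/', new Set(['blog/a.md', 'blog/b.md'])],
  ['/blog/a', new Set(['blog/a.md'])],
  ['/blog/b', new Set(['blog/b.md'])],
  ['/about', new Set()],
]);

// Select only the pages whose recorded inputs overlap the changed set.
function pagesToRebuild(recorded, changed) {
  const stale = [];
  for (const [page, inputs] of recorded) {
    if ([...changed].some((id) => inputs.has(id))) stale.push(page);
  }
  return stale;
}

const stale = pagesToRebuild(pageInputs, new Set(['blog/b.md']));
console.log(stale);
```

With per-page records like these, a change to `blog/b.md` selects only the index page and `/blog/b`, instead of every page that references the collection.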

[Diagram: incremental-two]

EyePulp commented 1 year ago

@natemoo-re Thanks for the clarity, detail, and documentation of your approach. I'm eager for the performance improvements.

More selfishly, I'm hopeful this opens the door to selective page renders. Our use case has a lot of configuration options within a single Astro project, up to and including rendering or not rendering individual pages. We can solve it today by using the dynamic route feature, but something more explicit and declarative would be very welcome, and it feels like incremental builds might offer that.

Regardless, thanks for the work!

natemoo-re commented 1 year ago

Graduating to a full-fledged RFC. https://github.com/withastro/roadmap/pull/763

fparedlo commented 9 months ago

This is amazing, hope it comes in the next release!

heyitsdoodler commented 9 months ago

Is there a branch where work on phase 2 is being conducted?

ImBIOS commented 1 month ago

Wow, Phase 2 feels like sci-fi!