Elderjs / elderjs

Elder.js is an opinionated static site generator and web framework for Svelte built with SEO in mind.
https://elderguide.com/tech/elderjs/
MIT License

Incremental builds #8

Open pbuzdin opened 4 years ago

pbuzdin commented 4 years ago

Hi there!

Does your framework support incremental builds?

nickreese commented 4 years ago

Currently this is not a feature that is built but it could be added as a plugin or added to the core.

Here is an outline of how you’d tackle this problem using hooks whether it be a plugin or via core hooks.

(I’ll continue to think on this and edit this comment if the thinking changes, with comments below on why. The goal is that this comment could become a spec for how this could be implemented.)

1 - As a user you’d need to specify the request objects or full routes to rebuild. This could be done by watching the routes folder for changes.

A clever wrapper around the user implementation of starting a build could recursively parse the Svelte ASTs to determine dependencies and watch those files for changes. (I’d consider this out of scope for the core)

The requests or routes that need to be generated could be given as an array on a plugin config.

2 - Once the requests that need to be built are identified, a function registered on the “allRequests” hook would need to do two things: check for !settings.worker and overwrite the allRequests array. This would limit the build to just the specified pages.

Regardless of how this is implemented, the elder.config.js (which is used to populate the settings object) should have a flag to let plugins and hooks know that it is an incremental build.

This would prevent plugins like those generating a sitemap from running.
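A minimal sketch of step 2, assuming Elder.js's usual hook object shape (hook, name, description, priority, run) and that request objects carry route and slug when the allRequests hook runs. The settings names isIncremental and incrementalRequests are hypothetical, chosen only for illustration; they are not existing Elder.js settings.

```javascript
// Hypothetical hook: narrows a build to an explicit list of requests.
// `isIncremental` and `incrementalRequests` are illustrative setting names,
// not part of Elder.js.
const limitToIncrementalRequests = {
  hook: 'allRequests',
  name: 'limitToIncrementalRequests',
  description: 'Overwrites allRequests with the subset that should be rebuilt.',
  priority: 50,
  run: ({ allRequests, settings }) => {
    // Workers already receive their share of requests from the main process,
    // so only the main process narrows the list.
    if (settings.worker || !settings.isIncremental) return { allRequests };

    const wanted = new Set(
      settings.incrementalRequests.map((r) => `${r.route}:${r.slug}`),
    );
    return {
      allRequests: allRequests.filter((r) => wanted.has(`${r.route}:${r.slug}`)),
    };
  },
};

// In a project this would be exported from hooks.js, e.g.:
// module.exports = [limitToIncrementalRequests];
```

A sitemap-style plugin could check the same flag at the top of its own run function and return early when it is set.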

————

Notes:

nickreese commented 4 years ago

Further thoughts. Will edit the parent comment when I get back to the laptop.

Currently the build implementation starts a new Elder.js instance, waits for bootstrap, then in a separate process fires up new Elder.js instances which have their allRequests arrays overwritten, with settings.worker set to true.

Possibly the right way to handle this is with the same approach. Allow Elder.js to have the allRequests array overwritten on start and set settings.isIncremental to true. This would cause the master process of the build to still evenly divide the requests across all CPUs.
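To illustrate the "evenly divide the requests across all CPUs" step, here is a minimal round-robin sketch. It is not Elder.js's actual implementation; splitAcrossWorkers and its workerCount parameter are hypothetical (in Node the count would typically come from os.cpus().length).

```javascript
// Hypothetical helper: deal requests out round-robin so each worker gets a
// near-equal share, and never create more batches than there are requests.
function splitAcrossWorkers(requests, workerCount) {
  const count = Math.max(1, Math.min(workerCount, requests.length));
  const batches = Array.from({ length: count }, () => []);
  requests.forEach((request, i) => batches[i % count].push(request));
  return batches;
}
```

With a short incremental list, this hands a single worker a single request rather than spinning up idle workers.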

nickreese commented 4 years ago

In general the hard part about incremental builds isn't supporting incremental builds, it is determining what should be incrementally built.

TL;DR: Elder.js can be configured to support incremental builds, but won't determine which pages to build incrementally.

An Example

A good example of incremental build complexity is a simple blog.

You fixed a typo in the title of your most recent blog post. No big deal, you just need to rebuild the single blog post that was impacted... right?

Well here is where the edge cases that are unique to each site come up:

The big issue is that without a unified data layer to build a graph of dependencies between pages, determining what needs to be incrementally built is difficult if not impossible.

Elder.js can support incremental builds as outlined above but determining what pages need to be incrementally rebuilt is outside the scope of what we'll support. As we work towards v1.0.0 we'll sort out a "blessed way" for users to pass in a specific subset of pages to be built.

x4080 commented 4 years ago

@nickreese I found out that (for production) the first build takes its time and the next builds take only a fraction of that time. Is this not using incremental builds?

Thanks

x4080 commented 4 years ago

I just found out, maybe @pbuzdin means the development process? I found that when starting

npm run dev:rollup

It will process all components and take quite a bit of time to build them all (maybe Sapper does it faster?) - @nickreese Is it possible to make the 'delete public directory content' step optional, so that we can choose whether to build all components or just use the already built ones?

Thanks

nickreese commented 4 years ago

@x4080 I think what @pbuzdin is asking about isn't the actual rollup process, but the build process.

Rollup is a necessary evil, as each time you change your component we need to generate an SSR and a client version of that component. Our Rollup config is different from Sapper's, but the way we statically build sites is dramatically different and will always outperform Sapper.

When it comes to SSGs "incremental builds" generally refers to just building a subset of the site. So if you have 18,000 pages but you know only 1,200 are impacted, is there a way to just build those 1,200. I'll keep thinking on the best way to support this besides what is outlined above.

x4080 commented 4 years ago

@nickreese Ok I misunderstood then, sorry guys

akvadrako commented 4 years ago

This would be a great feature. One way to do it is make-style – first build a dependency tree of inputs → outputs. If a node in the tree reports that its inputs (source files, external data and other nodes) have not changed since the last run, just use a cached value. That will allow things like the sitemap plugin to work incrementally too.
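A minimal sketch of the make-style idea, under simplifying assumptions: buildNode, the node shape (id, input, deps, build) and the in-memory Map cache are all hypothetical, and a JSON string stands in for a real content hash. A real implementation would fingerprint file contents and persist the cache between runs.

```javascript
// Hypothetical make-style builder: a node is rebuilt only when the
// fingerprint of its inputs differs from the one recorded on the last run.
function buildNode(node, cache) {
  // Resolve dependencies first so their outputs count as inputs.
  const depOutputs = (node.deps || []).map((dep) => buildNode(dep, cache));
  const fingerprint = JSON.stringify([node.input, depOutputs]);

  const cached = cache.get(node.id);
  if (cached && cached.fingerprint === fingerprint) {
    return cached.output; // inputs unchanged: reuse the cached value
  }

  const output = node.build(node.input, depOutputs);
  cache.set(node.id, { fingerprint, output });
  return output;
}
```

Because a dependent node's fingerprint includes its dependencies' outputs, changing one post also rebuilds a sitemap node that depends on it, while untouched nodes are served from the cache.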

kasperkamperman commented 3 years ago

First of all, the concept behind Elder.js makes sense to me, but there is one thing I fail to understand. In your Lessons From Building a SSG article you position Elder.js as a faster build solution than Gatsby. However, if you need to completely rebuild your site of 10K pages every time, won't Gatsby be faster after the first build (since it supports incremental builds now)?

If you prefer me to ask this question in another place, please let me know.

myrsmedenstorytel commented 3 years ago

We are currently using incremental builds in production when using ElderJS. I am not completely satisfied with the approach and will probably rewrite it, but this is how we are attacking the problem currently:
• The server.js knows that it is in SSG mode and will then listen to pubsub messages
• When it receives a pubsub message, we are running a command like PAGE_ID=xx npm run build
• The route.js files know about these environment variables passed into the build command and only fetch the pages that should be rebuilt
• The built files are pushed to a bucket and the cache is purged for those URLs

What I would like to change is mostly running the npm run build command from within our application, replacing it with a call to the ElderJS build function. However, I did not succeed with this - the server.js breaks while doing it. I think the culprit is which process is the main one and which is not when it comes to workers.

I also took some inspiration from this thread and will probably let a hook rewrite allRequests to make the code more maintainable instead of having it in the routes.
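The route-level filtering described in this comment could look roughly like the sketch below. It assumes the route's all function returns request objects with a slug; allPages is a hypothetical stand-in for the site's real data source, and PAGE_ID is the environment variable named in the comment.

```javascript
// Hypothetical stand-in for the site's real data source.
const allPages = [
  { id: '1', slug: 'first-post' },
  { id: '2', slug: 'second-post' },
];

// Returns the request objects for this route. When PAGE_ID is set
// (e.g. `PAGE_ID=2 npm run build`), only the matching page is returned.
function all() {
  const pageId = process.env.PAGE_ID;
  const pages = pageId
    ? allPages.filter((page) => page.id === pageId)
    : allPages;
  return pages.map((page) => ({ slug: page.slug }));
}

// In a route.js this function would be the route's `all` property,
// alongside the usual template, permalink and data entries.
```

Moving the same check into an allRequests hook, as suggested above, would keep the route files free of build-mode logic.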

floer32 commented 3 years ago

First of all, the concept behind Elder.js makes sense to me, but there is one thing I fail to understand. In your Lessons From Building a SSG article you position Elder.js as a faster build solution than Gatsby. However, if you need to completely rebuild your site of 10K pages every time, won't Gatsby be faster after the first build (since it supports incremental builds now)?

If you prefer me to ask this question in another place, please let me know.

Maybe it goes without saying, but Elder.js uses Svelte, and Gatsby uses React. That's the top of the decision tree to me -- Svelte or React.

I don't mean to interrupt though, @myrsmedenstorytel 's solution sounds like a great start

kasperkamperman commented 3 years ago

@hangtwenty Thanks, no need to convince me of Svelte, I was just wondering about incremental build ideas and performance because a large part of the "Why Elder.js" was about build performance.

I'm just diving into Svelte and it seems like a no-brainer compared with React. I have a Vue background, but Svelte is even more to the point. So I think it's great for SSG templates.

floer32 commented 3 years ago

@kasperkamperman sure thing, I partly wanted to leave the comment for posterity / search-engine-passersby

nickreese commented 3 years ago

From a high level the easiest way to build a single page is to overwrite allRequests in the bootstrap hook with the files you want written.

I'm 100% happy to support incremental builds and open to community suggestions/code.

Why it is complex

Here is a scenario as to why I don't use it for my businesses:

A person on our data team updates one of the nursing homes on that page saying it went out of business.

Due to the way our site is built this means we need to rebuild the following templates:

Building a system that tracks the dependencies between different pages on a dynamic site is incredibly complex.

That said, I agree for a simple site that is data driven this is a must.

Starting with the Interface

I often find that starting with the interface to a problem helps define how to build it.

What interface would be the easiest for everyone to use?

An intelligent build system as outlined above is just too complex... but a plugin could easily:

  1. watch the file system
  2. read a json file from the file system
  3. or even read from an API to get the pages that need to be built

From there it just has to overwrite the allRequests object on the main process which then divides the work across all of the workers.
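One possible shape for that interface, sketched under assumptions: the rebuild list is JSON text naming route and slug pairs, and the hook follows Elder.js's usual hook object shape. parseRebuildList, createRebuildHook and the readJsonText callback are hypothetical names, not an existing Elder.js or plugin API.

```javascript
// Hypothetical interface: a JSON document listing what to rebuild, e.g.
// [{ "route": "blog", "slug": "hello-world" }]
function parseRebuildList(jsonText) {
  const list = JSON.parse(jsonText);
  if (!Array.isArray(list)) throw new TypeError('Rebuild list must be an array');
  return list.map(({ route, slug }) => ({ route, slug }));
}

// Builds an allRequests hook around any source of JSON text
// (a file read, a file watcher's output, an API response body).
function createRebuildHook(readJsonText) {
  return {
    hook: 'allRequests',
    name: 'rebuildListedPages',
    description: 'Overwrites allRequests with the pages named in the rebuild list.',
    priority: 50,
    run: ({ allRequests, settings }) => {
      if (settings.worker) return { allRequests }; // main process only
      const wanted = parseRebuildList(readJsonText());
      const keep = allRequests.filter((request) =>
        wanted.some((w) => w.route === request.route && w.slug === request.slug),
      );
      return { allRequests: keep };
    },
  };
}
```

A file-backed version might pass something like () => fs.readFileSync('rebuild.json', 'utf8'), where rebuild.json is a hypothetical file written by whatever system knows which pages changed. An API-backed source would need an async run function.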

I guess it all starts with the interface, how does the user or system know what needs to be rebuilt?