crimethinc / website

Ruby on Rails app that powers crimethinc.com
https://crimethinc.com
Creative Commons Zero v1.0 Universal

Static site generator #1825

Open veganstraightedge opened 3 years ago

veganstraightedge commented 3 years ago

We're still going to run a Rails app for the .com and the CMS.

A static snapshot of the site could serve as a read replica mirror.

This issue is about creating a way to generate a static version of the site, which could then be hosted just about anywhere.

goncalopereira commented 3 years ago

As a proof of concept I ran a crawler (wget with mirror settings) followed by a script to sanitise the data.

I was able to get a partial static read-only copy of the production website.
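The crawl-and-sanitise approach could be sketched roughly like this (a hedged reconstruction, not the actual POC scripts; the `sanitise_html` helper and the choice of Ruby are my assumptions):

```ruby
# Sketch of the crawl-and-sanitise POC (illustrative, not the real scripts).
# Step 1 shells out to wget's mirror mode; step 2 rewrites absolute links
# to the production host into relative ones so the copy works anywhere.

HOST = 'crimethinc.com' # assumed production host

# Rewrite absolute URLs pointing at the production host into relative paths.
def sanitise_html(html, host = HOST)
  html.gsub(%r{https?://#{Regexp.escape(host)}(/[^"'\s<>]*)?}) do
    path = Regexp.last_match(1)
    path.nil? || path.empty? ? '/' : path
  end
end

# Crawl the live site into ./mirror (defined here, not run automatically).
def crawl!
  system('wget', '--mirror', '--convert-links', '--adjust-extension',
         '--page-requisites', '--no-parent',
         '--directory-prefix', 'mirror', "https://#{HOST}/")
end

# After crawling, sanitise every HTML file in place:
# Dir.glob('mirror/**/*.html').each do |f|
#   File.write(f, sanitise_html(File.read(f)))
# end
```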

Briefly, the pain points for CDN/caching/mirroring were:

POC AWS: http://ctmirror.s3-website.eu-west-2.amazonaws.com/

POC Netlify: https://eloquent-swirles-d1e8f8.netlify.app/

veganstraightedge commented 3 years ago

This is awesome! @goncalopereira

Great start! What're the next steps? What are the open questions to consider?

goncalopereira commented 3 years ago

I think a 2nd opinion would be great. I can create a PR with the ongoing scripts; I just need to figure out the project structure for it.

I think the questions are:

anarcat commented 2 years ago

i don't think crawling is the best way to think about this, because then you have to recrawl everything (or parts? or what? hard to decide!) whenever content changes.

what some dynamic sites do is internalize the "crawler", or more accurately, the static generation: each page rendering is stored on disk, which doubles as a fast cache and helps under denial-of-service conditions. i worked on Drupal sites in the past which used the "boost.module" to do this, but that didn't work well for creating a static site copy. i think there's something better for drupal now, but that's irrelevant since you don't use drupal. :p (Django can similarly drive static sites.)

So I guess the question, IMHO, is how to do this caching thing but with Rails as a backend. I frankly have close to zero experience coding in Rails, but a few searches gave me this documentation, where "page cache" certainly looks interesting.

Note that you'd still have to have something that crawls the entire site (maybe? or maybe rails is magic and will do that on its own?) but the difference is that then you have a server-side archive that you can more easily distribute, and that's a trusted copy that you don't necessarily need to refresh all the time. Whenever you post something new, as soon as someone reads it, it gets cached and added to the pile.

This beats recrawling everything all the time...
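In plain Ruby, the page-cache mechanism described above amounts to writing each rendered response into `public/` so any web server (or mirror) can serve it without hitting the app. A minimal sketch of that idea (the helper names here are hypothetical; in Rails itself this is what the `actionpack-page_caching` gem's `caches_page` does):

```ruby
require 'fileutils'

# Map a request path to the file a static web server would look for:
# "/" -> index.html, "/articles/foo" -> articles/foo.html
# (hypothetical helper, mimicking what Rails page caching does)
def page_cache_path(root, request_path)
  rel = request_path == '/' ? '/index' : request_path.chomp('/')
  File.join(root, "#{rel}.html")
end

# Store a rendered page on disk; later requests can be served straight
# from the file system, which doubles as the static mirror.
def write_page_cache(root, request_path, html)
  path = page_cache_path(root, request_path)
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, html)
  path
end
```

Wired into an after-render hook, something like this would give the "cache as you go" behaviour: every page that gets read once becomes part of the static copy.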

veganstraightedge commented 2 years ago

Thanks @anarcat.

I agree that an internal static site generator is also a good idea. We already do a fair bit of caching in Rails land, but that depends on having a big Redis server running, and it isn't easy to hand off to another person/place hosting a copy of the site.

IMO, a happy medium would be if the Rails CMS generated static files/folders of the site, then shipped it off-site somewhere, as both files/folders ready to serve as a static site and as a gzip/tarball for others to download and mirror, if needed.
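The tarball half of that could be done with Ruby's standard library alone (a sketch under assumptions: the `build_site_tarball` name and paths are illustrative, not anything in the repo):

```ruby
require 'rubygems/package'
require 'zlib'

# Bundle a directory of generated static files into a gzipped tarball
# that others can download and unpack behind any classic web server.
# (hypothetical helper; stdlib only, no external gems)
def build_site_tarball(site_dir, tarball_path)
  File.open(tarball_path, 'wb') do |file|
    Zlib::GzipWriter.wrap(file) do |gz|
      Gem::Package::TarWriter.new(gz) do |tar|
        Dir.glob(File.join(site_dir, '**', '*')).each do |path|
          next unless File.file?(path)

          data = File.binread(path)
          name = path.sub(%r{\A#{Regexp.escape(site_dir)}/?}, '')
          tar.add_file_simple(name, 0o644, data.bytesize) { |io| io.write(data) }
        end
      end
    end
  end
  tarball_path
end
```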

Being able to easily spin up a new Rails/Postgres/Redis stack would be a nice-to-have too, but that's not as easy for many people in many situations to run as a classic static web server.

anarcat commented 2 years ago

On 2022-02-24 14:21:25, Shane Becker wrote:

> Thanks @anarcat.
>
> I agree that an internal static site generator is also a good idea. We already do a fair bit of caching in Rails land, but that depends on having a big Redis server running, and it isn't easy to hand off to another person/place hosting a copy of the site.
>
> IMO, a happy medium would be if the Rails CMS generated static files/folders of the site, then shipped it off-site somewhere, as both files/folders ready to serve as a static site and as a gzip/tarball for others to download and mirror, if needed.

yeah i think that's what the page cache is supposed to do, but maybe that's what you're already doing in redis?

> Being able to easily spin up a new Rails/Postgres/Redis stack would be a nice-to-have too, but that's not as easy for many people in many situations to run as a classic static web server.

yeah for sure, it's more of a quick disaster recovery for you i guess.