getzola / zola

A fast static site generator in a single binary with everything built-in. https://www.getzola.org
MIT License

Built in Minify Support #542

Closed imsus closed 3 years ago

imsus commented 5 years ago

Problem

As the title says, Zola currently doesn't have built-in HTML minifier support. For comparison, Hugo offers one, invoked with:

$ hugo --minify

It will generate production-ready minified HTML (Inline Styles + Inline Scripts) pages.

Idea

Borrowing the idea from Hugo, Zola could offer something like:

$ zola build --minify

I'm not familiar with the Rust ecosystem, but maybe https://docs.rs/html-minifier/1.1.3/html_minifier/ could be used for this.

Keats commented 5 years ago

That's a good idea, although we already have minification for Sass. We can add HTML, but I'm not sure about JS/SVG/images: those are things you'd rather minify once in place than redo every time you build...

chris-morgan commented 5 years ago

I think it is reasonable to add optional extra roughly-lossless minification of at least HTML, JavaScript, CSS and SVG. For raster images, minification tends to be extremely slow, and I agree that doing them once and committing that is generally the right thing to do—but given image processing which can produce new artefacts, I think it is worthwhile anyway.

This is the point when you start mumbling about make and not throwing away artefacts obtained at great computational expense along with the public directory.

All minifiers will need the possibility of extra configuration; just look at the options the npm html-minifier package has, and I’ve definitely had situations where I’ve had to tweak them, and even tweak what I write to prevent it from ruining overly-clever HTML/CSS.

I don’t suggest using the minifier or html_minifier crate; they’re very rudimentary in what they can do, and not at all mature: they’ll be super fast by comparison, but not as effective as they could be, and they are more likely to break things as well.

A low-quality but very powerful first pass would be handing it off to third-party tools; I imagine something like this:

[[minify-pipeline]]
filename = "**/*.html"
command = "html-minifier --config-file html-minifier.config.js {file}"

[[minify-pipeline]]
filename = "**/*.png"
command = [
  "optipng -quiet -o7 -fix {file}",
  "advpng -q -z --shrink-insane {file}",
]

I thought of making this a general-purpose pipeline and adding an optional key which can be set to true for steps that don’t change the semantics (viz., minification) or false for mandatory processing, but decided that wasn’t desirable due to the impact this whole approach is likely to have on build time, and that you now can’t do anything if you don’t have the third-party tool. Given this, defining it as specifically for minification (or expressed otherwise, production deployment) seems reasonable.

Here’s a possible simpler form:

[minify]
html = "html-minifier --config-file html-minifier.config.js {file}"
png = [
  "optipng -quiet -o7 -fix {file}",
  "advpng -q -z --shrink-insane {file}",
]

But I have concerns with this approach due to potentially significant ordering: e.g. you may want to minify JS, then inline it into HTML, and if the HTML is processed before the JS, then the HTML may contain unminified JS. But the TOML spec says “Key/value pairs within tables are not guaranteed to be in any specific order.”
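The ordering concern can be illustrated with a small sketch. TOML arrays keep their declaration order even though keys within a table do not, so the `[[minify-pipeline]]` form could be processed step by step. The Python below is only an illustration under assumptions: `PIPELINE` stands in for already-parsed config, `plan_commands` is a hypothetical helper (not anything in Zola), `some-js-minifier` is a placeholder rather than a real tool, and `fnmatch` patterns are simpler than the `**` globs used above.

```python
from fnmatch import fnmatch

# Hypothetical parsed form of [[minify-pipeline]] entries.
# A list preserves declaration order, so the JS step runs before the HTML step.
PIPELINE = [
    # "some-js-minifier" is a placeholder name, not a real command.
    {"filename": "*.js", "command": "some-js-minifier {file}"},
    {
        "filename": "*.html",
        "command": ["html-minifier --config-file html-minifier.config.js {file}"],
    },
]


def plan_commands(files, pipeline):
    """Return the command lines to run, grouped by step in declaration order."""
    planned = []
    for step in pipeline:
        commands = step["command"]
        if isinstance(commands, str):  # accept a single string or a list
            commands = [commands]
        for path in files:
            # fnmatch's "*" also matches "/", so "*.html" matches nested paths.
            if fnmatch(path, step["filename"]):
                for template in commands:
                    planned.append(template.replace("{file}", path))
    return planned


if __name__ == "__main__":
    for line in plan_commands(["public/index.html", "public/app.js"], PIPELINE):
        print(line)
```

The sketch only plans the commands; actually running them (for example with `shlex.split` and `subprocess.run`) would still require the third-party tools to be installed.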

I have not contemplated exactly how static/processed_images may fit into this scheme—I may want to feed all such images through cjpeg, for example.

Keats commented 5 years ago

Part of what I want with Zola is to not have to use external tools, the tagline being "Forget dependencies. Everything you need in one binary." Sure, some people will have needs not covered directly in Zola, but the goal is to get there for the other 90%. I would rather bundle a Rust library for each and allow configuring it in the config file.

This is the point when you start mumbling about make and not throwing away artefacts obtained at great computational expense along with the public directory

You anticipated me :D I think if you don't mind installing various third parties, it's reasonable to expect you to be able to write a script for it since it would be pretty close to your first example (something like ls -1 -- docs/public/**/*.html and run the command on that for the Zola docs for example). Also you need to control your environment so these pipelines wouldn't work on Netlify let's say (which does something similar already).

Aside: is html minifying even worth it with gzip?

senden9 commented 5 years ago

I find the comments and "ugly" blank lines in the resulting HTML a bit annoying, so I had already done an implementation and documentation for this issue even before I saw it. See my fork. If you want, I can open a pull request and we can discuss the implementation.

Benchmark

Benchmark with test_site from the repo:

With minimize_html = true:

$ hyperfine --warmup 10 --prepare "rm -r public" --min-runs 200 "~/git/zola/target/release/zola build" "zola build"
Benchmark #1: ~/git/zola/target/release/zola build
  Time (mean ± σ):      80.0 ms ±   5.1 ms    [User: 68.1 ms, System: 17.4 ms]
  Range (min … max):    74.9 ms … 113.2 ms    200 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark #2: zola build
  Time (mean ± σ):      73.9 ms ±   1.9 ms    [User: 66.1 ms, System: 16.0 ms]
  Range (min … max):    69.5 ms …  80.8 ms    200 runs

Summary
  'zola build' ran
    1.08 ± 0.07 times faster than '~/git/zola/target/release/zola build'

Without minimize_html:

$ hyperfine --warmup 10 --prepare "rm -r public" --min-runs 60 "~/git/zola/target/release/zola build" "zola build"
Benchmark #1: ~/git/zola/target/release/zola build
  Time (mean ± σ):      74.2 ms ±   1.7 ms    [User: 66.8 ms, System: 15.6 ms]
  Range (min … max):    70.7 ms …  78.0 ms    60 runs

Benchmark #2: zola build
  Time (mean ± σ):      74.4 ms ±   2.0 ms    [User: 66.5 ms, System: 16.0 ms]
  Range (min … max):    70.6 ms …  79.2 ms    60 runs

Summary
  '~/git/zola/target/release/zola build' ran
    1.00 ± 0.03 times faster than 'zola build'

BillBarnhill commented 5 years ago

What was the end state on this? I am just getting started with Zola this weekend, for two blogs, and would like to have minify support baked in.

Keats commented 5 years ago

Minify support for which filetype?

samford commented 5 years ago

Minify support for which filetype?

It would be ideal if Zola had built-in minification capabilities that covered common text formats (HTML, CSS, JS, SVG, JSON, XML) but I imagine that would require quite a bit of work. On the other hand, it would take less work to add some form of HTML minification for the time being.

For what it's worth, I'm personally most interested in prioritizing minification that can't be handled through other means before zola build is run. At the moment, I can simply minify assets using external tools before putting them in the "static" directory, so minifying generated HTML is the main issue for me (since it can only be minified during or after zola build).

Aside: is html minifying even worth it with gzip?

Minifying HTML has a very small effect on transfer size with gzip but it does have a noticeable impact on file size (on the server and client). Even if the reduction is small, why store or send more data than necessary?
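A rough way to check this claim on your own pages is to compare raw and gzipped sizes before and after minification. The Python sketch below is only an assumption-laden illustration: `naive_minify` is a crude stand-in for a real minifier (it is not safe for `<pre>` blocks, inline scripts, or whitespace-sensitive inline markup), and `SAMPLE` is synthetic markup rather than real Zola output.

```python
import gzip
import re


def naive_minify(html: str) -> str:
    """Crude stand-in for a real minifier: drop comments and
    whitespace between tags. Not safe for <pre>, inline scripts, etc."""
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    html = re.sub(r">\s+<", "><", html)
    return html.strip()


def sizes(html: str) -> dict:
    """Return the raw and gzip-compressed byte sizes of an HTML string."""
    raw = html.encode("utf-8")
    return {"raw": len(raw), "gzip": len(gzip.compress(raw))}


# Synthetic, heavily indented markup used only for illustration.
SAMPLE = "<ul>\n" + "".join(
    f"    <!-- item {i} -->\n    <li>\n        <a href=\"/post/{i}/\">Post {i}</a>\n    </li>\n"
    for i in range(200)
) + "</ul>\n"


if __name__ == "__main__":
    print("original:", sizes(SAMPLE))
    print("minified:", sizes(naive_minify(SAMPLE)))
```

Running it against real rendered pages instead of `SAMPLE` would give a better sense of how much of the saving survives compression.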

Looking at @senden9's fork adding html-minifier, it does lead to minified HTML but this is achieved by taking a pass at HTML files in the output folder at the end of build(). The downside of this approach is that it leads to HTML files being written twice (once after render/copy and again to minify), when it would seemingly be better to minify HTML before the file is created in the output directory the first time. [As an aside, the "minimize_html" config name is ambiguous to me and I think something like "minify_html" would be more straightforward.]

The "minify-html" branch of my fork (also using the html-minifier crate) is an attempt to minify HTML before it's written to a file. There may be a better or more elegant way to accomplish this but it seems to work as expected. One thing to mention is that any HTML files in the "static" directory wouldn't be minified, unlike the above approach. Based on the Zola docs, the expectation is that files in the "static" folder won't be modified (simply copied over), so this may be preferable.
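To make the difference between the two approaches concrete, here is a minimal sketch of the minify-before-write idea, in Python rather than Rust and not taken from either fork. `write_page`, `copy_static`, and `naive_minify` are hypothetical names, and the regex "minifier" is only a placeholder for a real library. Rendered pages are minified in memory and written once; static files are copied as-is.

```python
import re
import shutil
import tempfile
from pathlib import Path


def naive_minify(html: str) -> str:
    """Placeholder for a real HTML minifier (an assumption, not Zola's code)."""
    return re.sub(r">\s+<", "><", html).strip()


def write_page(output_dir: Path, rel_path: str, html: str,
               minify_html: bool = False) -> Path:
    """Write a rendered page exactly once, minifying in memory first if enabled."""
    if minify_html:
        html = naive_minify(html)
    target = output_dir / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target


def copy_static(static_dir: Path, output_dir: Path) -> None:
    """Copy static files unchanged; they are never passed to the minifier."""
    shutil.copytree(static_dir, output_dir, dirs_exist_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "static").mkdir()
        (root / "static" / "raw.html").write_text("<p>\n  kept as-is\n</p>\n", encoding="utf-8")
        copy_static(root / "static", root / "public")
        page = write_page(root / "public", "blog/index.html",
                          "<ul>\n  <li>a</li>\n</ul>\n", minify_html=True)
        print(page.read_text(encoding="utf-8"))
```

Keeping `minify_html` off by default mirrors the opt-in behaviour discussed in this thread.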

I believe the main question at this point is whether it's fine to implement less-than-ideal HTML minification in the interim time until a mature minification library in Rust exists. If so, what would be the best way to approach it?

Keats commented 5 years ago

Based on the Zola docs, the expectation is that files in the "static" folder won't be modified (simply copied over), so this may be preferable.

Yes, I don't want to touch the files in static and it definitely needs to be minified before writing to disk if that's added.

I believe the main question at this point is whether it's fine to implement less-than-ideal HTML minification in the interim time until a mature minification library in Rust exists. If so, what would be the best way to approach it?

Why is it less than ideal? I haven't looked at the crate at all.

samford commented 5 years ago

html-minifier is built on minifier which has the following warning in its repo's README: "Please be aware that this is still at a very early stage of development so you shouldn't rely on it too much!"

The ideal would be a mature minification library written in Rust that can handle all of the previously mentioned file types and has seen plenty of use. This doesn't exist yet, so that's primarily what I mean by "less than ideal" in this context. html-minifier seems to get the job done for my limited use case, at least.

If HTML minification is opt-in (using a config value that defaults to false) and adding html-minifier doesn't noticeably impact Zola's performance (it didn't seem to in my limited benchmarking), then I personally don't have an issue with using it until something better comes along.

Granted, I haven't dug into these crates very deeply and @chris-morgan seemed to have some qualms above, so it definitely wouldn't hurt to have other folks provide their points of view as well.

Keats commented 5 years ago

html-minifier is built on minifier which has the following warning in its repo's README: "Please be aware that this is still at a very early stage of development so you shouldn't rely on it too much!"

Hmm yeah, that's not great. There have been a lot of commits since the README was last edited; maybe it's not the case anymore?

sirinath commented 4 years ago

This may not be Rust but this might be of interest: https://github.com/tdewolff/minify

savente93 commented 4 years ago

html-minifier seems to have gotten quite a few updates since it was last talked about, so perhaps it is worth considering again? I am not very familiar with web stuff, so I can't really assess whether it's good enough. However, the previous warning in the README has been removed, so that is a good sign.

garthkerr commented 4 years ago

Looks like there is a Rust library available for minification: https://lib.rs/crates/hyperbuild

garthkerr commented 4 years ago

I've created a prototype. There is an issue with hyperbuild and I can only get 1.40.0 to compile. Per the issues request, I'm opening it for discussion here before creating a PR.

https://github.com/garthkerr/zola/tree/minify-html

You can add minify_html = true to the root of the config.toml to enable.

Keats commented 4 years ago

hyperbuild looks good, but it needs to have some way to disable the binary: there's no reason to pull in structopt when running as a library...

garthkerr commented 4 years ago

@Keats I've updated my prototype with changes from @wilsonzlin (thank you!) that remove the structopt dependency. I'm happy to help shepherd this into a PR: https://github.com/garthkerr/zola/commit/99f3e788d65ccee82088987392f28fa0736f7490

garthkerr commented 4 years ago

@Keats is there interest in getting this into the next release?

Keats commented 4 years ago

Potentially yes

wilsonzlin commented 4 years ago

Hello, author here :wave:, I've renamed the library to something clearer: minify-html. It also supports minifying JS now with native bindings to the super fast esbuild written in Go. However, this is an optional feature and will require a Go compiler to build and statically link to the Go library.

pierredewilde commented 3 years ago

There is a way to minify the entire Zola project with a very fast minifier from https://github.com/tdewolff/minify:

$ zola build
$ minify -r -a -o public/ public

HTML, CSS, JS, JSON, SVG and XML files are minified.

minify-html only minifies HTML and JS (embedded CSS is not minified). See https://github.com/wilsonzlin/minify-html/issues/14