getzola / zola

A fast static site generator in a single binary with everything built-in. https://www.getzola.org
MIT License

Picture & thumbnails support #225

Closed vojtechkral closed 6 years ago

vojtechkral commented 6 years ago

Hi, I have recently started using Gutenberg and I like it very much (especially compared to other more famous static site generators that have ridiculously complex configuration).

One thing I've been missing is the ability to create picture/photo galleries in a blog post. So I went ahead and added an implementation using the Piston/image library. You can have a look at the work in progress at my pictures branch. (It's just one or two evenings of hacking, so things like proper docs and tests are unfortunately still missing, but the implementation itself is present and seems to work well for me so far.)

Feel free to let me know whether you think it is worth finalizing / if you'd consider integrating this in future.

A disadvantage of generating thumbnails is that it's time-consuming, and with Gutenberg regenerating the site from scratch each time, this can become noticeable. Generating thumbs for some 30 photos takes some 10 seconds on my machine (each photo being about 6MB in size).

In default configuration there is no thumb generation, the feature is entirely opt-in.

Keats commented 6 years ago

I have recently started using Gutenberg and I like it very much (especially compared to other more famous static site generators that have ridiculously complex configuration).

Glad you like that!

I believe there's value in having Gutenberg handle image resizing but I'm not sure I would go with an approach of just doing it for all pictures. In my mind it would be more of a Tera filter like image_url | img_resize(width=400, height=300) and the filter would look for an already resized image at that path (for example a pic called "cat.png" in the filter above could be named "cat_400x300.png") and if not found perform the resizing. That should allow the thumbnail resizing to only happen once but it can happen anywhere in your site.
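A rough sketch of that look-up-then-resize idea, written in Python purely for illustration (Gutenberg itself is Rust, and `resized_path`, `img_resize` and the `do_resize` callback are hypothetical names, not existing Gutenberg or Tera API):

```python
from pathlib import Path
from typing import Callable


def resized_path(image: Path, width: int, height: int) -> Path:
    """Map cat.png + 400x300 to cat_400x300.png next to the original."""
    return image.with_name(f"{image.stem}_{width}x{height}{image.suffix}")


def img_resize(
    image: Path,
    width: int,
    height: int,
    do_resize: Callable[[Path, Path, int, int], None],
) -> Path:
    """Return the resized image's path, resizing only if it is not there yet.

    `do_resize(source, target, width, height)` stands in for whatever
    image library does the real work; it is an assumption of this sketch.
    """
    target = resized_path(image, width, height)
    if not target.exists():
        do_resize(image, target, width, height)
    return target
```

With this shape the expensive step runs once per image and size; later builds only pay for an existence check.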

The downside is that adding a picture in a .md file would now require using a shortcode, but I believe if you want thumbnailing, you will probably also want some kind of special CSS and use a shortcode anyway.

What do you think?

vojtechkral commented 6 years ago

@Keats Yes, I'm going to have to use shortcodes anyway.

Regarding the template function, the thing is I'm not just inserting individual images, but whole galleries, i.e. ten to tens of pictures. In my previous solution I was using Pelican with a gallery plugin; the plugin feeds a list of pictures to the template, which uses PhotoSwipe to create a clickable gallery. So the problem is actually twofold:

In my implementation, this is done by matching page assets against a configurable regex, I guess you could configure the regex to only match files with a specific pattern...
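For illustration only, matching assets against a configurable regex could look roughly like this in Python (`select_images` and the example pattern are made up for this sketch; they are not taken from the branch):

```python
import re
from typing import List


def select_images(assets: List[str], pattern: str) -> List[str]:
    """Keep only the assets whose path matches the configured regex."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [asset for asset in assets if regex.search(asset)]


# Example: pick JPEG and PNG files out of a page's assets.
gallery = select_images(
    ["clock-front.jpg", "schematic.pdf", "clock-back.PNG"],
    r"\.(jpe?g|png)$",
)
```

A narrower pattern (say, only files starting with `gallery-`) would restrict thumbnailing to a chosen subset.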

OTOH I suppose I could also go with a simpler approach...

vojtechkral commented 6 years ago

Presumably I could also create a Tera filter to get multiple images and their thumbnails? I'll look into that...

Keats commented 6 years ago

Is the pelican blog on github/gitlab so I can see how it's setup?

vojtechkral commented 6 years ago

@Keats Not right now, unfortunately. It's a work in progress, I'm trying to replace my antiquated and stale personal blog. But I'll try to upload it somewhere for you...

mdaffin commented 6 years ago

Hey, I am also interested in this feature. I am in the process of migrating my blog from Hugo (I really started to dislike golang's templating language) and so far this is the only missing feature I have noticed.

It is worth considering how Hugo implemented it, which is similar to what you suggested with the filter approach but also supports cropping and fitting into a fixed size (so the image is not larger than the given width/height but preserves the aspect ratio).

They also render the images to a resources folder designed to be committed to version control to save on future processing (but it can also be added to ignore files if the user chooses not to do this). Given how rarely images change and how expensive they are to generate, you might want to consider doing something similar.

Keats commented 6 years ago

I really started to dislike golang's templating language

Tell me about it!

I like saving the resized pictures in a committed folder but I don't really want to have tons of folders at the root level. Also need to think how it interacts with co-located resized pictures.

I think the filter approach is the best imo but I would start conservatively and just have the resize options to start with. I feel like if you care about your images, you will do the cropping yourself. I've recently implemented something extremely similar to the smartcrop library they are using and I would say it only works for a limited number of images.

@mdaffin Is your site open-source? I'm looking for examples of sites using image resizing/cropping etc to get some ideas.

vojtechkral commented 6 years ago

@mdaffin @Keats

In my implementation the aspect ratio is also preserved, of course.

I am in the process of converting my implementation to use filters instead. I was a bit busy last couple of days so it didn't go as quickly as I would've liked.

Also, I found out a filter actually cannot be used to accomplish this, because in Tera a filter cannot be a closure, it can only be a context-less function which doesn't have access to the needed information (such as content dir path and public dir path). I am currently using Tera global functions instead as they can be closures.
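The difference can be shown with a language-neutral illustration in Python (this is not Tera's API; `make_img_scale` is a hypothetical factory): a closure captures the site directories once, so the template only needs to pass the image and the size.

```python
from pathlib import Path
from typing import Callable, Tuple


def make_img_scale(
    content_dir: Path, public_dir: Path
) -> Callable[[str, int, int], Tuple[Path, Path]]:
    """Build a function that remembers the site's directories."""

    def img_scale(rel_path: str, width: int, height: int) -> Tuple[Path, Path]:
        # A context-less function could not know content_dir or public_dir;
        # the closure captured them when it was created.
        rel = Path(rel_path)
        source = content_dir / rel
        target = public_dir / rel.with_name(
            f"{rel.stem}_{width}x{height}{rel.suffix}"
        )
        return source, target

    return img_scale


img_scale = make_img_scale(Path("content"), Path("public"))
```

Calling `img_scale("blog/cat.png", 240, 160)` then resolves both the source and the output location without any extra arguments.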

Keats commented 6 years ago

Another use case example: I'm generating a gallery for Gutenberg themes based on some repos, and those repos have one big screenshot.png. I'd like to be able to generate thumbnails without repo owners having to do it themselves: Gutenberg would do it automatically.

vojtechkral commented 6 years ago

@Keats Great example! I have updated my implementation such that configuration is no longer involved. I now have a global function named img_scale that receives either a single image or a list of multiple images (you can pass the page.assets variable to it too) and which generates thumbnails with filenames similar to what you suggested.

I'm trying to add a usage example to the documentation, however, right now I have run into a problem: It seems Tera global functions are not available in shortcodes. I'll poke around and see if I can do something...

vojtechkral commented 6 years ago

So, it turns out the shortcode generation happens much earlier in the process and has no access to all the metadata plus the output directory isn't created at that point... This is much more difficult :-)

Keats commented 6 years ago

So, it turns out the shortcode generation happens much earlier in the process and has no access to all the metadata plus the output directory isn't created at that point... This is much more difficult :-)

Ah I didn't think of that :( The shortcode generation is going to be rewritten soon but the problem is still going to persist. Something that might work is to create the output dir at the beginning, put all resized pictures in a public/gutenberg_pictures folder and have the resize filter return a URL to that path.

vojtechkral commented 6 years ago

@Keats The output is not so much of the problem, the input is - with shortcodes basically you'd have to manually enumerate all images with full path...

The shortcode generation is going to be rewritten soon

Really? One thing that occurred to me is that it would be sweet to be able to use arbitrary Tera in the page markdown that would be evaluated with the same context as the page template. But that would be pretty hard to implement...

Keats commented 6 years ago

Really? One thing that occurred to me is that it would be sweet to be able to use arbitrary Tera in the page markdown that would be evaluated with the same context as the page template. But that would be pretty hard to implement...

It could be possible but I'm not sure it's necessarily something wanted. Discussion for shortcodes is at https://github.com/Keats/gutenberg/issues/165

vojtechkral commented 6 years ago

@Keats I've uploaded an example for you. Here's the rendered html

http://kral.hk/tmp-pelican/blog/en/nixie-clock-with-esp8266-and-ds3231/

Here's the two pelican plugins that process pictures and render the html:

https://gist.github.com/vojtechkral/feee49fed758d178b6a045449a511517

The code is pretty horrible, unfortunately.

Keats commented 6 years ago

FYI I'm going to have a look at adding support for that in the next version in the form of a global function, so I believe that's what you currently have in your branch. We can deal with the shortcode issue later on.

How I envision it: resize_image(path: string, width?: int, height?: int, fit?: bool, fill?: bool) -> String and a static/_resized_images folder (names open to bikeshedding).

The parameters end up being used for the operations described at https://gohugo.io/content-management/image-processing/#image-processing-methods (Resize, Fit and Fill), but in one method. From the path and the arguments, a unique filename will be created (something like ${path-hash}-4${filename}-${width-"x"-"height}-${method}?). The global function will first check in the _resized_images folder for a file with that name and return the public url for that file if there is one. If not, it will do the resizing, put the image in that folder with the right name and return the public url. Gutenberg will automatically detect the new image and copy it to the public directory. Keeping it in the static folder ensures it's saved in VCS.

I would make the fn take only one path though and in your case the gallery will be just multiple calls to it in a forloop.

What do you think?

vojtechkral commented 6 years ago

names open to bikeshedding

About that... :-) ... maybe the fit / fill stuff could be made one param named op or such with a string (enum-like) value of "fit", "fill" et al.
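As a sketch of how a single `op` parameter might select the behaviour, here is the size arithmetic in Python. The op names and the exact semantics are assumptions of this example (in particular, "fit" never upscaling is my choice, not something settled in this thread):

```python
from typing import Tuple


def scaled_size(
    src_w: int, src_h: int, width: int, height: int, op: str
) -> Tuple[int, int]:
    """Return the size to scale the source to for a given operation.

    "resize": use width x height as given (aspect ratio may change).
    "fit":    largest size inside the box that keeps the aspect ratio.
    "fill":   smallest size covering the box; the caller then crops
              the result down to width x height.
    """
    if op == "resize":
        return width, height
    if op == "fit":
        scale = min(width / src_w, height / src_h, 1.0)  # assumed: no upscaling
    elif op == "fill":
        scale = max(width / src_w, height / src_h)
    else:
        raise ValueError(f"unknown op: {op!r}")
    return round(src_w * scale), round(src_h * scale)
```

For a 4000×3000 photo and a 400×400 box, "fit" gives 400×300, while "fill" scales to 533×400 and leaves the crop to 400×400 as a second step.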

The global function will first check in the _resized_images folder for a file with that name and return the public url for that file if there is one. If not, it will do the resizing and put the image in that folder with the right name and return the public url.

Sounds good, I've already written a function named file_stale or something like that in utils::fs that tells whether a target file exists and needs updating or not...
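The check itself is small. A Python approximation of the idea (the real function lives in Rust in my branch; this version and its exact rule, "missing or older than the source", are only a sketch):

```python
import os


def file_stale(source: str, target: str) -> bool:
    """True if target does not exist or is older than source."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)
```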

Gutenberg will automatically detect the new image and copy it to the public directory.

That's ok, but it would be nice if Gutenberg copied the static files preserving the ctime/mtime metadata. The way it is done now the copied static files get an updated ctime/mtime and my guess is that with people uploading their public dir to a server via FTP or something similar, this would make browsers re-download the static files even if they haven't changed.
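To illustrate what I mean, in Python the difference is between a plain copy and one that carries the timestamps over. Note that this only covers mtime (and atime); on Unix the ctime is set by the kernel and generally cannot be preserved by a user program. `copy_static` is a hypothetical helper, not Gutenberg code:

```python
import shutil
from pathlib import Path


def copy_static(src: Path, dst: Path) -> None:
    """Copy a static file, keeping its modification time."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    # copy2 copies the data, then the permission bits and the
    # access/modification times of the source file.
    shutil.copy2(src, dst)
```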

I would make the fn take only one path though and in your case the gallery will be just multiple calls to it in a forloop.

That's fine as long as those page.assets context data are available...

What do you think?

Sounds good to me! I'll update the implementation in my branch to these specs. The only part I'm not sure how to go about is the path hashing. Is there a precedent for this or do we just slap md5 on it and call it a day? :-)

vojtechkral commented 6 years ago

Or if you wanted to implement it yourself instead let me know...

Keats commented 6 years ago

About that... :-) ... maybe the fit / fill stuff could be made one param named op or such with a string (enum-like) value of "fit", "fill" et al.

Could work, I have no strong feelings one way or the other. The op param is probably better actually.

The way it is done now the copied static files get an updated ctime/mtime and my guess is that with people uploading their public dir to a server via FTP or something similar, this would make browsers re-download the static files even if they haven't changed.

Do browsers even check that? They would need to download the image to get the metadata

That's fine as long as those page.assets context data are available

Yep we can make everything we need accessible later on.

Is there a precedent for this or do we just slap md5 on it and call it a day? :-)

Probably fine, it's just that relying on just the filename will give duplicates in some cases (for example in Gutenberg docs there are a few assets called screenshot.png). We probably don't even need the full md5 of the path, the first ~6 chars or so should be enough.
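Something along these lines, sketched in Python (the layout of the name is only one possibility and `resized_filename` is a hypothetical helper; nothing here is final):

```python
import hashlib
from pathlib import PurePosixPath


def resized_filename(path: str, width: int, height: int, op: str) -> str:
    """Build a name that stays unique even when file names repeat."""
    p = PurePosixPath(path)
    # Hash the whole path, keep only the first 6 hex characters.
    path_hash = hashlib.md5(path.encode("utf-8")).hexdigest()[:6]
    return f"{path_hash}-{p.stem}-{width}x{height}-{op}{p.suffix}"


a = resized_filename("themes/one/screenshot.png", 240, 160, "fit")
b = resized_filename("themes/two/screenshot.png", 240, 160, "fit")
```

Both names end in `-screenshot-240x160-fit.png`, but the prefixes come from different paths, so the two thumbnails should not overwrite each other. Six hex characters is 24 bits, which leaves a small chance of collision on very large sites.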

Or if you wanted to implement it yourself instead let me know...

Go for it!

vojtechkral commented 6 years ago

Do browsers even check that? They would need to download the image to get the metadata

Not browsers, web servers. Typically they're configured such that with static files they compute the ETag based on the file's ctime/mtime metadata, and so it becomes part of the regular browser cache invalidation process. Browser sends the If-Modified-Since header (if memory serves me well) and server checks the file ctime/mtime and responds with either 304 (just headers) or 200 + full file data.
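Roughly, the server-side decision looks like the sketch below (Python, illustration only). It models the Last-Modified / If-Modified-Since pair; ETag validation works analogously but uses the If-None-Match header instead. `conditional_status` is a made-up name, not any real server's API:

```python
from email.utils import formatdate, parsedate_to_datetime
from typing import Optional


def conditional_status(file_mtime: float, if_modified_since: Optional[str]) -> int:
    """Return 304 if the client's cached copy is still current, else 200."""
    if if_modified_since is None:
        return 200  # first request: send the full file
    try:
        since = parsedate_to_datetime(if_modified_since).timestamp()
    except (TypeError, ValueError):
        return 200  # unparsable header: ignore it
    # HTTP dates have one-second resolution, so compare whole seconds.
    return 304 if int(file_mtime) <= int(since) else 200


# The header value a browser would echo back for a file last
# modified at Unix time 1000000000.
cached = formatdate(1000000000, usegmt=True)
```

If a rebuild bumps the file's mtime even though the bytes are identical, the comparison fails and the server answers 200 with the full body, which is the unnecessary re-download I was worried about.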

Probably fine, it's just that relying on just the filename will give duplicates in some cases (for example in Gutenberg docs there are a few assets called screenshot.png). We probably don't even need the full md5 of the path, the first ~6 chars or so should be enough.

Got it...

dstutman commented 6 years ago

It might also be nice to provide a function that will let you get srcset strings, similar to Gatsby. I have some pictures on my site that are automatically fitted to the screen, but on cell phones loading the full image (In some cases 4K) is excessive. I think this might, for example, provide an array of the image URIs which you can iterate over and add to tags.

Keats commented 6 years ago

TIL people would pay $600 for a workshop on a static site engine: https://workshop.me/2018-04-gatsby

It looks like it would be worth it to take some of these ideas into the gutenberg image support yes

vojtechkral commented 6 years ago

If I understand correctly, you could just make a few resize_image calls to fill in the srcset, right?
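For example, a loop like this would do (Python sketch; `fake_resize_image` is a stand-in with an invented URL scheme, and the real resize_image signature is still being discussed above):

```python
from typing import Callable, Iterable


def build_srcset(
    path: str, widths: Iterable[int], resize_image: Callable[[str, int], str]
) -> str:
    """One resize call per width, joined into a srcset attribute value."""
    return ", ".join(f"{resize_image(path, w)} {w}w" for w in widths)


def fake_resize_image(path: str, width: int) -> str:
    # Stand-in for the real function: only invents a URL, resizes nothing.
    return f"/_resized_images/{width}-{path}"


srcset = build_srcset("cat.jpg", [480, 960], fake_resize_image)
```

The resulting string can go straight into an `<img srcset="...">` attribute, and the browser picks the smallest candidate that is wide enough.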

Btw @Keats I have the implementation pretty much done, right now I'm just procrastinating a bit on the more boring tasks like tests and documentation. Some implementation details:

I've been pretty happy with the performance so far, even hundreds of fairly large images were typically processed in very reasonable time. I'll gather some numbers...

Edit: On my laptop:

650 ~10Mpix imgs → 237 seconds
100 ~13Mpix imgs → 41 seconds

(at thumb size of 240×160)

vojtechkral commented 6 years ago

It might be good to add some kind of an unobtrusive progress reporting (only activated when the processing takes a long time) so that the user knows the application isn't stuck...
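A minimal sketch of what I have in mind, in Python for brevity (the 2-second threshold, the message text and the `process_images` name are arbitrary choices for this example):

```python
import sys
import time
from typing import Callable, Iterable, List, TextIO


def process_images(
    images: Iterable[str],
    resize: Callable[[str], str],
    threshold: float = 2.0,
    out: TextIO = sys.stderr,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Resize every image; stay silent unless it takes longer than threshold."""
    todo = list(images)
    start = clock()
    reported = False
    results = []
    for done, image in enumerate(todo, start=1):
        results.append(resize(image))
        if clock() - start >= threshold:
            # Overwrite the same terminal line instead of scrolling.
            print(f"\rResizing images: {done}/{len(todo)}", end="", file=out, flush=True)
            reported = True
    if reported:
        print(file=out)  # finish the progress line
    return results
```

Quick builds print nothing at all; only a long-running batch starts showing a counter.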