adobe / helix-home

The home of Project Helix
54 stars 82 forks source link

Adding height and width attributes to img tags #146

Open trieloff opened 4 years ago

trieloff commented 4 years ago

Overview

People on the internet say it's a good idea for performance.

There are two problems:

  1. it needs custom CSS to look good
  2. We do not have height and width info in the pipeline

Details

A possible opt-in solution:

  1. In helix-markup.yaml configure img tags to be replaced with esi:include tags that use (@ramboz: is this possible?) a .img extension to point to helix-static
  2. helix-static fetches the image and renders height and width (in the future it may do even more like srcsets and DAM-powered art direction)

Proposed Actions

and now?

tripodsan commented 4 years ago

note that this puts extra strain on helix-static. reading the dimensions can be expensive, especially when the image is large. it is important to only download enough to read the image headers....

alternatively, we could store the dimensions in the blob metadata and retrieve them via a HEAD request. the image metadata could be added during after the image is uploaded to azure. as long as the meta data is missing, we would deliver the image tag w/o dimensions.

note, that the esi:include link needs to include all attributes to be placed on the image. (class, title, alt, etc).

stefan-guggisberg commented 4 years ago

Regarding determining image size: we are using image-size here. It's very popular and lightweight. IIRC it's not always possible to determine the dimension from the first 1k bytes...

trieloff commented 4 years ago

Great points.

note that this puts extra strain on helix-static.

We can also consider creating a new helix-image action for that. But in general, ahead-of-time metadata extraction beats just-in-time extraction.

alternatively, we could store the dimensions in the blob metadata and retrieve them via a HEAD request.

At first I thought this would only be useful for google and word, but we can just upload images from GitHub, too.

ramboz commented 4 years ago

In helix-markup.yaml configure img tags to be replaced with esi:include tags that use

I don't think this would work, I'd be surprised if the "expansion" library we use to generate the HTML tags supports esi tags… but might be worth a try

tripodsan commented 4 years ago

At first I thought this would only be useful for google and word, but we can just upload images from GitHub, too.

I rather though of a background process that scans the newly uploaded blobs...this could also help with the qr-code processing for larger images.

also, there was once the idea of uploading all the images to azure (see https://github.com/adobe/helix-static/issues/88)

stefan-guggisberg commented 4 years ago

I rather though of a background process that scans the newly uploaded blobs...this could also help with the qr-code processing for larger images.

FWIW: a while ago I wrote an Azure Blob Triggger that deals with QR code decoding: https://github.com/stefan-guggisberg/azure-blob-trigger

tripodsan commented 4 years ago

Azure Blob Triggger

and why don't we use this :-) ?

stefan-guggisberg commented 4 years ago

and why don't we use this :-) ?

someone voiced concerns regarding using cloud platform specific APIs IIRC...

tripodsan commented 4 years ago

someone voiced concerns regarding using cloud platform specific APIs IIRC...

I wonder who that was :-)

In hindsight, it might be a good solution to avoid overloading the runtime actions during document processing and also allow for async extraction of the non-document originating blobs.

trieloff commented 4 years ago

yes, I'd reopen adobe/helix-static#88

if we start with asynchronous asset processing, would it make sense to process with NUI?

trieloff commented 4 years ago

one more thought: I had a concern that async processing might lead to slow delivery or to caching responses without size info, but there is a good way around this: until the size info is available, deliver with short-lived cache headers, afterward with long lived headers.

stefan-guggisberg commented 4 years ago

if we start with asynchronous asset processing, would it make sense to process with NUI?

Wouldn't that be overkill for our simple use case (determining size and QR decoding)? I'd prefer Azure Blob Triggers, dead simple and way more efficient (they run on storage).

trieloff commented 4 years ago

ok. we just must not miss the point when two simple things become a dozen complex things.

davidnuescheler commented 3 years ago

after thinking about this for a while, i think we should definitely support this. while there are not a lot cases where this actually matters for CLS there are definitely some.

i think the implementation would rely on the extraction of the height and width of embedded images at the time of extraction (eg. word2md and gdoc2md) and possibly close to the point where the content addressable hash is calculated, we would parse the relevant parts of the jpg/gif/png headers to extract the height and the width. the height and the width would probably have to be stored in .md to allow for quick access during delivery, which opens a discussion on the notation we want to use.

trieloff commented 3 years ago

which opens a discussion on the notation we want to use.

There is a CommonMark proposal that has been implemented in pandoc that looks like this:

![](file.jpg){ width=50% }

The proposal hasn't found consensus and there is no remark-plugin either.

The full-to-spec alternative would be a plain <img> HTML tag.

tripodsan commented 3 years ago

I think we rather just want to transport the width/height, but not use it. so maybe encode it in the fragment like we do with the type:

![](https://main--repo--owner.hlx.live/media_11223344#image.png;width=1200;height=800

or

![](https://main--repo--owner.hlx.live/media_11223344#image.png?width=1200&height=800

or

![](https://main--repo--owner.hlx.live/media_11223344?width=1200&height=800#image.png

although the last one is probably not desired either.

trieloff commented 3 years ago

I like your second option best.