jgm / pandoc

Universal markup converter
https://pandoc.org

Optionally embed SVG as data URI instead of inlining? #9787

Closed allefeld closed 4 months ago

allefeld commented 4 months ago

When an SVG file is embedded in HTML output with --embed-resources, Pandoc inlines the SVG code into the HTML code. This differs from the behavior for other image file types, which are embedded by setting the img src attribute to a data URI. This special behavior makes sense, because the data URI unnecessarily obfuscates things.

However, it has the unfortunate side effect that the SVG image can no longer be easily saved to a file or opened in a separate tab from a browser. I.e. right-clicking on the image opens a context menu which lacks the entries "Open image in new tab", "Save image as..." etc. (tested in Brave, Chromium, and Firefox on Linux).

Example HTML to demonstrate the difference:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<body>

<p><svg id="svg4" role="img" width="300" height="130" xmlns:svg="http://www.w3.org/2000/svg" viewBox="0 0 300 130"><defs id="svg4_defs8"></defs>
  <rect width="200" height="100" x="10" y="10" rx="20" ry="20" fill="blue" id="svg4_rect2" />
</svg></p>

<p><img src="" /></p>

</body>
</html>

The two images look exactly the same, but the first cannot be saved to a file while the second can.

I find it useful for a user to be able to access an image separately from the main document.

Therefore my feature request: Could we have an option to disable the inlining of SVG files (with --embed-resources) such that data URIs are created instead?
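For readers unfamiliar with the data URI form mentioned above, here is a minimal sketch of how an SVG could be wrapped into an img tag. It is not Pandoc's implementation; the helper names `svg_data_uri` and `img_tag` are made up for illustration. Note that an SVG used as a standalone image needs the `xmlns="http://www.w3.org/2000/svg"` declaration on its root element.

```python
import base64
import html


def svg_data_uri(svg_text):
    """Return a base64 data URI for the given SVG source (hypothetical helper)."""
    encoded = base64.b64encode(svg_text.encode("utf-8")).decode("ascii")
    return "data:image/svg+xml;base64," + encoded


def img_tag(svg_text, alt=""):
    """Build an <img> element whose src is the SVG encoded as a data URI."""
    return '<img src="{}" alt="{}" />'.format(
        svg_data_uri(svg_text), html.escape(alt, quote=True)
    )


# The rectangle from the example above, written as a standalone SVG document.
EXAMPLE_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="300" height="130" '
    'viewBox="0 0 300 130">'
    '<rect width="200" height="100" x="10" y="10" rx="20" ry="20" fill="blue" />'
    "</svg>"
)

print(img_tag(EXAMPLE_SVG, alt="blue rounded rectangle"))
```

Because the result is an ordinary img element, the browser's usual image context menu entries should apply to it.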

jgm commented 4 months ago

Same thing on Safari. Well, maybe we need a command line option for this.

frederik-elwert commented 4 months ago

It also has other side effects. E.g., the default CSS for revealjs presentations sets the max width for img, but not for svg, which results in SVG images behaving differently depending on whether --embed-resources is set or not. So yeah, having inlined SVG is kind of a neat idea, but I agree sometimes just treating them as any image makes things easier.

jgm commented 4 months ago

E.g., the default CSS for revealjs presentations sets the max width for img, but not for svg

One solution to this would be to add a rule for svg to the default CSS.

jgm commented 4 months ago

Not sure what the right interface would be for this. One option would be to make it configurable on a per-image basis, e.g. by making inlining sensitive to a no-inline class on the image (which could be added globally by a Lua script if you want to affect all images). This would allow you the flexibility to inline some SVGs but not others.

Or there could be an option like --inline-svgs (maybe the default could change to not inlining). However, this is potentially confusing: it would only have an effect if --embed-resources were used (since the relevant processing happens in SelfContained), and that might not be transparent. Alternatively, we could change things so that --inline-svgs was independent of --embed-resources; this would allow you to inline SVGs on any document without embedding anything else...

allefeld commented 4 months ago

Because inlined SVGs act differently than SVGs included via img tags, my impression is that the former are meant for things which are part of the page design (e.g. a logo in a header or footer) while the latter are meant to be part of the content. If we follow this admittedly speculative interpretation, I think images included with Markdown syntax (![]()) should stay img tags, with a data URI if --embed-resources, while SVGs included in HTML templates should be inlined. If it is hard to distinguish between these two cases for Pandoc, a special class could be provided for use by the template author.

So my proposal would be: --embed-resources by default treats img tags referencing an SVG the same way as other image types, i.e. creates a data URI, unless the img tag has a class embed-inline.

jgm commented 4 months ago

Let's get input from @nxn-4-wdf who proposed the inline SVG feature in #8948.

jgm commented 4 months ago

As I understand it, a big part of the original issue was that prior to the change, the SVGs were converted to PNGs and then linked as data URIs in img tags. This was problematic because of a loss of resolution (moving from a vector to a bitmap format). I don't know why we didn't consider using a data URI with the encoded SVGs, since as far as I know SVGs can be used in img tags. But I see a lot of contradictory suggestions on this on the net, and I'm really not up on this. What's the current advice about using SVGs in img tags?

allefeld commented 4 months ago

Today, SVG is a standard image file format and can be used basically everywhere in HTML & CSS where other image files are used: MDN SVG as an Image. Displaying SVG via img is supported by all major desktop and mobile browsers: CanIUse SVG in HTML img element.

The difference to other image file formats is that SVG is an XML application and a subset of HTML5, and can therefore additionally be inlined into HTML. That, too, is widely supported: CanIUse Inline SVG in HTML5.

I didn't find any consistent advice on inline vs img either, just that according to MDN SVG as an Image there are the following restrictions if used via img vs inline:

  • JavaScript is disabled.
  • External resources (e.g. images, stylesheets) cannot be loaded, though they can be used if inlined through data: URLs.
  • :visited-link styles aren't rendered.
  • Platform-native widget styling (based on OS theme) is disabled.

However, since without --embed-resources Pandoc references SVG image files through img tags, I don't see a reason to change that with --embed-resources. Since SVG is a standard image file format, it should be treated the same as other formats. If the additional possibilities opened by inline SVG are relevant to a Pandoc user, that is orthogonal to the question of embedding.

But to keep those possibilities available, new proposal: By default, ![](image.svg) translates to <img src="image.svg">, and with --embed-resources to <img src="data:…">. To enable inline SVG, ![](image.svg){.inline-svg} translates to <svg …>…</svg> regardless of --embed-resources.
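The three cases of this proposal can be written down as a small decision function. This is only a sketch of the proposed behavior, not of anything Pandoc currently does: the function name `render_svg_image` and its parameters are hypothetical, `inline-svg` is the class name suggested here, and real inlining would need further fixes to the SVG code that are omitted.

```python
import base64
import html


def render_svg_image(src, svg_text, classes, embed_resources):
    """Return the HTML for an SVG image under the proposed rules.

    src             -- the path written in the Markdown source, e.g. "image.svg"
    svg_text        -- the contents of that file
    classes         -- classes attached to the image, e.g. ["inline-svg"]
    embed_resources -- whether --embed-resources was given
    """
    # Case 3: explicit opt-in to inlining, regardless of --embed-resources.
    # (Simplified: the SVG source is emitted unchanged.)
    if "inline-svg" in classes:
        return svg_text

    # Case 2: embedding requested, so treat SVG like any other image type.
    if embed_resources:
        encoded = base64.b64encode(svg_text.encode("utf-8")).decode("ascii")
        return '<img src="data:image/svg+xml;base64,{}" />'.format(encoded)

    # Case 1: default, a plain reference to the external file.
    return '<img src="{}" />'.format(html.escape(src, quote=True))
```

With this shape, inlining becomes a per-image decision that is independent of the embedding flag.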

Regarding #8948: Rendering SVGs to a bitmap for embedding is obviously not the right thing to do, because of the loss of resolution as pointed out in the issue. I suspect @nxn-4-wdf was simply not aware that SVGs work as data URI with img? And as they clarify, inlining is actually quite complicated because it needs various fixes to the SVG code.

jgm commented 4 months ago

But to keep those possibilities available, new proposal: By default, ![](image.svg) translates to <img src="image.svg">, and with --embed-resources to <img src="data:…">. To enable inline SVG, ![](image.svg){.inline-svg} translates to <svg …>…</svg> regardless of --embed-resources.

I like that.

jgm commented 4 months ago

One nice thing the inlining gives us, though, is compression. If you have many images with the same SVG, pandoc will use <use> tags to refer back to the earlier ones. With data URIs we'd have an encoded version of the entire SVG in each img tag.

nxn-4-wdf commented 4 months ago

Let's get input from @nxn-4-wdf who proposed the inline SVG feature in #8948.

Hi @jgm, I see two different use cases:

  1. My use case is about small SVG icons, which may be included several times in the HTML. Here, inlining the whole SVG text once and then referring to it with a <use> tag makes sense: it saves space and can be compressed. Another advantage of inlining is the ability to use advanced CSS styling methods: inside a <div> with a specific class, the icon stroke will be red, inside a <span> with another class, it will be blue, etc.
  2. @allefeld's use case is valid too: he wants the SVG to be embedded like any other image, as before.

Yes, inlining as with 1. breaks the existing behaviour, and should be made opt-in.

Using an attribute like {.inline-svg} (or the opposite) for each inline image seems a little bit heavy, especially if you have many images like this. Would there be a way to set flags/variables in the document, with the meaning "from this point on, embedded SVG images will be (inlined) or (img with data URI)"? I thought about YAML metadata blocks, but they do not work, because:

A document may contain multiple metadata blocks. If two metadata blocks attempt to set the same field, the value from the second block will be taken.

jgm commented 4 months ago

You can also create a filter that adds inline-svg class to images inside, e.g., a Div with a certain class.
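A rough sketch of such a filter follows, written in Python against pandoc's JSON AST (a Lua filter would work just as well). Two names are assumptions: `inline-svg` is the class proposed earlier in this thread, not an established pandoc feature, and the marker class `svg-inline-zone` on the Div is invented for this example.

```python
MARKER_DIV_CLASS = "svg-inline-zone"  # hypothetical class marking the Div
IMAGE_CLASS = "inline-svg"            # class name proposed in this thread


def add_class_to_images(node, inside=False):
    """Walk a pandoc JSON AST in place.

    Every Image found inside a Div carrying MARKER_DIV_CLASS gets
    IMAGE_CLASS appended to its class list.
    """
    if isinstance(node, list):
        for child in node:
            add_class_to_images(child, inside)
    elif isinstance(node, dict):
        tag = node.get("t")
        content = node.get("c")
        if tag == "Div":
            # Div content is [attr, blocks]; attr is [id, classes, key-values].
            if MARKER_DIV_CLASS in content[0][1]:
                inside = True
        elif tag == "Image" and inside:
            # Image content is [attr, alt-inlines, [url, title]].
            classes = content[0][1]
            if IMAGE_CLASS not in classes:
                classes.append(IMAGE_CLASS)
        for value in node.values():
            add_class_to_images(value, inside)
```

To use it with `pandoc --filter`, the script would additionally read the document with `json.load(sys.stdin)`, call `add_class_to_images` on it, and write the result with `json.dump(doc, sys.stdout)`.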