tsayen / dom-to-image

Generates an image from a DOM node using HTML5 canvas

Failed to generate image when HTML content includes the string "%E5" #251

Open gnllk opened 5 years ago

gnllk commented 5 years ago

Use case: description, code

<div id="my-node">%E5</div>

Library version

2.6.0

Browsers

zrajm commented 5 years ago

I'm running domtoimage.toPng() on a simple <div> with some text in it, and accidentally noticed that any URL-encoded data in my <div> is actually URL-decoded before being converted to PNG. Meaning that:

<div>%41</div>

Comes out as an image containing a capital "A" rather than the expected "%41".

I think your bug is just a special case of that, since %E5 on its own decodes to a malformed UTF-8 sequence.
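To illustrate the decoding behaviour described here, the following is a minimal sketch in plain JavaScript. It uses decodeURIComponent() only as a stand-in for the percent-decoding a browser applies to a data URI; the helper name tryDecode is hypothetical and not part of dom-to-image.

```javascript
// Hypothetical helper: percent-decode a string and report whether it succeeded.
// decodeURIComponent() stands in for the decoding applied to a data URI payload.
function tryDecode(text) {
  try {
    return { ok: true, value: decodeURIComponent(text) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

// "%41" is a complete one-byte UTF-8 sequence, so it decodes to "A".
const decodedA = tryDecode('%41');

// "%E5" is the lead byte of a three-byte UTF-8 sequence with no continuation
// bytes following it, so decoding fails.
const decodedE5 = tryDecode('%E5');

// The same lead byte followed by two continuation bytes decodes normally.
const decodedFull = tryDecode('%E5%A5%BD');

console.log(decodedA, decodedE5, decodedFull);
```

A browser may handle the malformed case differently from decodeURIComponent() (for example by substituting a replacement character rather than throwing), so this only shows why "%E5" is a worse case than "%41", not exactly what dom-to-image does internally.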

It is also worth noting that the resulting image is as wide as my div, leaving a lot of extra empty space on the left side of the image (since "A" is narrower than "%41").

Browser

zrajm commented 5 years ago

Okay. The fix was just to insert an encodeURIComponent() call in the right place. I forked and made the change.
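The exact location of the change in the fork is not shown in this thread. As a rough sketch of the idea only, assuming the library builds an SVG data URI from serialized XHTML, encoding the payload could look like this. The function name buildSvgDataUri and its parameters are hypothetical, not the library's API.

```javascript
// Hypothetical sketch: wrap serialized XHTML in an SVG <foreignObject> and
// percent-encode the whole payload before putting it in a data URI.
function buildSvgDataUri(xhtml, width, height) {
  const svg =
    '<svg xmlns="http://www.w3.org/2000/svg" width="' + width + '" height="' + height + '">' +
    '<foreignObject x="0" y="0" width="100%" height="100%">' + xhtml + '</foreignObject>' +
    '</svg>';
  // encodeURIComponent() turns a literal "%" into "%25", so text such as "%E5"
  // survives the decoding step applied to the data URI.
  return 'data:image/svg+xml;charset=utf-8,' + encodeURIComponent(svg);
}

const uri = buildSvgDataUri('<div xmlns="http://www.w3.org/1999/xhtml">%E5</div>', 100, 20);
const payload = uri.slice(uri.indexOf(',') + 1);

console.log(payload.includes('%25E5'));   // the "%" was escaped
console.log(decodeURIComponent(payload)); // decodes back to markup containing "%E5"
```

Decoding the payload once gives back the original markup with the literal text "%E5" intact, which is the behaviour the issue asks for.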

@tsayen, feel free to merge the change should you so desire. :)

gnllk commented 5 years ago

Thanks a lot. It worked.

zrajm commented 5 years ago

This works fine in Chrome, but I've just discovered that this fix actually breaks Unicode webfonts in Firefox. :( I'm using a TrueType Unicode font (with my own Private Use Area glyphs for sign language transcription, so I run into plenty of corner cases).

I'm trying to find a fix for that issue now.

zrajm commented 5 years ago

I guess I didn't read the code thoroughly enough when I did my fix... :/

The problem was caused at an earlier stage in the code by the line .then(util.escapeXhtml), which (despite its name) URI-encodes the characters # and \x0A (line feed). These characters were then double-encoded, resulting in garbage. The garbage happens to occur in Firefox because Firefox seems to preserve line feeds in the CSS code: \x0A first becomes %0A, is then double-encoded as %250A, and is later decoded back to %0A, which is a syntax error in the CSS and causes the following CSS instructions to be ignored.
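The double encoding can be reproduced in isolation. This is a small sketch, assuming an escaping step that percent-encodes only # and line feed as described; escapeHashAndNewline is a hypothetical stand-in, not the library's actual function.

```javascript
// Hypothetical stand-in for the earlier escaping step: only "#" and line feed
// are percent-encoded, everything else is left alone.
function escapeHashAndNewline(text) {
  return text.replace(/#/g, '%23').replace(/\n/g, '%0A');
}

const css = 'a { color: #fff; }\nb { color: red; }';

// First pass: "\n" becomes "%0A" and "#" becomes "%23".
const once = escapeHashAndNewline(css);

// Second pass: encodeURIComponent() escapes the "%" itself, giving "%250A".
const twice = encodeURIComponent(once);

// A single decode only undoes the second pass, so "%0A" is left in the CSS.
const decoded = decodeURIComponent(twice);

// Encoding exactly once round-trips cleanly.
const roundTrip = decodeURIComponent(encodeURIComponent(css));

console.log(twice.includes('%250A'), decoded === css, roundTrip === css);
```

After the single decode, the CSS contains the literal text "%0A" where the line feed used to be, which matches the syntax error described for Firefox. Encoding only once avoids the problem.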

Frankly I have no idea why the code was doing URI encoding at that stage. Nor why it attempted to hide the fact by calling the encoding function escapeXhtml. My fix now works in both Chrome and Firefox, though.