kiwix / kiwix-js

Fully portable & lightweight ZIM reader in Javascript
https://www.kiwix.org/
GNU General Public License v3.0
309 stars 135 forks source link

Images are not displayed loading ZIMs with WebP format images in icecat 60.7 #835

Closed Jaifroid closed 2 years ago

Jaifroid commented 2 years ago

Issue reported by a user via email. I have been able to reproduce it in icecat running in WSL-Ubuntu (Windows 11). The images are being inserted as data URIs rather than as BLOBs, which indicates that this version is using the polyfill to read WebP.

image

Jaifroid commented 2 years ago

I don't believe this is a bug with our code. It is an incompatibility with the very old version of Icecat that is in the GNU repository. Specifically, it appears to block data URIs. There are actually newer versions that have been built, but they are previews. I understand, but I haven't tried it, that newer versions are available from GNU Guix. But this is a bit outside of my competence.

@mossroy For Firefox OS we work around the lack of support for data URIs in CSS, though data URIs for images are not blocked. It appears (only viable explanation for me) that data URIs are blocked by Icecat 60.7. A workaround (untested of course) would be to convert the data URIs produced by the WebP polyfill to BLOBs, which would probably be a simple case of using the Filereader API. What do you think? It's a JQuery-specific fix/enhancement.

Jaifroid commented 2 years ago

Actually, we have a utility that converts dataURIs to Uint8Arrays. So workaround could be as simple as:

content = util.dataURItoUint8Array(dataURIContent);

(and then attach BLOB using existing routine).

mossroy commented 2 years ago

I tested with IceCat 91.5.0esr (on windows 10, using an unofficial build from https://github.com/muslayev/icecat-win64) and I don't reproduce the problem (using https://download.kiwix.org/zim/wikipedia/wikipedia_en_ray_charles_maxi_2022-01.zim), using kiwix-js extension 3.3.0. Of course, this version of IceCat supports webp. But I also tested with webp disabled (it's possible to do that: in about:config, you can set image.webp.enabled to false): I see that the polyfill converts the webp image to png, and displays it fine. It does not seem to come from built-in LibreJS extension (which does not says it blocks anything)

I also tested with IceCat 60.7 and 60.8 (on Windows 10), and reproduced the original issue.

mossroy commented 2 years ago

IceCat 60.7 does not seem to block all dataURIs: the image is properly displayed on https://dopiaza.org/tools/datauri/examples/index.php

Jaifroid commented 2 years ago

Hmm. So maybe it just can't cope with data URIs inserted after first paint? Yet we do know it can cope with BLOBs inserted at the same point, because that is how it works with ZIMs prior to WebP. This is where we insert the data URI:

https://github.com/kiwix/kiwix-js/blob/master/www/js/lib/uiUtil.js#L55

It would be a case of adding a step to convert the data URI to a Uint8Array, and then continuing with the insert BLOB part of the same function. It might even be cleaner, because we would know this uiUtil funciton will always produce a blob, rather than sometimes producing a dataURI. The name of the function is feedNodeWithBlob (currently inaccurate!).

Of course it still might not work, but theoretically it should.

mossroy commented 2 years ago

It does not come from when the image is painted: I could reproduce with a local copy of one of our dataURI image, copied from devtools of IceCat 60.7 into a single HTML file. What is more surprising is that opening the same HTML file on Firefox 97 fails too. So there might be something wrong in the base64 encoded png?

If I disable webp on Firefox 97, and copies the same HTML tag, the content looks different (but I did have the time to check in detail), and it displays correctly (even on IceCat)

Could it be a bug in webpHero, that only happens on IceCat 60.7?

Jaifroid commented 2 years ago

Interesting. The only other browsers I have to test the WebP conversion on are IE11 and Edge Legacy (testing on Windows Mobile, as it can no longer be installed on any Windows desktop). Also IE mode in Edge Chromium. In IE11 the dataURIs (which are very very long) are inserted and the images show fine. But there is an error logged in the WebP machine. I didn't see the error (or maybe I wasn't looking hard enough) in IceCat. One of those errors just seems to be about a missing image. But the other (the first one) is a bit more suspicious (just undefined). It's probably nothing.

image

mossroy commented 2 years ago

I do not see any error message on IceCat 60.7 (with latest Ray Charles archive) I tested on Firefox 60.7 ESR (which should have the same gecko version as IceCat 60.7): I don't have the issue on images. And it's much faster, for some reason

On IceCat 60.7, if I put a breakpoint in uiUtil where it decodes webp images, the uri is: uri: ""

On Firefox 60.7, same image, same breakpoint: uri: "…" (the string is truncated)

With IceCat, I see that the base64 String has a repeated pattern, which looks suspect to me (and does not render)

Jaifroid commented 2 years ago

Excellent sleuthing! So, converting to a BLOB would not work, clearly. Very curious. I guess the repeated pattern is EDIchAwHIc.

mossroy commented 2 years ago

It would be good to test the webp-hero demo with IceCat 60.7, but it's no longer on https://chasemoskal.com/webp-hero/ It should be possible to reproduce it locally with a checkout of the source code https://github.com/chase-moskal/webp-hero and a npm start, then browse http://localhost:5000/ with IceCat. It would be a way to prove (or not) that it's an issue between IceCat and webpHero

mossroy commented 2 years ago

I can confirm that the demo of webp-hero does not work on IceCat 60.7 (with no error in console log) image even after disabling all addons: image

where it properly works on Firefox 60.7 (that does not natively supports webp either. I could check with devtools that the webp image is replaced by the png dataURI) image

mossroy commented 2 years ago

I've opened https://github.com/chase-moskal/webp-hero/issues/43 for that

chase-moskal commented 2 years ago

hello folks, i've now published webp-hero@0.0.0-dev.28, which i believe, contains a fix for this issue.

i tested with icecat 60.7 on debian.

i believe the root cause was probably an icecat anti-fingerprinting measure that prevented canvas.toDataURL from working.

the following demo, should work in icecat, after allowing the blocked javascript: https://webp-hero.chasemoskal.com/?force?useCanvasElements

i've added a new useCanvasElements option for WebpMachine, that when enabled, replaces polyfilled webp <img> elements with <canvas> elements.

to opt-into this new behavior, instantiate the WebpMachine with the option enabled:

var webpMachine = new webpHero.WebpMachine({
  useCanvasElements: true,
})

please let me know if this works for you folks, or if more adjustments are needed

:beers:

chase-moskal commented 2 years ago

if you need to decode directly, i've exposed the function webpMachine.decodeToCanvas(canvas, webpData)

https://github.com/chase-moskal/webp-hero/blob/ece6ade20f702c890641a3e548cd5e64e14fdf2f/source/webp-machine.ts#L43

mossroy commented 2 years ago

Many thanks @chase-moskal for investigating and releasing a new version so quickly! An anti-fingerprinting feature of IceCat could likely be the root cause. I suppose they have improved this feature in more recent versions of IceCat.

I've upraded to dev.28, but it was not enough.

The reason is that we don't run .polyfillDocument(), but instead convert the images ourself. If I remember well, it's because we inject the webp binary as a blob (because we read in the ZIM file), so our images look like:

<img data-kiwixurl="I/Lovemetonight.jpg.webp" decoding="async" data-file-width="300" data-file-height="300" data-file-type="bitmap" loading="lazy" src="blob:moz-extension://c33b55f3-9e5c-4ecb-8204-052a23c7e4d3/ef2d063d-c3ae-45b5-8b09-b170f37d1b95" width="220" height="220">

which is probably not handled by .polyfillDocument().

I managed to make it display an image on PR https://github.com/kiwix/kiwix-js/pull/838, by using the .decodeToCanvas you mentioned, so it's promising.

But it raises Error: cannot decode when already busy when there are many images. It's logical: .decodeToCanvas runs asynchronously, and does not allow several image conversions in parallel.

We would probably need a function similar to .decode, that would return a Promise, so that we can make sure to handle images sequentially.

Jaifroid commented 2 years ago

Just to clarify, we can't use polyfillDocument becasue we have to extract each image from a compressed archive. The URLs have to be processed by us first. Once we have each image's content as a Uint8Array, we use webpMachine.decode(). We then insert the resulting dataURI into the node ourselves:

            webpMachine.decode(content).then(function (uri) {
                // DEV: WebpMachine.decode() returns a data: URI
                // We callback before the node is set so that we don't incur slow DOM rewrites before processing more images
                if (callback) callback(uri);
                node.setAttribute(nodeAttribute, uri);
            })
Jaifroid commented 2 years ago

if you need to decode directly, i've exposed the function webpMachine.decodeToCanvas(canvas, webpData)

https://github.com/chase-moskal/webp-hero/blob/ece6ade20f702c890641a3e548cd5e64e14fdf2f/source/webp-machine.ts#L43

So we could in fact use this instead of webpMachine.decode(), right?

mossroy commented 2 years ago

So we could in fact use this instead of webpMachine.decode(), right?

That's what I did in the PR. But this function does not return a Promise that we could use to know when we can start handling next image.

chase-moskal commented 2 years ago

@Jaifroid @mossroy

in the fresh release webp-hero@0.0.1:

Jaifroid commented 2 years ago

@chase-moskal Excellent, thank you! We'll try out new release!

mossroy commented 2 years ago

@chase-moskal I've updated webp-hero to 0.0.1 from https://unpkg.com/webp-hero@0.0.1/dist-cjs/webp-hero.bundle.js This function name replaceImageWithCanvas can be found in the js file (textual search), but I don't find it at runtime in our webpMachine instance:

image

Am I doing it wrong (that's quite possible), or is the function not exported?

chase-moskal commented 2 years ago

@mossroy the static functions will be available on the constructor.

// access the function directly on the constructor function
WebpMachine.replaceImageWithCanvas(image, canvas)

// you may find the constructor on the webpHero object,
// depending on how you loaded webp-hero
webpHero.WebpMachine.replaceImageWithCanvas(image, canvas)

// access the function from an instance
webpMachine.constructor.replaceImageWithCanvas(image, canvas)
mossroy commented 2 years ago

Thanks @chase-moskal , it's working fine through the constructor.

Our main concern now is to know when we should use decodeToCanvas instead of decode. If we don't find a reliable way to detect that, we'll add a setting in the app. By default, it would use decode.

Jaifroid commented 2 years ago

@mossroy Looking at the source code for WebpMachine, I think we should always use decodeToCanvas() because we can then extract a BLOB from the canvas in non-anti-fingerprinting browsers, instead of a dataURI. The reason this isn't any kind of degradation is that decode() actually calls dcodeToCanvas(), waits for its result and then does canvas.toDataUri(). In other words, decodeToCanvas() is upstream of decode(). It's more flexible for us to have the underlying canvas because then we can simply get a BLOB and handle it with our existing function to insert the BLOB URL into the DOM.

See https://github.com/chase-moskal/webp-hero/blob/master/source/webp-machine.ts#L106

chase-moskal commented 2 years ago

:tada: webp-hero@0.0.2 now auto-detects a default value for useCanvasElements, allowing it to work out-of-the-box on icecat and other browsers

Jaifroid commented 2 years ago

Wow, you rock @chase-moskal! I'll try these out: it will simplify our code, and will also help with Service Worker mode where I think polyfillDocument() is used (your polyfill is included in the ZIM).

Jaifroid commented 2 years ago

@chase-moskal How do I use detectCanvasReadingSupport()? I can see it in the bundle I got from https://unpkg.com/webp-hero@0.0.2/dist-cjs/webp-hero.bundle.js, but it's not exposed directly on webpMachine, nor on webpMachine.constructor. Sorry to be a bit dense!

chase-moskal commented 2 years ago

@Jaifroid

sorry about the confusion.

Jaifroid commented 2 years ago

Thanks @chase-moskal , I've now incorporated the new detection function into the PR:

Interestingly, when testing the new implementation, after having manually disabled it, when I turned it back on I got a the popup below appearing in IceCat. It seems the IceCat browser contains an option to disable the anti-fingerprinting on a per-domain basis. But it's clearly not well exposed, as this is the first time I've seen it. It appears to be exposed by the testing technique.

image

Jaifroid commented 2 years ago

And indeed, if I answer "Allow data access" to that question (though I unchecked Always remember my decision, so that I can continue testing), then the webpHero detect function returns true (i.e. workaround not needed), and images load OK...

Jaifroid commented 2 years ago

OK, it turns out that IceCat provides a way to turn off the canvas blocking. User needs to click the little image icon (circled in screenshot below), and turn it off. I wish we'd known this earlier!

@mossroy Do you want to keep #838 ? I think we should definitely upgrade to use the latest WebpHero. An argument in favour is that the detection is generic, so it may apply to other browsers or browsers that have an extension that blocks canvas fingerprinting. Clearly they'd have to be old browsers that don't support WebP natively to use this. An other argument in favour is that it took us a long time to find this setting by accident, whereas with the workaround detection, it works "out of the box". But it's up to you!

image

Jaifroid commented 2 years ago

@chase-moskal A big thank you for your super-rapid response on this issue, adding all this support upstream in the Polyfill. It's been very good to collaborate with you across these projects!