openzim / mwoffliner

Mediawiki scraper: all your wiki articles in one highly compressed ZIM file
https://www.npmjs.com/package/mwoffliner
GNU General Public License v3.0
275 stars 72 forks source link

Fatal CORS error caused by injected mwOffliner script `wm_mobile_override_script.js` in Kiwix Serve and Kiwix JS / PWA #2075

Open Jaifroid opened 1 month ago

Jaifroid commented 1 month ago

I was bashing my head against a brick wall wondering why the new Wikivoyage app release with the August Wikivoyage is crashing with a fatal CORS error when clicking on external links and custom protocols (URI schemata). Eventually I discovered that it is due to strange code in this wm_mobile_override_script.js injected by mwOffliner:

https://github.com/openzim/mwoffliner/blob/main/res/wm_mobile_override_script.js#L10

For convenience, here is the offending bit:

    const linkElements = document.querySelectorAll('a');
    const disabledElems = Array.from(supElements).concat(Array.from(linkElements))
    disabledElems.forEach((elem) => {
         elem.addEventListener('click', (event) => {
             event.stopPropagation();
          }, true);
    });

This code is adding event listeners to every single anchor element in the article (iframe in the case of Kiwix Serve and Kiwix JS/PWA). This code intercepts clicks, but because it uses stopPropagation() and true it disables subsequent processing by click detection in our apps. This processing is crucial for any JS-based solution with a sandboxed iframe, because it is the only way we can intercept, without altering the DOM, custom protocols (and a bunch of other things, including external links) that will violate the sandbox of the iframe, produce a CORS error, and effectively disable the app entirely until the user restarts.

To test in the latest Wikivoyage on Kiwix Serve, go to https://library.kiwix.org/viewer#wikivoyage_en_all_maxi_2024-08/A/Paris, find an external link (they are oddly coloured white, so a bit hard to see) and click. You'll get a screen like in screenshot at bottom of this post (same in KJS).

In general, I don't think it's a good idea for mwOffliner to inject scripts that alter the DOM so extensively (every anchor) client-side, because it isn't safe in all readers and especially not in JS-based readers. IMHO, it should do any changes needed either by reforming the HTML as necessary at scrape time, or through inclusion of CSS override scripts.

Does anyone know what the script is meant to do and why it's needed? I can remove it before it is injected in the PWA and KJS, but that's a dirty workaround.

image

audiodude commented 1 month ago

Thanks for the excellent work finding this!

I think I am a bit lost on what the "proper" or even "traditional" way of handling issues like this in mwoffliner is. It looks like this code (https://github.com/openzim/mwoffliner/blob/main/src/renderers/abstractMobile.render.ts#L17) specifically tries to include scripts that are part of the HTML output of the API, at least ones that have /mobile in their path.

So it seems the approach, as implemented, is to capture scripts that are part of the API output, download them and include them in the ZIM, and re-write the HTML code so that they are included using the new path.

IIUC, the problem is that, at least in this case, the script is destructive and needs to be withheld. This is probably a symptom of the fact that mobile-html is not "for" us, it's for the wikipedia mobile app. So seemingly nonsensical code like this is either designed to interface with local app-side JavaScript, or meant as a workaround for a native behavior.

On the flip side, presumably some amount of this JavaScript is actually necessary for the page to function? I don't have any intuition on what parts that might be.

audiodude commented 1 month ago

Sorry apparently I'm very confused.

The script you linked is part of the mwoffliner repo and is being included by us!

image

I'm looking through the history of that file to try to discern what it's supposed to do.

Jaifroid commented 1 month ago

Thanks, @audiodude!