openzim / warc2zim

Command line tool to convert a file in the WARC format to a file in the ZIM format
https://pypi.org/project/warc2zim/
GNU General Public License v3.0

Zimit2: HTML demos on developer.mozilla.org (MDN) pages are not working #293

Open benoit74 opened 1 month ago

benoit74 commented 1 month ago

See https://dev.library.kiwix.org/content/developer.mozilla.org_en_all_2024-05/interactive-examples.mdn.mozilla.net/pages/tabbed/section.html for an isolated test case.

This worked well in zimit1 and works well in replayweb.page.

URL and other arguments passed by wombat to our URL rewriting function are unusual:

urlRewriten:
    - current_url: http://135.181.181.97:8888/content/devmdn-bcd_2024-05/interactive-examples.mdn.mozilla.net/pages/tabbed/section.html
    - orig_host: interactive-examples.mdn.mozilla.net
    - orig_scheme: https
    - orig_url: https://interactive-examples.mdn.mozilla.net/pages/tabbed/section.html
    - prefix: http://135.181.181.97:8888/content/devmdn-bcd_2024-05/
    - url: http://mp_/blob:57e8dbed-cfaf-4e0b-a259-a23170aaf29c/https://interactive-examples.mdn.mozilla.net/pages/tabbed/section.html
    - useRel: false
    - mod: if_
    - doc: undefined
    - finalUrl: http://135.181.181.97:8888/content/devmdn-bcd_2024-05/mp_/blob%3A57e8dbed-cfaf-4e0b-a259-a23170aaf29c/https%3A//interactive-examples.mdn.mozilla.net/pages/tabbed/section.html

Both mod = if_ and a url whose host part is mp_ followed by a blob: segment look like special cases that our rewriting function does not handle.
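To make the oddity concrete, here is a minimal Python sketch, not the actual warc2zim or wombat code. It assumes the rewriter treats `mp_` as a host name and percent-encodes the rest of the path, which reproduces the `finalUrl` logged above. The helper names `naive_final_url` and `split_blob_url` and the regular expression are hypothetical, for illustration only.

```python
import re
from urllib.parse import quote, urlsplit

# Values copied from the log above.
PREFIX = "http://135.181.181.97:8888/content/devmdn-bcd_2024-05/"
URL = (
    "http://mp_/blob:57e8dbed-cfaf-4e0b-a259-a23170aaf29c/"
    "https://interactive-examples.mdn.mozilla.net/pages/tabbed/section.html"
)

# Hypothetical pattern: a two-letter modifier such as "mp_" in the host
# position, then "blob:<uuid>/", then the original absolute URL.
BLOB_RE = re.compile(
    r"^https?://(?P<mod>[a-z]{2}_)/blob:(?P<uuid>[0-9a-f-]{36})/(?P<orig>https?://.+)$"
)


def naive_final_url(prefix: str, url: str) -> str:
    """Treat the URL as ordinary: host + percent-encoded path under the prefix."""
    parts = urlsplit(url)
    # quote() keeps "/" but encodes ":" as %3A, as seen in the logged finalUrl.
    return prefix + parts.netloc + quote(parts.path)


def split_blob_url(url: str):
    """Return (modifier, blob id, original URL), or None if the shape differs."""
    match = BLOB_RE.match(url)
    if match is None:
        return None
    return match.group("mod"), match.group("uuid"), match.group("orig")


if __name__ == "__main__":
    print(naive_final_url(PREFIX, URL))
    print(split_blob_url(URL))
```

Under that assumption, the blob identifier ends up inside a ZIM path that cannot exist, which would be consistent with the demo failing to load. Detecting the shape with something like `split_blob_url` only shows where the original URL could be recovered; whether that is the right fix is a question for wombat.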

benoit74 commented 1 month ago

Postponing to 2.1; this is not an easy problem to solve.

Jaifroid commented 3 weeks ago

It looks like this might be related to the encoding of the querystring? In PWA I see this. Note that the app tried to call the contents with an encoded querystring, as we had decided, but the contents are actually stored in the ZIM with a plain question mark and other separators unencoded. EDIT: This is incorrect, please ignore.

Jaifroid commented 3 weeks ago

Apologies, that was a red herring. That part of the browser dev tools simply displays requests in encoded form. It's the same for video that DOES work, so it's not that.

These code demo panels seem to work fine when they show how JavaScript functions work, but they don't work when the content is HTML, which is a bit strange.

benoit74 commented 1 week ago

To be investigated in 2.1, at least to better understand the problem.

benoit74 commented 1 week ago

An explanation and a dirty hack have been found, and the hack works well. A ticket has been opened in webrecorder/wombat to find the proper fix. There is definitely lots of hope ^^