kiwix / libkiwix

Common code base for all Kiwix ports
https://download.kiwix.org/release/libkiwix/
GNU General Public License v3.0

kiwix-serve: issues with the viewer for ZIMs using fragment in the URLs #1120

Open benoit74 opened 2 months ago

benoit74 commented 2 months ago

I'm hoping this is not a duplicate issue: several of us are convinced this is a known bug/limitation, but we struggle to find the corresponding issue.

General problem description

Why it is becoming important

The youtube and freecodecamp scrapers (and soon Kolibri, and maybe others) use Vue.JS; this means we have a single HTML page and a single JS file, and navigation is handled by JS code.

For proper ZIM operation, we are using the fragment to navigate from page to page (what Vue.JS calls "hash mode" history, see https://router.vuejs.org/guide/essentials/history-mode). This is needed because we really have only one HTML ZIM entry in the ZIM, since we have a single page application.

The problem is however not specific to the Vue.JS framework; it is in fact present even without any JS framework. It is just far more visible/important in this scenario.

Real-world example

See: https://library.kiwix.org/viewer#canadian_prepper_winterprepping_en/index.html. When you click on any video, the proper video is loaded but nothing changes in the browser URL bar. If you want to reload or bookmark a video, you can't.

But https://library.kiwix.org/content/canadian_prepper_winterprepping_en/index.html is working OK.
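To make the difference between the two URLs concrete, here is a minimal sketch that only splits them with Python's standard `urllib.parse`. The `#/watch/some-video` route is a hypothetical example of a hash-mode route, not one taken from the actual ZIM. In the viewer URL the fragment is already taken by the ZIM path, whereas in the `/content/` URL the ZIM path is in the URL path and the fragment stays free for the application.

```python
from urllib.parse import urlsplit


def describe(url: str) -> dict:
    """Return the path and fragment of a URL, as a browser would split them."""
    parts = urlsplit(url)
    return {"path": parts.path, "fragment": parts.fragment}


# Viewer URL from this issue: the ZIM path lives in the fragment.
viewer = describe(
    "https://library.kiwix.org/viewer#canadian_prepper_winterprepping_en/index.html"
)

# Content URL from this issue, plus a HYPOTHETICAL hash-mode route appended.
content = describe(
    "https://library.kiwix.org/content/canadian_prepper_winterprepping_en/index.html"
    "#/watch/some-video"
)

print(viewer)
print(content)
```

With the viewer URL, the only fragment slot already holds `canadian_prepper_winterprepping_en/index.html`, so there is no obvious place left for the application's own route.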

How other readers are behaving

See https://github.com/kiwix/overview/issues/107 for a complete status.

Alternatives

Should this bug be too complex to solve, two workarounds are possible with our current usage of Vue.JS. They may not be possible with every JS framework, and they will not work for scenarios without a JS framework.

The main drawback of these alternatives is that they require modifications to all readers and do not cover all use cases of the fragment.

Alternative 1

In Vue.JS, it is possible to use "HTML5 mode" history, where the fragment is not used but the JS framework intercepts the URL to display the proper page (see https://router.vuejs.org/guide/essentials/history-mode).

For this to work, we need a new feature in the ZIM with a "catch-all" ZIM path metadata and a "catch-all" logic in readers. When any path requested is not found by the reader in the ZIM, the reader will return the content of the "catch-all" ZIM path metadata (the index.html in our case).

It obviously has side effects, since we no longer get 404 errors for really bad "paths", but in our scenario this is handled by the Vue.JS application, which displays an error page on its own (just like it does currently if the user passes a bad fragment value).
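The "catch-all" logic of Alternative 1 could look roughly like the sketch below. It is only an illustration under stated assumptions: the ZIM is modelled as a plain dict, and `resolve` and its `catch_all` parameter are hypothetical names, not part of libzim or libkiwix.

```python
from typing import Dict, Optional, Tuple


def resolve(
    entries: Dict[str, bytes], path: str, catch_all: Optional[str]
) -> Tuple[int, bytes]:
    """Look up `path`; fall back to the catch-all entry when it is missing.

    `entries` stands in for the ZIM content and `catch_all` for the proposed
    "catch-all" ZIM path metadata (both are assumptions for this sketch).
    """
    if path in entries:
        return 200, entries[path]
    if catch_all is not None and catch_all in entries:
        # Unknown path: serve the single-page application instead of a 404,
        # and let the JS router decide what to display.
        return 200, entries[catch_all]
    return 404, b"Not found"


zim = {"index.html": b"<html>app</html>", "assets/app.js": b"// app code"}

with_catch_all = resolve(zim, "watch/some-video", "index.html")
without_catch_all = resolve(zim, "watch/some-video", None)
```

Here `with_catch_all` carries the content of `index.html` with a 200 status, while `without_catch_all` is a plain 404, which is the side effect described above: with a catch-all, only the application can tell a bad path from a good one.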

Alternative 2

In youtube and freecodecamp (and other scrapers), we are already creating additional ZIM entries for every item that we want to be marked as "front" and searchable either in suggestions or full-text search. This is mandatory for proper search operation. These entries have custom indexing data for full-text search, and their content is a very minimal HTML page which redirects immediately to the URL with the fragment.

We already know this HTML redirect is kinda a tweak, and should be better supported by the ZIM standard with "real" redirects. Maybe these entries (and their additional future redirect metadata) could be used as well by readers (and especially kiwix-serve) to display the proper fragment.

Sample link (an entry which could already be displayed / bookmarked today without any ZIM modification, but readers have no idea how to find these entries for now): https://library.kiwix.org/viewer#canadian_prepper_winterprepping_en/index/winter_apocalype_survival_must_have_item-z7Ro
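For readers unfamiliar with the redirect entries mentioned in Alternative 2, here is a rough sketch of how such a minimal HTML page could be generated. This is not the scrapers' actual code: `build_redirect_entry` is a hypothetical helper, the meta-refresh technique is an assumption (the real entries may redirect differently), and the target route is made up for illustration.

```python
from html import escape


def build_redirect_entry(title: str, target: str) -> str:
    """Build a minimal HTML page that immediately redirects to `target`.

    ASSUMPTION: uses a zero-delay <meta http-equiv="refresh">; the actual
    scrapers may use another redirect mechanism.
    """
    return (
        "<!DOCTYPE html>\n"
        "<html><head>"
        f"<title>{escape(title)}</title>"
        f'<meta http-equiv="refresh" content="0; url={escape(target)}">'
        "</head><body></body></html>"
    )


# HYPOTHETICAL target: relative link back to the single page, plus a fragment.
page = build_redirect_entry("Some video title", "../index.html#/watch/some-video")
print(page)
```

The title is kept in the page so that the entry still has something meaningful to show in suggestions; the fragment in `target` is the only part the single-page application actually needs.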

benoit74 commented 2 months ago

Edit: fix first comment to mention that drawback is identical in both alternatives.