openzim / mwoffliner

Mediawiki scraper: all your wiki articles in one highly compressed ZIM file
https://www.npmjs.com/package/mwoffliner
GNU General Public License v3.0

Wiktionary translations can't be shown #1033

Open ghost opened 4 years ago

ghost commented 4 years ago

@danielzgtg commented on Feb 23, 2020, 3:21 AM UTC:

Describe the bug

The translations in Wiktionary cannot be viewed anymore on the latest ZIMs. They start collapsed and do not expand when tapped. This applies to translations, conjugations, and collapsible tables in general.

This happens with:

This does not happen with wiktionary_en_all_nopic_2017-08.zim, where things worked fine.

Expected behavior

To be able to see the translations.

Steps to reproduce the behavior:

  1. Tap on "Library" in the overflow menu
  2. Open wiktionary_en_all_maxi_2020-01.zim
  3. Open the definition for "wiki"

Screenshots

Actual behavior:

Screenshot_20200222-191642

Example of expected behavior in older ZIM:

Screenshot_20200222-191715

Environment

This issue was moved by kelson42 from kiwix/kiwix-android#1796.

ghost commented 4 years ago

@kelson42 commented on Feb 23, 2020, 7:14 PM UTC:

@danielzgtg Thank you for your bug report. This is a bug in the content itself, so I am moving the issue to the right repository.

LakmaNeha commented 4 years ago

@kelson42 I am interested in taking up this issue. Can I work on this?

kelson42 commented 4 years ago

@LakmaNeha Thx, would be great

LakmaNeha commented 4 years ago

@kelson42 This issue is not limited to translations; it affects every section that uses navFrames. The image below shows one affected section:

image

After some digging, I found that the toggle behaviour of the navFrames is provided by the extension ext.gadget.defaultVisibilityToggles (link). The startup script does register the defaultVisibilityToggles extension, but I doubt that the registration actually takes effect, because the event handlers for navFrames > navHead are never attached, hence the unexpected behaviour. Can you give me some pointers on how to debug this?

Thanks.

kelson42 commented 4 years ago

@LakmaNeha Thank you very much for your last comment, we will try to fix the problem rapidly.

Jaifroid commented 4 years ago

Just to add that this seems to be a CSS issue. Kiwix JS Windows, in its standard mode (which applies its own locally cached copy of Wikimedia CSS, not always 100% updated) is able to display these translations and inflections. But when the setting to use the CSS from the ZIM is applied, it is no longer able to display them. Please note that Kiwix JS Windows does not run JavaScript from the ZIM, which is what leads me to believe that the issue is with the CSS.

A rule needs to be added to the override CSS provided by mwoffliner to display these sections in an initially open state. In general, all sections should always be initially open, and then they should be closed by JavaScript. This ensures that clients that do not execute JS-in-the-ZIM can always display all the information on a page. See #962 for extensive discussion of a closely related issue.

LakmaNeha commented 4 years ago

Please note that Kiwix JS Windows does not run JavaScript from the ZIM, which is what leads me to believe that the issue is with the CSS.

Hi @Jaifroid, thanks for the response. First of all, sorry for not being clear about the environment: I am using Kiwix for macOS. Does the same apply to Kiwix for macOS as well?

Secondly,

A rule needs to be added to the override CSS provided by mwoffliner to display these sections in an initially open state.

Inspecting the site CSS:

image

I found that a rule already exists which keeps the sections visible when JS is not supported and applies display: none when JS is supported. You can verify this by disabling JavaScript and reloading the wiki page: all the translations and inflections are then open by default. That is why I suspected JS.

Jaifroid commented 4 years ago

Yes, you're right that there is that rule in the css file with the snappy filename:

-/s/css_modules/skins.minerva.base.reset|skins.minerva.content.styles|ext.cite.style|site.styles|mobile.app.pagestyles.android|mediawiki.page.gallery.styles|mediawiki.skinning.content.parsoid.css

But the .client-js class has not been added by JS, it is already in the <body> element of the html file:

image

The class .NavFrame is on the parent div, and the class .NavContent is on the hidden div, with the result that the div containing the translations table has display: none:

image

As I say, this is in Kiwix JS Windows, with the client set to use the CSS from the ZIM, and without executing any JavaScript in the ZIM. Therefore, it is definitely not JS that is hiding the div, it is the CSS rule you have found. The confusion here, I think, is that you perfectly reasonably assume (from the name) that the .client-js class has been added to the <body> dynamically, but in fact it is already in the raw HTML that comes straight from the ZIM.

Jaifroid commented 4 years ago

A fix for this would be to change the CSS rule in that file to:

image

i.e., add details:not([open]) to that rule. The translation table would then be visible until the JS executed onload removes the open, which is how all the other details-summary tags work.

kelson42 commented 4 years ago

The main question for this kind of bug is "Why does it not work out of the box?" MWoffliner is designed to mirror js_modules, CSS and JS, and to use the proper js_modules in each page, so what is going wrong here?

Jaifroid commented 4 years ago

@kelson42 The code for opening and closing sections is manipulated by mwoffliner, and the history of this is quite long. See #633, #677 and #962. Mwoffliner wraps details - summary tags around content that should be opened and closed with a click, adds a polyfill for browsers that do not support details-summary, and adds its own JavaScript for closing all sections when the page is first loaded. In principle, all sections should be open until the JS specifically closes headings other than the top-level heading. However, in the case of the Translations and Inflection sections of Wiktionary pages, the CSS rule cited above is hiding the sections despite the fact that the details tag wrapped around it is clearly marked as "open":

image

So we need to override that CSS rule, so that the section is visible until our inserted JS hides it by removing the "open" attribute. The best place to do that is probably not in the file I mention above, but in the custom override CSS file that mwoffliner adds (maybe it's called "inserted_style.css" or "inserted_style_mobile.css").

LakmaNeha commented 4 years ago

@Jaifroid Instead of fixing this by overriding a CSS rule, shouldn't we fix the root cause, which is: "Why is the client-js class already present in the ZIM file?" That may cause other problems as well. Ideally, client-nojs would be on the body by default, and the JS would take care of replacing it with client-js or leaving it as it is. That way we provide the default behaviour even if JS is not supported.

But it is done exactly the other way round: the startup script code replaces client-js with client-nojs if JS is not supported.

image

What do you think?
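
A minimal sketch of the ordering proposed above, assuming the raw HTML ships with client-nojs and a startup script upgrades it only when JavaScript actually runs. The name enableClientJs is hypothetical and is not part of mwoffliner or MediaWiki, and the choice of document.documentElement as the element carrying the class is also an assumption:

```javascript
// Hypothetical helper: upgrade a class attribute value from "client-nojs"
// to "client-js", leaving every other class as it is.
function enableClientJs(className) {
  const classes = className.split(/\s+/).filter(Boolean);
  const upgraded = classes.map((c) => (c === "client-nojs" ? "client-js" : c));
  return upgraded.join(" ");
}

// In a browser, a startup script could apply it roughly like this.
// The guard lets the sketch run where no DOM exists (e.g. under Node).
// Assumption: the class lives on the root element.
if (typeof document !== "undefined") {
  document.documentElement.className = enableClientJs(
    document.documentElement.className
  );
}
```

With this ordering, a reader that never executes the script keeps client-nojs, so rules scoped to .client-js never apply.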

Jaifroid commented 4 years ago

@LakmaNeha Yes, that sounds like it could work, though it would need careful testing for unintended consequences. I think just removing client-js would deactivate a number of the hiding rules, and then the startup script can add client-js if the reader supports JS.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

kelson42 commented 4 years ago

@bakshiutkarsha @Jaifroid If I understand correctly, we have our own <details> nodes in the DOM for the section headings (we control that part of the HTML), and we have <details> nodes coming from the HTML we retrieve. Does our JS code for collapsing/uncollapsing sections interfere with the content code here? If so, please make sure our section code never interferes with content code by setting the necessary class/id/... attributes.

bakshiutkarsha commented 4 years ago

So, the problem was interference from the custom section code that we have written. I have now made some changes based on class="NavFrame", and it is working.

kelson42 commented 4 years ago

@bakshiutkarsha Your solution is not good. Make sure our code does not interfere, by changing things in a way that does not rely on custom content. What will happen tomorrow if a wiki editor decides to call this « navbar2 » instead of « navbar »?

bakshiutkarsha commented 4 years ago

We do control our details node in DOM for the section heading but there are NO details coming from the HTML we receive, so the solution you are proposing won't work IMO.

In the online version, when the page loads, some block of JS code overrides this style:

.client-js .NavFrame .NavContent {
    display: none;
}

with display: block. That override is not present in the offline version.

Also, a click handler using the function VisibilityToggles gets registered online, and it is not present in the offline version either. So if we want it to behave exactly like the online version, these two differences need to be explained.

@Jaifroid Do you have any insights about the click event registration and which part of code is getting called at the very first time which is overriding the style?

kelson42 commented 3 years ago

@MananJethwani I believe this bug might be fixed now?

MananJethwani commented 3 years ago

@kelson42 I will check and let you know

MananJethwani commented 3 years ago

@kelson42 This is not fixed. It looks like something else is causing the problem. I will try to solve it.

MananJethwani commented 3 years ago

@kelson42 This is again related to the incomplete module list (#1391). Modules like 'ext.gadget.VisibilityToggles' and 'ext.gadget.defaultVisibilityToggles' need to be included, and other base modules, namely 'jquery.cookie', 'mediawiki.storage' and 'mediawiki.cookie', need to be added to config.ts.

kelson42 commented 3 years ago

@MananJethwani Can you please explain exactly what is wrongly delivered by the MediaWiki API, so the upstream bug is clear?

MananJethwani commented 3 years ago

@kelson42 For the toggle button to appear, Wiktionary uses ext.gadget.VisibilityToggles, which itself depends on ext.gadget.defaultVisibilityToggles and on parts of the mw object from 'jquery.cookie', 'mediawiki.storage' and 'mediawiki.cookie', none of which are mentioned in the module list.
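
The dependency chain described here can be sketched as a transitive-closure walk over a module registry. The registry below is a hand-written illustration based only on the module names in this comment; it is not the real ResourceLoader data, and collectModules is a hypothetical helper rather than an mwoffliner function:

```javascript
// Illustrative registry: module name -> direct dependencies.
// The edges follow the description in this thread and may not match
// what MediaWiki actually reports.
const exampleRegistry = {
  "ext.gadget.VisibilityToggles": [
    "ext.gadget.defaultVisibilityToggles",
    "jquery.cookie",
    "mediawiki.storage",
    "mediawiki.cookie",
  ],
  "ext.gadget.defaultVisibilityToggles": [],
  "jquery.cookie": [],
  "mediawiki.storage": [],
  "mediawiki.cookie": [],
};

// Return every module reachable from `roots`, with dependencies listed
// before the modules that need them. Unknown names are treated as leaves.
function collectModules(registry, roots) {
  const seen = new Set();
  const ordered = [];
  const visit = (name) => {
    if (seen.has(name)) return; // also protects against cycles
    seen.add(name);
    for (const dep of registry[name] || []) visit(dep);
    ordered.push(name);
  };
  roots.forEach(visit);
  return ordered;
}
```

Calling collectModules(exampleRegistry, ["ext.gadget.VisibilityToggles"]) yields the full list that would have to be present in the ZIM for the toggle to work under these assumptions.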

cpina commented 3 years ago

I'm setting up Kiwix to be used onboard via kiwix-serve. Is there any easy-ish workaround that could be done on my side? I've read the comments and didn't spot any easy workaround to use while this is getting fixed.

Thank you!

Jaifroid commented 3 years ago

@cpina Please read the thread above, and you'll see various fixes proposed, like removing a piece of CSS ('client-js'). Also, you could check your particular ZIM on https://pwa.kiwix.org, which last time I checked was capable of displaying translations in articles that have them. It still works in an old 2020-01 English Wiktionary, but I haven't downloaded the latest ZIMs. Below is from the article foot. If you don't see this in pwa.kiwix.org, try switching to Desktop style in Config. If you still don't see the translations, check the page source HTML by opening DevTools and inspecting the page:

image

cpina commented 3 years ago

@Jaifroid: I might have landed in the wrong GitHub repository (after following a few linked tickets earlier today).

I'm using (well, usually behind an nginx but just to keep it simple): /usr/local/bin/kiwix-serve /mnt/data1/kiwix/wiktionary_en_all_maxi_2021-03.zim --port 8099

The first JavaScript error mentions: http://127.0.0.1:8099/wiktionary_en_all_maxi_2021-03/-/mw/jsConfigVars.js

The content of this file is "("

I guess that my wiktionary_en_all_maxi_2021-03.zim is incorrect (and probably generated by mwoffliner?) and that I should generate it again with a fix for this problem? Or can I fix it somehow without re-generating it? (this is what I was trying to do but I don't see how, or not with kiwix-serve. If I see the source HTML of the page I see the translations).

As a "crazy" idea: because I have kiwix behind an nginx, I could serve a "fixed" file for http://127.0.0.1:8099/wiktionary_en_all_maxi_2021-03/-/mw/jsConfigVars.js, if this was the problem (it wasn't mentioned in the thread; I just noticed that it looks strange in my installation).

I'm sorry for the questions and for not spending time trying to solve it: I'm new to the Kiwix ecosystem and trying to see how things fit together. I'm preparing a bit of a last-minute expedition and I thought that having Wikipedia offline onboard would be very useful... and a dictionary as well :-) Thanks very much!

Jaifroid commented 3 years ago

The first JavaScript error mentions: http://127.0.0.1:8099/wiktionary_en_all_maxi_2021-03/-/mw/jsConfigVars.js

The content of this file is "("

Actually that's not an issue with your ZIM. All ZIMs for the last few years have had invalid mw JS files. They cause errors in console, but don't affect the loading of the page.

The issue is a CSS one, and if you can catch the html being served and remove the client-js class from the html tag, that might help, or else there are other more subtle solutions suggested above.
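
As a rough sketch of "catching the HTML and removing the client-js class", assuming the class sits in a double-quoted class attribute on the opening html or body tag. stripClientJs is a hypothetical name, not a Kiwix or mwoffliner function, and a regex rewrite like this is only adequate for well-formed, predictable markup:

```javascript
// Hypothetical helper: remove the "client-js" class from the opening <html>
// or <body> tag of a page, leaving every other class and attribute untouched.
function stripClientJs(html) {
  return html.replace(
    /<(html|body)\b([^>]*?)\sclass="([^"]*)"/gi,
    (match, tag, before, classes) => {
      const kept = classes.split(/\s+/).filter((c) => c && c !== "client-js");
      return `<${tag}${before} class="${kept.join(" ")}"`;
    }
  );
}
```

Such a function would have to run somewhere between the ZIM and the browser, for example in a small proxy in front of kiwix-serve; where exactly is left open here.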

Do you need to serve your ZIM files via Kiwix Serve as opposed to using the various apps available for different platforms? Is the idea to "broadcast" a portal that any device can access, without each one having to have the ZIM files on-board?

What pwa.kiwix.org does is to serve its own versions of some CSS files (this is for speed), and because of that it is able to show translations. It also doesn't execute JavaScript in the ZIM, which helps. You could try intercepting some of the JS.

But if you've only got one device accessing the ZIM(s), you might be better off using one of the apps that can display the translations. They're available also for Linux, though there's currently a certificate problem with some Kiwix apps (in the process of being resolved) which could block installation on some systems.

cpina commented 3 years ago

Hi @Jaifroid - In the vessel we have an internal intranet with different resources and one of them is Kiwix with Wikipedia, Wiktionary and some TED science and technology.

All of this to say: users connect with their laptops using their browser from their cabins, labs, the office, etc. kiwix-serve is really a good fit for me. There is very limited and unstable internet connection.

So, what I've done for now is to change a .css file (and pardon me, CSS is not something I usually deal with) by doing this in the nginx reverse proxy:

       location /kiwix/wiktionary_en_all_maxi_2021-03/-/mw/skins.minerva.base.reset|skins.minerva.content.styles|ext.cite.style|site.styles|mobile.app.pagestyles.android|mediawiki.page.gallery.styles|mediawiki.skinning.content.parsoid.css {
               alias /etc/nginx/overwrites/parsoid.css;
       }

The change in the file is that I commented out the declaration here: .NavFrame .NavContent{/*display:none*/}

The result is that all the translations are displayed all the time. I can live with this.
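
For anyone who prefers to generate the patched stylesheet rather than edit it by hand, here is a small sketch of the same change as a string transformation. It assumes the stylesheet text has already been extracted from the ZIM by some other means; showNavContent is a hypothetical helper name:

```javascript
// Hypothetical helper: comment out "display:none" inside every rule whose
// selector ends in ".NavFrame .NavContent". Other rules are left unchanged.
function showNavContent(css) {
  return css.replace(
    /(\.NavFrame \.NavContent\s*\{)([^}]*)\}/g,
    (match, open, body) =>
      open + body.replace(/display\s*:\s*none\s*;?/g, "/*display:none*/") + "}"
  );
}
```

The returned text would then be saved as the file that the nginx alias above points to.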

Apparently it's possible to rewrite the HTML using nginx and the module ngx_http_sub_module (https://nginx.org/en/docs/http/ngx_http_sub_module.html). But I'm using nginx in a Docker container that does not have this module, and I would need to dig a bit deeper to find the HTML/CSS to change dynamically (I have limited time for this :-) )

If you knew of another way to serve the .zim files to be accessed via a browser (without internet connection) I can look at that. But I'm pretty happy so far with kiwix-serve.

Jaifroid commented 3 years ago

@cpina Kiwix Serve is perfect for your use case, and there isn't a better solution for that case. Glad you've managed to patch the problem with hidden translations. Much better for users to see some extra material than not to have it accessible at all.

Jaifroid commented 2 years ago

A user reports that these translations are still not viewable in the latest wiktionary_en_all_maxi_2022-01.zim .

ynikitenko commented 1 year ago

Many thanks for this fix! It was the major software gift to me before the New Year!

However, when I just bought a new Android tablet this morning, I downloaded Kiwix 3.4.5 and the most recent English wiktionary (from 29.09.2022, with pictures). Translations are still not shown, unfortunately! However, I can see them in a German Wiktionary with pictures from 2023-01-18.

Maybe the recent English Wiktionary data file should be updated for that?..

kelson42 commented 1 year ago

We will make a new release of MWoffliner this week and then new ZIM files will have to be created with it...

danielzgtg commented 1 year ago

By now, I barely use Kiwix Android. The old Kiwix Linux desktop is deprecated and doesn't have zstd, and the new one conflicts with many parts of my system setup. I don't use Chrome anymore, and thus not Kiwix JS Chrome either, since I switched over to Firefox.

Kiwix JS Firefox is perfect. It opens a service worker webpage. It even supports the Dark Reader extension which looks better than the built-in themes. Other extensions like Tampermonkey are supported as well and I could have used it to inject a style to show the translations. It does have a glitch with network/history but oh well I'll just refresh. But these days, I just use F12 in Firefox as a workaround to show the translations when I need them.

danielzgtg commented 1 year ago

Here is a Tampermonkey userscript (only kiwix-js that is in ServiceWorker mode is supported) to work around this for older zims:

// ==UserScript==
// @name         Kiwix unhide
// @namespace    http://tampermonkey.net/
// @version      0.1
// @description  try to take over the world!
// @author       You
// @match        https://moz-extension.kiwix.org/*
// @grant GM_addStyle
// ==/UserScript==

(function() {
    'use strict';
    GM_addStyle(`
.NavContent {
    display: block !important;
}
`);
})();

kelson42 commented 1 year ago

@danielzgtg I have no clue what your comment means, nor why you keep commenting on a closed ticket. If anything is still not OK with mwoffliner, please open a new ticket.

danielzgtg commented 1 year ago

That comment was just a workaround for people that are using outdated zims. Newer versions of mwoffliner are fine, I just thought that snippet would be useful while people wait.

Jaifroid commented 1 year ago

According to a user on Reddit, and I corroborate, there is a regression with this issue in wiktionary_en_all_maxi_2023-06.zim. Or the issue may be with Kiwix Desktop (version x64_2.3.1-2, latest available in release directory). The translations show fine in Kiwix JS PWA. See screenshots below of the article "morning" (with a small m) in both Kiwix Desktop (top) and the PWA (bottom). Translations can't be opened in Kiwix Desktop.

image

image

ewtoombs commented 11 months ago

It is still broken in wiktionary_en_all_nopic_2023-07, as served on library.kiwix.org and on my local kiwix-serve instance.

https://library.kiwix.org/viewer#wiktionary_en_all_nopic_2023-07/A/chair

ewtoombs commented 11 months ago

Here is a more targeted workaround:

// ==UserScript==
// @name         kiwix-wiktionary-toggle-navcontent
// @version      0.0
// @description  Fixes toggling of all NavContent instances in kiwix wiktionary.
// @include      http://localhost:1024/*
// ==/UserScript==

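for (const navHead of document.getElementsByClassName("NavHead")) {
  // Toggle the NavContent that shares a parent with the clicked NavHead.
  // currentTarget is the NavHead itself, even if a child element was clicked.
  navHead.onclick = (e) => {
    const style = e.currentTarget.parentElement.getElementsByClassName("NavContent")[0].style;
    style.display = style.display === "block" ? "none" : "block";
  };
}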

This re-enables the original buttons being used to toggle the translations. It assumes kiwix-serve is running on port 1024. It is only tested on the hidden translations.