kiwix / kiwix-android

Kiwix for Android
https://android.kiwix.org
GNU General Public License v3.0

Opening contents by swiping right on main page shows "No content headers found" #620

Closed arpank10 closed 5 years ago

arpank10 commented 6 years ago

Bug Report

Environment

The Bug:

Steps to reproduce:

  1. Open a zim file
  2. Swipe right to open content headers on main page of Zim file.
  3. It shows "No content headers found" followed by the content headers.

    • Expected behaviour: "No content headers found" should not be displayed when content headers are present.

kelson42 commented 6 years ago

The title of the page should be displayed... I have a recent version of Kiwix here which seems to work, so I suspect this bug was introduced recently or depends on the content currently loaded.

mhutti1 commented 6 years ago

@kelson42 Likely content dependent. Which ZIM file is this?

arpank10 commented 6 years ago

@mhutti1 Wikipedia Maths (122 MB) and Wiktionary (5.7 MB)

arpank10 commented 6 years ago

I will check a few other ZIM files.

1raghavmahajan commented 6 years ago

Can I get assigned to this?

arpank10 commented 6 years ago

Sure.

julianharty commented 6 years ago

This message seems to be overloaded i.e. used in several circumstances. In my view it's not always the best choice.

As a separate, orthogonal topic, we can choose to add headers to content that doesn't currently have any, so this message isn't displayed and valid content headings are displayed instead. (The current message is used by the app when the content lacks a title; the title is gleaned from the content when it has a valid HTML <H1> tag.)

I'll list several examples first then add the screenshots so you can see the effects. I'll add some additional observations on editing the content in another comment as this one will be massive once I've added the screenshots.

Ray Charles (one of our often used ZIMs for testing)

Bollywood ZIM (one I use quite a bit for testing, as it's fairly small but not as plain as the Ray Charles layouts, etc.)

And finally for these examples, our very own Kiwix Help page.

Ray Charles examples

raycharleszim_homepage_has_onesection

raycharleszim_has_sections_butnoh1

Bollywood examples

bollywoodzimhomepage_hassections_noh1

bollywoodzimhomepage_hassections_noh1_menu

Kiwix Help example

kiwix_help_lacks_headings

julianharty commented 6 years ago

As promised here are some additional comments and observations.

At least for Bollywood, I've found what seems to be the source page. https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Film/Offline_Bollywood I notice this was edited by one of the project team.

Perhaps other ZIM files also have their 'homepage' online where they can be edited.

For the Bollywood homepage, switching to edit mode reveals how the layout was constructed:

<div style="font-size:185%;">Welcome to Bollywood</div>
<div style="font-size:135%;">All Wikipedia articles about Bollywood and beyond, from Wikipedia, powered by Kiwix.</div>

rather than using the markup for an H1. It's easy to edit this page so that the first line becomes an H1 heading, by replacing the divs with = e.g. =Welcome to Bollywood=. The generated HTML converts the = wiki markup to an HTML <H1>, and the Kiwix Android app uses the H1 text as the title of the article and as the section title in the right drawer.

I've saved this change BTW so when you visit the page you'll need to review the edit history.

So my first observation is we may be able to fix the source article(s) to add what will become the title. I'll add others as additional comments here.

julianharty commented 6 years ago

Next suggestion... The current message only seems pertinent when there are no section headings; it is confusing in all other circumstances. So if there are any section headings (which are generated by Kiwix parsing the HTML of the article), I suggest we don't use the current message. Instead we could either show nothing (a blank or empty string) or, if we can obtain it, the title of the article (perhaps by parsing the link used in the ZIM file?).
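To make the suggestion concrete, here is a minimal Java sketch of that rule. The class and method names (DrawerTitleSketch, drawerTitle) are hypothetical and not taken from the Kiwix code base; only the message text comes from this issue. It assumes the caller already has the article title (possibly missing) and the list of parsed section headings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class DrawerTitleSketch {
    // Message text quoted in this issue; the constant name is hypothetical.
    static final String NO_HEADERS_MESSAGE = "No content headers found";

    /**
     * Picks the text shown at the top of the right drawer.
     * articleTitle may be null or blank when the page has no usable title.
     */
    static String drawerTitle(String articleTitle, List<String> sectionHeadings) {
        // Prefer a real title whenever one is available.
        if (articleTitle != null && !articleTitle.trim().isEmpty()) {
            return articleTitle.trim();
        }
        // No title and no headings: the message is accurate, so keep it.
        if (sectionHeadings == null || sectionHeadings.isEmpty()) {
            return NO_HEADERS_MESSAGE;
        }
        // Headings exist but no title: show nothing rather than a misleading message.
        return "";
    }

    public static void main(String[] args) {
        List<String> sections = Arrays.asList("Early life", "Career");
        System.out.println("[" + drawerTitle(null, sections) + "]");
        System.out.println("[" + drawerTitle("Ray Charles", sections) + "]");
        System.out.println("[" + drawerTitle(null, Collections.<String>emptyList()) + "]");
    }
}
```

With this ordering the message can only appear when the drawer would otherwise be completely empty.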

julianharty commented 6 years ago

Last of my current train of thoughts on this topic...

How about we try to format the help page so it has section headings and the equivalent of an H1 heading? I don't understand how the XML is converted / rendered as if it were HTML content (see the HierarchyView screenshot below); presumably we're somehow parsing the contents expecting them to be in HTML format?

screen shot 2018-04-11 at 20 11 45

https://github.com/kiwix/kiwix-android/blob/master/app/src/main/java/org/kiwix/kiwixmobile/KiwixWebViewClient.java (loads the help page)

And does the parsing happen here? https://github.com/kiwix/kiwix-android/blob/master/app/src/main/java/org/kiwix/kiwixmobile/utils/DocumentParser.java From running the code in the debugger, the DocumentParser code is executed, but I don't think the Help content is actually parsed at present; presumably it can't find any HTML content when the Help view is being loaded. If that's the case, perhaps we could create a custom parser to generate the sections for the Help page?

RohanBh commented 6 years ago

@julianharty the help page is added to the WebView using the addView() method. The addView() method belongs to the ViewGroup class and is simply inherited by WebView. So, the XML is not converted to HTML to load the help page, it is inserted as a child to the WebView. The following things happen when help is opened:

  1. help.html is loaded in the WebView.
  2. Later, in the WebViewClient's onPageFinished() method, if the help view is not currently a child of the WebView, it is added to the view.
  3. If another URL is loaded in the WebView which doesn't correspond to the Help page (help.html), then the help view is removed as a child of the WebView.

Therefore, when documentParser.js runs on the WebView (via JavaScript injection), it tries to find heading tags in the help.html file. It then notifies the DocumentParser.java class, which receives the information about all the headings found in the HTML in its parse() method. Since help.html contains no heading tags, the right drawer displays the "No content headers found" message. With that said, we can modify help.html to support content header detection. If we change

<title>Help</title>

to

<h1>Help</h1>

then at least the help page will show its own content header.
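To illustrate why a <title> element is not picked up while an <h1> would be, here is a rough Java sketch of heading detection. It is only an approximation under stated assumptions: the app does this with injected JavaScript in the WebView, whereas this sketch runs a regular expression over raw HTML, and the names HeadingScanSketch and findHeadings are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeadingScanSketch {
    // Matches <h1>...</h1> through <h6>...</h6>, case-insensitively, across lines.
    // <head>, <html> and <title> do not match because no digit follows the "h".
    private static final Pattern HEADING = Pattern.compile(
            "<h([1-6])\\b[^>]*>(.*?)</h\\1\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns entries such as "h1: Help" in document order. */
    static List<String> findHeadings(String html) {
        List<String> headings = new ArrayList<>();
        Matcher m = HEADING.matcher(html);
        while (m.find()) {
            // Drop nested tags (e.g. <span>) and surrounding whitespace.
            String text = m.group(2).replaceAll("<[^>]+>", "").trim();
            headings.add("h" + m.group(1) + ": " + text);
        }
        return headings;
    }

    public static void main(String[] args) {
        String withTitleOnly =
                "<html><head><title>Help</title></head><body></body></html>";
        String withH1 =
                "<html><head><title>Help</title></head><body><h1>Help</h1></body></html>";
        System.out.println(findHeadings(withTitleOnly)); // prints []
        System.out.println(findHeadings(withH1));        // prints [h1: Help]
    }
}
```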

julianharty commented 6 years ago

@RohanBh I'd reached a similar conclusion of what happens in the code. I did find it odd that the Android view is added as a child of the WebView but on balance - why not, they are both Views after all...

I do like your simple, clean suggestion to add <H1> tags to the current minuscule help.html source. However, how about adding a H1 as well as the current title (which is in the head element of the HTML rather than the body)? As you say, at least the right drawer will then say 'Help' rather than the current unhelpful message.

Here's a mock up of adding an H1 rather than changing the current title to be the H1:

./app/src/main/assets/help.html

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8"/>
    <title>Help</title>
</head>
<body>
<h1>Help</h1>
</body>
</html>

One consideration, which I expect you understand well, is localisation of the help contents. The English term 'Help' is probably quite well recognised globally, but I expect not by all our users. Perhaps we have to be a little cleverer and find a solution that localises the displayed text without affecting articles that happen to have Help as their H1 heading, e.g. https://en.wikipedia.org/wiki/Help and the translated equivalents that refer to the Beatles, etc.

RohanBh commented 6 years ago

Yes, I agree. What I suggested wasn't even going to work (I'm not a regular web coder). I hadn't considered the problems that come with a large user base. It is important that users see everything in the app in their own language. We need not worry about other articles here: many articles exist in several languages and Kiwix lets you download them in your language, so this problem won't occur for popular articles. I am not sure, though, how to provide a similar solution for the help page. With the current solution the help page itself is shown in the local language (because each language-specific string resource is used), but the content header will always show "Help". We could probably handle the specific case of the help page by adding an XML string resource holding the translated word for help, then loading the language-specific string at runtime in the parse() method like this:

// Null-safe comparison: title may be missing when no heading was found.
if ("Help".equals(title)) {
    title = getResources().getString(R.string.help_text);
}
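
A self-contained sketch of the same idea that can be run outside Android. Everything here is an assumption made for illustration: the class HelpTitleSketch, the HELP_URL value, and the small translation map that stands in for the string resources. Matching on the page URL rather than on the title text is one possible way to avoid relabelling ordinary articles whose H1 happens to be "Help".

```java
import java.util.HashMap;
import java.util.Map;

class HelpTitleSketch {
    // Assumed location of the bundled help page; not confirmed by this thread.
    static final String HELP_URL = "file:///android_asset/help.html";

    // Stand-in for Android string resources (R.string.help_text per locale).
    static final Map<String, String> HELP_TEXT = new HashMap<>();
    static {
        HELP_TEXT.put("en", "Help");
        HELP_TEXT.put("fr", "Aide");
        HELP_TEXT.put("de", "Hilfe");
    }

    /**
     * Returns the drawer title for a page. Only the bundled help page is
     * localised; every other page keeps the title parsed from its HTML.
     */
    static String resolveTitle(String url, String parsedTitle, String languageCode) {
        if (HELP_URL.equals(url)) {
            // Fall back to English when no translation is registered.
            return HELP_TEXT.getOrDefault(languageCode, "Help");
        }
        return parsedTitle;
    }

    public static void main(String[] args) {
        System.out.println(resolveTitle(HELP_URL, "Help", "fr"));
        System.out.println(resolveTitle("A/Help!_(album).html", "Help", "fr"));
    }
}
```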

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

macgills commented 5 years ago

I think there is nothing to do on this ticket. I cannot reproduce the original issue; it seems the ZIM files have been updated so that they have sections/h1 headings. @kelson42 I recommend closing.