kiwix / kiwix-tools

Command line Kiwix tools: kiwix-serve, kiwix-manage, ...
https://download.kiwix.org/release/kiwix-tools/
GNU General Public License v3.0

Search regressions in nightly builds of kiwix-serve: (1) punctuation floods the UI (2) Title Search goes to the wrong URL [mitigated by clearing browser cache] #505

Closed holta closed 2 years ago

holta commented 2 years ago
  1. Punctuation floods the search UI, as seen in the screenshot below.

    This is a regression compared to prior versions of kiwix-serve (an example is http://download.kiwix.org/release/kiwix-tools/kiwix-tools_linux-armhf-3.1.2-5.tar.gz which does not have this problem).

  2. Title Search links fail.

    Example: when you click on <b>apple</b> at the top of the search dropdown (screenshot below), it does NOT go to http://192.168.0.182/kiwix/wiktionary_en_simple_all_maxi_2021-12/A/apple

    Instead, it erroneously goes to: http://192.168.0.182/kiwix/search?content=wiktionary_en_simple_all_maxi_2021-12&pattern=apple

    Whereas: Full-Text Search link DOES work, leading to http://192.168.0.182/kiwix/search?content=wiktionary_en_simple_all_maxi_2021-12&pattern=apple+ (which correctly shows 67 search result articles, that happen to mention the word "apple").
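For reference, the three URLs above can be reproduced with a small sketch. The helper names `article_url` and `search_url` are hypothetical and are not part of kiwix-serve. Reading the trailing `+` in the full-text link as an encoded trailing space is an assumption, not something confirmed in this report.

```python
from urllib.parse import quote, urlencode

ROOT = "http://192.168.0.182/kiwix"
BOOK = "wiktionary_en_simple_all_maxi_2021-12"


def article_url(root: str, book: str, title: str, namespace: str = "A") -> str:
    """Hypothetical helper: direct link to an article inside a book."""
    return f"{root}/{book}/{namespace}/{quote(title)}"


def search_url(root: str, book: str, pattern: str) -> str:
    """Hypothetical helper: link to the search results page for a pattern."""
    return f"{root}/search?" + urlencode({"content": book, "pattern": pattern})


expected = article_url(ROOT, BOOK, "apple")   # where the title link should go
observed = search_url(ROOT, BOOK, "apple")    # where it actually goes
full_text = search_url(ROOT, BOOK, "apple ")  # assumption: trailing space becomes "+"

print(expected)
print(observed)
print(full_text)
```

The point of the comparison is that the title suggestion ends up on the search endpoint instead of on the article path.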

image

CONTEXT:

kelson42 commented 2 years ago

@holta Thank you very much for the bug report. We should clarify if this is a problem with the ZIM or with the software.

holta commented 2 years ago

FYI I just clarified and revised the above explanation, providing a bit more context.

kelson42 commented 2 years ago

@holta I cannot reproduce the problem with the very same ZIM file, page, and nightly version (but built for x86-64).

image

Tested with both Chrome & FF.

Please secure:

holta commented 2 years ago

Clearing the browser cache worked to solve both problems!

(To keep regular users from getting very confused.)

holta commented 2 years ago

CONTEXT: asking regular users to clear their browser cache every time they search Kiwix — will certainly not work in low-literacy societies especially :smile:

(So if there's anything Kiwix and/or IIAB should do to solve this at the root, together we definitely should!)

holta commented 2 years ago

I forgot to mention a 3rd UX bug, that is also quite common, and likewise very frustrating:

Oftentimes kiwix-serve's search dropdown never appears when you type into the textfield.

No matter how long you wait after typing in a keyword (e.g. "apple").

(In any case, this frustrating end-user dilemma also goes away when the browser cache is cleared. I'll keep an eye on it, to try to understand any underlying patterns.)

kelson42 commented 2 years ago

@holta One topic per ticket. Please open a new ticket if you think there is another bug.

kelson42 commented 2 years ago

CONTEXT: asking regular users to clear their browser cache every time they search Kiwix — will certainly not work in low-literacy societies especially smile

Agreed, there has always been a weakness in kiwix-serve's caching strategy. We will fix it; see https://github.com/kiwix/libkiwix/issues/650.

tim-moody commented 2 years ago

Even if the issue is browser cache, I still wonder where the <b>apple</b> got converted to &lt;b&gt;apple

Doesn't seem like something the cache would do. But browsers keep trying to be more 'secure', so perhaps we need to know which browser versions are used by @kelson42 and @holta

holta commented 2 years ago

Doesn't seem like something the cache would do.

Confusing indeed. (Certainly clearing the cache made the issue go away, in my testing anyway.)

PS All above tests were done using Chrome 96 as the browser.

veloman-yunkan commented 2 years ago

Even if the issue is browser cache, I still wonder where the <b>apple</b> got converted to &lt;b&gt;apple

@tim-moody Obviously, the way that HTML content is handled by the front-end and back-end of the kiwix-serve suggestion machinery was changed. In this case the new front-end used cached results that have been served by the old backend or something like that.
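A minimal sketch of how such a mismatch could produce visible tags, assuming one side emits a label that already contains markup and the other side treats every label as plain text and escapes it. Both function names are hypothetical, and the thread does not settle which side is the old one.

```python
from html import escape


def markup_producer(title: str, query: str) -> str:
    """Hypothetical side A: returns the label as HTML, match wrapped in <b>."""
    return title.replace(query, f"<b>{query}</b>")


def escaping_renderer(label: str) -> str:
    """Hypothetical side B: assumes the label is plain text and escapes it."""
    return escape(label)


label = markup_producer("apple", "apple")
shown = escaping_renderer(label)

print(label)  # markup intended to be rendered as bold text
print(shown)  # escaped markup, which the browser displays as literal tags
```

If both sides agree on one convention (either HTML everywhere or plain text everywhere), the label is escaped exactly once and no tags leak into the dropdown.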

tim-moody commented 2 years ago

On a NUC with a recent but not the latest kiwix-serve, I search for Clouston's hidrotic ectodermal dysplasia on port 3000 (no proxy) and I get the same thing in the dropdown, but if I then press the down arrow, the text in the search box changes to Clouston's hidrotic ectodermal dysplasia. With the up arrow it reverts to the original. I have never done this search before, and I get the same result on FF, Edge, and Chrome.

Kiwix is dated June 9, 2021.

I speculate that the search box is being populated with the formatted title from the search dropdown with the formatting changed to url safe.

mgautierfr commented 2 years ago

As @veloman-yunkan suggests, it is a compatibility problem between the frontend and the backend.

Clearing the browser cache worked to solve both problems! Does anybody have an idea how difficult it would be to bypass the browser cache automatically? So that kiwix-serve / kiwix-search stuff is never cached? And where would be the best place to do that?

What we are caching is not the search/suggestion/random results but the front-end JS. If we change the way we return results to the front end (and the way the front end handles them), users need to get the new front-end version. The best way to do that is probably to version the front-end JS/HTML/CSS and add this version to the URLs. The browser would then see them as different resources and would not use an old version of the front end.
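A minimal sketch of that versioning idea, assuming the version is derived from the file content and appended as a query parameter. The parameter name `v`, the path `/static/app.js`, and both helper names are hypothetical; this is not how libkiwix implements it.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def content_version(content: bytes) -> str:
    """Short fingerprint that changes whenever the resource content changes."""
    return hashlib.sha1(content).hexdigest()[:8]


def versioned(url: str, version: str, param: str = "v") -> str:
    """Return url with param=version, replacing any earlier value of param."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, version))
    return urlunsplit(parts._replace(query=urlencode(query)))


old_url = versioned("/static/app.js", content_version(b"old front end"))
new_url = versioned("/static/app.js", content_version(b"new front end"))

print(old_url)
print(new_url)
```

Because the two URLs differ, a browser that cached the old script has no cache entry for the new one and must fetch it, so no manual cache clearing is needed after an upgrade.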

holta commented 2 years ago

1) Thank you @veloman-yunkan and @mgautierfr for confirming. That helps a lot!

2) In addition, there are also some "excess punctuation + HTML tags" bugs, even when browser cache is cleared.

@tim-moody tried to give an example above (yesterday) but his posting was not clear. What he meant to say was:

on a nuc with recent but not latest kiwix-serve I search for Clouston's hidrotic ectodermal dysplasia on port 3000 (no proxy) and I get the same thing in the drop down, but if I then press down arrow the text in the search box changes to Clouston&apos;s hidrotic ectodermal dysplasia. With up arrow it reverts to the original. I have never done this search before and get the same result on FF, Edge, and Chrome

Kiwix is dated June 9, 2021

(That's kiwix-tools 3.1.2-5 from http://download.kiwix.org/release/kiwix-tools/kiwix-tools_linux-x86_64-3.1.2-5.tar.gz)

I speculate that the search box is being populated with the formatted title from the search dropdown with the formatting changed to url safe.
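`&apos;` is an HTML entity rather than URL encoding, so one plausible reading is that the HTML-escaped dropdown label is copied into the text field without being decoded. A minimal sketch of the conversion that would be needed, assuming the label is a small HTML fragment; `label_to_plain_text` is a hypothetical helper, and the regex tag stripping is deliberately crude.

```python
import re
from html import unescape

TAG = re.compile(r"<[^>]+>")


def label_to_plain_text(label_html: str) -> str:
    """Hypothetical helper: drop tags first, then decode HTML entities."""
    return unescape(TAG.sub("", label_html))


raw = "Clouston&apos;s hidrotic ectodermal dysplasia"
print(label_to_plain_text(raw))
print(label_to_plain_text("<b>Clouston&apos;s</b> hidrotic ectodermal dysplasia"))
```

Tags are stripped before entities are decoded so that an escaped sequence such as `&lt;b&gt;` survives as literal text instead of being mistaken for markup.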

kelson42 commented 2 years ago

2. In addition, there are also some "excess punctuation + HTML tags" bugs, even when browser cache is cleared.

Please open one new ticket per problem. Without a bug report, they might never be fixed.

holta commented 2 years ago

one new ticket per problem

Indeed this is the original problem described in the subject line of this ticket (excess punctuation, that is accidentally polluting Kiwix search UX).

tim-moody commented 2 years ago

OK. I didn't know what you meant by 'front end'. This I understand:

What we are caching is not the search/suggestion/random results but the front-end JS. If we change the way we return results to the front end (and the way the front end handles them), users need to get the new front-end version. The best way to do that is probably to version the front-end JS/HTML/CSS and add this version to the URLs. The browser would then see them as different resources and would not use an old version of the front end.

Still odd that I experienced it in private mode on FF and even after I cleared the cache.