kiwix / kiwix-tools

Command line Kiwix tools: kiwix-serve, kiwix-manage, ...
https://download.kiwix.org/release/kiwix-tools/
GNU General Public License v3.0

[Regression] Search and Auto Suggestion Broken for Chinese in 3.6.0 Release #657

Closed lovenemesis closed 8 months ago

lovenemesis commented 8 months ago

Thanks for the new 3.6.0 release at year-end! However, after deploying it on my Raspberry Pi, I quickly noticed that the "Search" and "Auto Suggestion" features stopped working for Chinese, and possibly for other CJK languages too.

$ sudo kiwix-serve --version
kiwix-tools 3.6.0

libkiwix 13.0.0
+ libzim 9.0.0
+ libxapian 1.4.23
+ libcurl 7.67.0
+ libmicrohttpd 0.9.76
+ libz 1.2.12
+ libicu 73.2.0
+ libpugixml 0.12.0

libzim 9.0.0
+ libzstd 1.5.2
+ liblzma 5.2.6
+ libxapian 1.4.23
+ libicu 73.2.0

Below is what I see when trying to search for any article containing the character "瑞" in "wikivoyage_zh_all":

Screenshot 2024-01-12 17-02-57

A couple of entries popped up. But when I selected the last one, which should give me all pages containing the character "瑞", I got the page below saying nothing was found:

Screenshot 2024-01-12 at 17-04-13 Search 瑞

Also, when searching for anything with two or more characters, like "瑞典", even the auto-suggestion stopped working:

Screenshot 2024-01-12 17-18-57

Screenshot 2024-01-12 at 17-20-12 Search 瑞典

This issue reproduces consistently on Firefox and GNOME Web on Linux, as well as Firefox on Android. With "wikivoyage_en_all" or other English-language ZIM files, no such issue was observed: search and suggestion work as expected.
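
For anyone who wants to rule out the browser entirely, the same requests can be built by hand. The following is only a minimal sketch: the server address `http://localhost:8080`, the helper names, and the `/suggest` and `/search` paths with their `content`, `term` and `pattern` parameters are assumptions about a typical kiwix-serve setup and should be checked against the running instance.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumptions: address and book name are placeholders for the local setup.
BASE_URL = "http://localhost:8080"
BOOK = "wikivoyage_zh_all"


def suggest_url(term, book=BOOK, base=BASE_URL):
    """Build an auto-suggestion URL (path and parameter names are assumed)."""
    return f"{base}/suggest?" + urlencode({"content": book, "term": term})


def search_url(pattern, book=BOOK, base=BASE_URL):
    """Build a full-text search URL (path and parameter names are assumed)."""
    return f"{base}/search?" + urlencode({"content": book, "pattern": pattern})


def fetch(url, timeout=10):
    """Return the HTTP status and the decoded response body for a URL."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # One-character and two-character queries from the report above.
    for query in ("瑞", "瑞典"):
        for url in (suggest_url(query), search_url(query)):
            print(url)
            # Uncomment when a kiwix-serve instance is actually running:
            # print(fetch(url))
```

`urlencode` percent-encodes the query as UTF-8, so the CJK characters are sent the same way a browser would send them. Running the same URLs against 3.5.0 and 3.6.0 should show whether the difference is on the server side.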

Reverting kiwix-tools to the 3.5.0 release immediately restores normal behavior.

Screenshot 2024-01-12 17-27-36

$ sudo kiwix-serve --version
kiwix-tools 3.5.0

libkiwix 12.0.0
+ libzim 8.2.0
+ libxapian 1.4.22
+ libcurl 7.67.0
+ libmicrohttpd 0.9.76
+ libz 1.2.12
+ libicu 58.2.0
+ libpugixml 0.12.0

libzim 8.2.0
+ libzstd 1.5.2
+ liblzma 5.2.6
+ libxapian 1.4.22
+ libicu 58.2.0

Jaifroid commented 8 months ago

Probably the same underlying issue as that explored in https://github.com/kiwix/kiwix-android/issues/3587. However, there may be Kiwix-Serve-specific aspects of this to consider.

kelson42 commented 8 months ago

It's a (known) problem with the ZIM file; we should use the latest libzim to create MediaWiki ZIM files. It should work perfectly with Gutenberg, for example.

wdscxsj commented 8 months ago

@kelson42 Is it possible to easily tell if a ZIM file is properly updated? The current all-maxi version of Wikipedia Chinese (2024-01) is still buggy. Will there be a "new and improved" message or something on the ZIM download page?