flarum / framework

Simple forum software for building great communities.
http://flarum.org/

Can't search anything in chinese or japanese #2003

Closed pluveto closed 4 years ago

pluveto commented 4 years ago

Bug Report

Current Behavior Searching for Chinese or Japanese text returns no results.

Steps to Reproduce My website: https://www.acgmuse.com/
1. On the home page there is a post titled "四季折々に揺蕩いて/After the Rainまふまふ 最".
2. Click the search box and type "揺蕩". Nothing is found: https://www.acgmuse.com/?q=%E6%8F%BA%E8%95%A9
3. There is also a post titled "亡灵序曲": https://www.acgmuse.com/d/1214
4. Searching for "亡灵序曲" does not return this post: https://www.acgmuse.com/?q=%E4%BA%A1%E7%81%B5%E5%BA%8F%E6%9B%B2

Expected Behavior Given a post titled "亡灵序曲" with the content "这是一些帖子内容", searching for any of "亡灵", "亡灵序曲", "序曲", "这是", or "内容" should return that post.
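
The expectation above can be written as a small executable check. This is only a sketch of the desired behavior (plain substring matching on title and content), not how Flarum's search is implemented; `should_match` is a hypothetical helper introduced for illustration.

```python
# Sample post and keywords taken from the expected-behavior description.
post = {"title": "亡灵序曲", "content": "这是一些帖子内容"}
keywords = ["亡灵", "亡灵序曲", "序曲", "这是", "内容"]


def should_match(keyword: str, post: dict) -> bool:
    """Hypothetical rule: a keyword matches if it occurs anywhere in the title or content."""
    return keyword in post["title"] or keyword in post["content"]


for kw in keywords:
    # Every keyword is a substring of the title or the content, so each should match.
    print(kw, should_match(kw, post))
```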

Screenshots No

Environment

Flarum core 0.1.0-beta.11.1
PHP version: 7.3.13
Loaded extensions: Core, date, libxml, openssl, pcre, zlib, filter, hash, pcntl, readline, Reflection, SPL, session, standard, bz2, calendar, ctype, curl, dom, mbstring, fileinfo, ftp, gd, gettext, iconv, json, exif, mysqlnd, PDO, Phar, SimpleXML, sockets, sqlite3, tokenizer, xml, xmlwriter, xsl, mysqli, pdo_mysql, pdo_sqlite, wddx, xmlreader, zip, Zend OPcache
+----------------------------------+------------------+------------------------------------------+
| Flarum Extensions                |                  |                                          |
+----------------------------------+------------------+------------------------------------------+
| ID                               | Version          | Commit                                   |
+----------------------------------+------------------+------------------------------------------+
| flarum-approval                  | v0.1.0-beta.8    |                                          |
| flarum-emoji                     | v0.1.0-beta.10   |                                          |
| flarum-flags                     | v0.1.0-beta.10   |                                          |
| flarum-likes                     | v0.1.0-beta.9    |                                          |
| flarum-lock                      | v0.1.0-beta.9    |                                          |
| flarum-markdown                  | v0.1.0-beta.10   |                                          |
| flarum-mentions                  | v0.1.0-beta.10   |                                          |
| flarum-statistics                | v0.1.0-beta.9    |                                          |
| flarum-sticky                    | v0.1.0-beta.9    |                                          |
| flarum-subscriptions             | v0.1.0-beta.9    |                                          |
| flarum-suspend                   | v0.1.0-beta.10   |                                          |
| flarum-tags                      | v0.1.0-beta.11   |                                          |
| csineneo-lang-simplified-chinese | v0.1.0-beta.10.9 |                                          |
| reflar-doorman                   | 0.1.4            |                                          |
| pluveto-bbcode-bilibili          | dev-master       | 5a8f362216362b3a4d211ed35ec3bc346fda341f |
| pluveto-bbcode-netease           | dev-master       | 911081dfaa6000382261a12c92e739fbe8de384f |
| flagrow-upload                   | 0.7.1            |                                          |
| fof-user-directory               | 0.3.3            |                                          |
| flarum-auth-github               | v0.1.0-beta.9    |                                          |
+----------------------------------+------------------+------------------------------------------+
Base URL: https://www.acgmuse.com
Installation path: /var/www/acgmuse.com
Debug mode: ON
Don't forget to turn off debug mode! It should never be turned on in a production system.

Possible Solution I have no idea

Additional Context No

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. We do this to keep the amount of open issues to a manageable minimum. In any case, thanks for taking an interest in this software and contributing by opening the issue in the first place!

askvortsov1 commented 4 years ago

Right now, Flarum uses a simple custom search engine. With the changes proposed in flarum/issue-archive#286, it should be possible to hook up another provider such as Elasticsearch, which should solve this.

notohiro commented 4 years ago

I figured out a way to search for Japanese words. To use an ngram index, I switched from MariaDB to MySQL and ran these commands:

DROP INDEX content ON posts;
CREATE FULLTEXT INDEX content ON posts (content) WITH PARSER ngram;
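
For context on why the ngram parser helps: MySQL's default full-text parser splits text on whitespace and punctuation, so an unbroken run of CJK characters is indexed as one long token and a shorter query never matches it. The ngram parser instead indexes overlapping runs of n characters (2 by default). The Python sketch below only imitates that idea. `whitespace_tokens`, `ngram_tokens`, and `matches` are hypothetical helpers, not Flarum or MySQL code, and real MySQL matching also involves stopwords, search modes, and relevance ranking.

```python
from __future__ import annotations


def whitespace_tokens(text: str) -> set[str]:
    """Rough stand-in for a whitespace-delimited parser: one token per run."""
    return set(text.split())


def ngram_tokens(text: str, n: int = 2) -> set[str]:
    """All overlapping n-character windows inside each whitespace-separated run."""
    tokens = set()
    for run in text.split():
        for i in range(len(run) - n + 1):
            tokens.add(run[i:i + n])
    return tokens


def matches(query: str, text: str, tokenizer) -> bool:
    """Simplified rule: every token of the query must also be a token of the text."""
    query_tokens = tokenizer(query)
    return bool(query_tokens) and query_tokens <= tokenizer(text)


title = "亡灵序曲"
# Whitespace tokenization sees the whole title as a single token, so "亡灵" misses.
print(matches("亡灵", title, whitespace_tokens))
# Bigram tokenization indexes "亡灵", "灵序", "序曲", so "亡灵" is found.
print(matches("亡灵", title, ngram_tokens))
```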

Before

Screen Shot 2020-05-26 at 23 57 29

After

Screen Shot 2020-05-27 at 0 16 17

Searching in the browser shows the same results.

pluveto commented 4 years ago

This requires MySQL 5.7+, and many keywords in posts still cannot be found.
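
One possible reason some keywords still go unmatched is token size. This is an assumption; the thread does not confirm the cause. MySQL's `ngram_token_size` defaults to 2, so a query shorter than two characters yields no tokens to look up. The sketch below illustrates only that length effect, using a hypothetical `ngrams` helper.

```python
def ngrams(text: str, n: int) -> list[str]:
    """Overlapping n-character windows of text; empty when text is shorter than n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


for query in ["曲", "序曲", "亡灵序曲"]:
    # With n = 2, the single-character query produces no tokens at all.
    print(query, ngrams(query, 2))
```

If single-character searches matter, the MySQL documentation suggests setting `ngram_token_size` to 1. That variable is set at server startup, and existing full-text indexes need to be rebuilt afterwards.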

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. We do this to keep the amount of open issues to a manageable minimum. In any case, thanks for taking an interest in this software and contributing by opening the issue in the first place!

tankerkiller125 commented 4 years ago

I'm going to go ahead and close this, as it's referenced in flarum/issue-archive#203, which is now our primary UTF-8 issue. I don't see the need to have multiple UTF-8 issues open at the same time.