AlistGo / alist

🗂️ A file list/WebDAV program that supports multiple storages, powered by Gin and Solidjs.
https://alist.nn.ci
GNU Affero General Public License v3.0

use ngram parser for full-text index #4998

Open xhofe opened 1 year ago

xhofe commented 1 year ago

Description of the feature

Use the ngram parser for the full-text index on MySQL 5.7 and newer.

Suggested solution

CREATE FULLTEXT INDEX idx_search_nodes_name ON search_nodes(name) WITH PARSER ngram;

Additional context

foxxorcat commented 1 year ago

Looks good. I wonder whether these two problems would still occur:

2975 MATCH (name) AGAINST (? IN BOOLEAN MODE)

2844 MATCH (name) AGAINST (? IN NATURAL LANGUAGE MODE)

foxxorcat commented 1 year ago

Wait, that's not right. It looks like the index is created with ngram by default, unless mecab is specified.

xhofe commented 1 year ago

> Wait, that's not right. It looks like the index is created with ngram by default, unless mecab is specified.

I don't think so. This is the description in the official docs:

The built-in MySQL full-text parser uses the white space between words as a delimiter to determine where words begin and end, which is a limitation when working with ideographic languages that do not use word delimiters. To address this limitation, MySQL provides an ngram full-text parser that supports Chinese, Japanese, and Korean (CJK). The ngram full-text parser is supported for use with InnoDB and MyISAM.

So the default parser splits on whitespace.
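To make the difference concrete, here is a minimal Python sketch of the two tokenization strategies. It is a toy model rather than MySQL's actual parser: the function names and the sample file name are made up for illustration, and `n=2` is assumed because that is the default `ngram_token_size`.

```python
def whitespace_tokens(text):
    # Toy stand-in for the built-in parser: one token per whitespace-separated word.
    return text.split()


def ngram_tokens(text, n=2):
    # Toy stand-in for the ngram parser: every run of n consecutive characters
    # inside each whitespace-separated word. Words shorter than n yield nothing.
    tokens = []
    for word in text.split():
        tokens.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return tokens


name = "流浪地球2 4K"  # hypothetical file name
print(whitespace_tokens(name))  # ['流浪地球2', '4K']
print(ngram_tokens(name))       # ['流浪', '浪地', '地球', '球2', '4K']
```

With whitespace splitting, the whole run of CJK characters becomes a single token, so only a query for that exact word can hit it.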

xhofe commented 1 year ago

> Looks good. I wonder whether these two problems would still occur:
>
> 2975 MATCH (name) AGAINST (? IN BOOLEAN MODE)
>
> 2844 MATCH (name) AGAINST (? IN NATURAL LANGUAGE MODE)

It should be more accurate than the default (given that the default splits on whitespace).
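A rough way to see why: the Python sketch below models a search as "every token of the query must appear among the tokens of the name". This is a simplification for illustration only, not how MySQL evaluates `MATCH ... AGAINST`; the helper names and file names are hypothetical, and `n=2` is assumed as the ngram token size.

```python
def tokenize(text, n=None):
    # n=None: whitespace-delimited words; otherwise character n-grams per word.
    words = text.split()
    if n is None:
        return set(words)
    return {w[i:i + n] for w in words for i in range(len(w) - n + 1)}


def search(names, query, n=None):
    # Return the names whose token set contains every token of the query.
    wanted = tokenize(query, n)
    return [name for name in names if wanted and wanted <= tokenize(name, n)]


names = ["流浪地球2 4K", "地球脉动 S01", "holiday photos"]  # hypothetical file names
print(search(names, "地球"))       # []
print(search(names, "地球", n=2))  # ['流浪地球2 4K', '地球脉动 S01']
```

In this model the whitespace tokenizer finds nothing for a two-character CJK query embedded in a longer name, while the bigram tokenizer finds both names that contain it.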