Open suntong opened 3 years ago
Let's build a Full-Text Search engine
https://artem.krylysov.com/blog/2020/07/28/lets-build-a-full-text-search-engine/
This is the kind of tool that I'm talking about. However, because it is English-based, the inverted index it builds cannot handle Chinese.
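To make the limitation concrete, here is a minimal sketch in Go (standard library only) of an inverted index whose tokenizer also copes with Chinese. It is not the code from the linked article and not how riot or bleve do it. As a stand-in for real word segmentation, it assumes overlapping character bigrams for runs of Han characters, while other scripts are split into lowercase words. The names `tokenize`, `Index`, `Add`, and `Search` are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// tokenize splits text into search terms. Runs of Han characters become
// overlapping bigrams (a lone Han character becomes a one-character term);
// runs of other letters and digits become lowercase words.
func tokenize(text string) []string {
	var tokens []string
	var word, han []rune // pending non-Han word and pending Han run

	flushWord := func() {
		if len(word) > 0 {
			tokens = append(tokens, strings.ToLower(string(word)))
			word = word[:0]
		}
	}
	flushHan := func() {
		if len(han) == 1 {
			tokens = append(tokens, string(han))
		}
		for i := 0; i+1 < len(han); i++ {
			tokens = append(tokens, string(han[i:i+2]))
		}
		han = han[:0]
	}

	for _, r := range text {
		switch {
		case unicode.Is(unicode.Han, r): // checked first: Han runes are letters too
			flushWord()
			han = append(han, r)
		case unicode.IsLetter(r) || unicode.IsDigit(r):
			flushHan()
			word = append(word, r)
		default: // whitespace and punctuation end any pending run
			flushWord()
			flushHan()
		}
	}
	flushWord()
	flushHan()
	return tokens
}

// Index is a toy in-memory inverted index: term -> set of document IDs.
type Index map[string]map[int]bool

// Add records every term of text under docID.
func (idx Index) Add(docID int, text string) {
	for _, tok := range tokenize(text) {
		if idx[tok] == nil {
			idx[tok] = make(map[int]bool)
		}
		idx[tok][docID] = true
	}
}

// Search returns the sorted IDs of documents that contain every query term.
func (idx Index) Search(query string) []int {
	terms := tokenize(query)
	if len(terms) == 0 {
		return nil
	}
	var hits []int
	for id := range idx[terms[0]] {
		ok := true
		for _, t := range terms[1:] {
			ok = ok && idx[t][id]
		}
		if ok {
			hits = append(hits, id)
		}
	}
	sort.Ints(hits)
	return hits
}

func main() {
	idx := Index{}
	docs := []string{"全文搜索引擎", "Let's build a full-text search engine", "中文分词 is hard"}
	for id, text := range docs {
		idx.Add(id, text)
	}
	fmt.Println(tokenize("全文搜索"))
	fmt.Println(idx.Search("搜索"), idx.Search("full text"), idx.Search("分词"))
}
```

Bigrams need no dictionary, but they make the index larger and can match across word boundaries; a dictionary-based segmenter would probably give cleaner terms.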
Maybe bleve is better for the use case
Indeed, this is what I'm currently working on
https://github.com/suntong/doc-search
and I'm almost finished (Chinese search is still missing).
Description
All the examples I have seen are string-based searches. However, can `riot` somehow be used as a file-based search tool?
Basically, it would work just like `grep`, but use a persistent index to speed up searches while also supporting Chinese word segmentation (中文分词).
The application scenario is that I have a huge collection of Chinese-language files, thousands of them, so I need a way to search through them quickly with the help of pre-built indexes. The content of existing files will not change (or only very rarely), but more files are added every day. I haven't found any tool that does a good job of searching Chinese content yet.
Is it possible? If so, sample code would be appreciated.
Thanks