Closed sgon00 closed 5 years ago
pg_jieba is a extension of tsvector type for chinese. I believe reference the doc for tsvector is fine. here is the official doc: https://www.postgresql.org/docs/11/textsearch-indexes.html
@jaiminpan thank you very much for the reference and this project.
By reading the official doc, the following is how I use pg_jieba so far:
CREATE TABLE "novelcn" (
"id" bigserial,
"title" text,
"body" text,
fts_index_col tsvector,
PRIMARY KEY ("id")
);
CREATE INDEX idx_novelcn_fts ON novelcn USING GIN (fts_index_col);
CREATE TRIGGER novelcn_tsvectorupdate
BEFORE UPDATE OR INSERT ON novelcn
FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger(fts_index_col, 'public.jiebaqry', title, body);
do the search:
SELECT id, title
FROM novelcn
WHERE fts_index_col @@ to_tsquery('jiebaqry', '青云门 & 传说')
ORDER BY id DESC
LIMIT 10;
It works fine, even if I am not sure if this is the best way to do or not. I hope the above sql codes can help other new comers to use pg_jieba when they find this issue.
Btw, many online English and Chinese articles say RUM
is much faster than GIN
, BUT in my testing, RUM
is a little bit slower than GIN
. I don't know why. I will simply use GIN
for now.
Thanks and closing this question.
Hi, I don't find any tutorial or documentation regarding how to use pg_jieba properly in production. For example, how to create index? Should I use
gin
orrum
? I heardrum
is much faster, but when I search online, I don't find anyone using rum index with pg_jieba at all. Thanks a lot for your help.