infiniflow / infinity

The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vector, sparse vector, tensor (multi-vector), and full-text
https://infiniflow.org
Apache License 2.0
2.44k stars, 266 forks

[Question]: Why do I need to delete and rebuild the index every time I perform a full-text search? #1620

Closed: xxzhang0927 closed this issue 1 month ago

xxzhang0927 commented 1 month ago

Is there an existing issue for the same bug?

Version or Commit ID

v0.3.0

Other environment information

No response

Actual behavior and How to reproduce it

import infinity
import infinity.common
import infinity.index

# NOTE: `table` is a table handle obtained from an Infinity connection;
# the connection code was not included in the original post.


def add_index():
    table.drop_index("my_index")
    # The index is created on the spot.
    res = table.create_index(
        "my_index",
        [
            infinity.index.IndexInfo("content", infinity.index.IndexType.FullText, [infinity.index.InitParameter("ANALYZER", "chinese")]),
        ],
        infinity.common.ConflictType.Error,
    )


def query_index():
    res = table.show_index("my_index")
    print(res)


def query_full_text():
    # The test queries are kept in Chinese because the index uses the "chinese" analyzer.
    # question = "间接服务角色"
    question = "社会工作者 小燕 肢体障碍人士 就业 创业 成长计划 项目执行 资源筹措者 角色 反思 重建自信 创业园区 咨询服务 微信公众号"
    # question = "社会工作者小燕肢体障碍人士就业创业成长计划项目执行资源筹措者角色反思重建自信创业园区咨询服务微信公众号"
    # question = "10岁女孩小丽,父亲因为犯罪行为被关进监狱,母亲在她很小的时候就离家出走,她由爷爷奶奶照顾。一年前爷爷去世,奶奶72岁,脚残疾,小丽身体状况很差,经常生病,学习困难,性格内向,经常被同学欺负。她从不会把这些情况告诉老师,怕给老师惹麻烦,社会工作者可以运用理论来帮助这个小女孩。\nA.社会支持,B.人本主义,C精神分析,D.认知行为"
    qb_result = (
        table.output(["content", "_score"])
        .match("content", question)
        .to_pl()
    )
    print(f"question: {question}")
    print(qb_result)


if __name__ == '__main__':
    add_index()
    query_full_text()

If I don't call add_index() first, the result is empty.

Expected behavior

No response

Additional information

No response

yuzhichang commented 1 month ago

You don't need to delete and rebuild the index before every full-text search.

Fix your script to build the index once and search multiple times. For example:

if __name__ == '__main__':
    add_index()
    query_full_text()
    query_full_text()
    query_full_text()

The script doesn't delete the index at the end, so another script can search directly without building the index again:

if __name__ == '__main__':
    query_full_text()
    query_full_text()
    query_full_text()
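
As a rough sketch of "build once, search many times", the query step can be factored into a helper that never touches the index. It reuses only the call chain from the original post (`output`, `match`, `to_pl`). The helper name `run_queries` and its `column` parameter are made up for illustration, and `table` is assumed to be a handle to a table whose full-text index already exists.

```python
def run_queries(table, questions, column="content"):
    """Run several full-text queries against an already-indexed table.

    `table` is assumed to expose the same chain used in the original post:
    table.output([...]).match(column, text).to_pl().
    This helper deliberately never drops or creates an index.
    """
    results = []
    for question in questions:
        # One search per question; the index is only read, never rebuilt.
        frame = table.output([column, "_score"]).match(column, question).to_pl()
        results.append((question, frame))
    return results
```

Called with a real table handle, this returns one (question, result) pair per query, and index creation stays in a separate one-time setup step.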

xxzhang0927 commented 1 month ago

Why is the result empty when I comment out add_index() and run the script again after restarting? How do I create a persistent index? [image]

yuzhichang commented 1 month ago

Create the index per the spec. With ConflictType.Ignore, the call does nothing if the index already exists:

res = table.create_index(
    "my_index",
    [
        infinity.index.IndexInfo("content", infinity.index.IndexType.FullText, [infinity.index.InitParameter("ANALYZER", "chinese")]),
    ],
    infinity.common.ConflictType.Ignore,
)
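
To make the difference between the two conflict types in this thread concrete, here is a toy in-memory model. It is not the Infinity SDK: `ToyTable` and the local `ConflictType` enum are hypothetical stand-ins, and raising an exception on `Error` is an assumption (the thread does not show how the real SDK reports the conflict).

```python
from enum import Enum


class ConflictType(Enum):
    """Toy stand-in covering only the two values used in this thread."""
    Ignore = "ignore"
    Error = "error"


class ToyTable:
    """Hypothetical in-memory table that only tracks index names."""

    def __init__(self):
        self.indexes = {}

    def create_index(self, name, spec, conflict_type):
        if name in self.indexes:
            if conflict_type is ConflictType.Ignore:
                # The existing index is left untouched.
                return "ignored"
            # Assumption: a conflict under Error surfaces as an exception.
            raise RuntimeError(f"index {name!r} already exists")
        self.indexes[name] = spec
        return "created"


table = ToyTable()
spec = ("content", "FullText", {"ANALYZER": "chinese"})

first = table.create_index("my_index", spec, ConflictType.Ignore)
second = table.create_index("my_index", spec, ConflictType.Ignore)

try:
    table.create_index("my_index", spec, ConflictType.Error)
    third = "created"
except RuntimeError:
    third = "error"

print(first, second, third)  # prints: created ignored error
```

Under this model, a setup step that uses Ignore can be run any number of times without a preceding drop_index call.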

xxzhang0927 commented 1 month ago

It still doesn't work. [image]

yangzq50 commented 1 month ago

Can you show the full baitong_query.py?

warran2 commented 1 month ago

I also have the same problem.

JinHai-CN commented 1 month ago

> I also have the same problem.

You can search as soon as the index has been created. After restarting the server or reconnecting to the Infinity instance, you don't need to create the index again.

Could you please share your Python scripts to help me understand your problem?

warran2 commented 1 month ago

Thanks, I have posted a new issue.

yuzhichang commented 1 month ago

Replaced with #1691

JinHai-CN commented 1 month ago

Fixed by #1698