PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0
12.11k stars 2.94k forks

[Question]: Using the semantic search system provided by PaddleNLP, why do queries return no results when running the demo semantic_search_example.py? #7308

Closed dingidng closed 11 months ago

dingidng commented 1 year ago

Please ask your question

python examples/semantic-search/semantic_search_example.py --device gpu

I ran into this problem on both Windows and Linux. For reference, the project guide is here: https://github.com/PaddlePaddle/PaddleNLP/blob/develop/pipelines/examples/semantic-search/Neural_Search.md

The output I get:

Query: 亚马逊河流的介绍

Querying: 100%|████████████████| 2/2 [00:00<00:00, 857.73it/s]
Ranking: 0it [00:00, ?it/s]

Query: 亚马逊河流的介绍

Query: 期货交易手续费指的是什么?

Query: dev621.txt,五笔

(The queries translate to "An introduction to the Amazon River", "What does a futures trading fee refer to?", and "dev621.txt, Wubi".) Each query is echoed back, but no retrieved documents are listed under it.

fymaplefish commented 1 year ago

It looks like one line of code in the paddle-pipelines package is wrong. Open site-packages/pipelines/document_stores/faiss.py and change it as follows.

Original code:
vector_ids_for_query = [str(vector_id) + "_" + index for vector_id in vector_id_matrix[0] if vector_id != -1]

Change it to:
vector_ids_for_query = [str(vector_id) for vector_id in vector_id_matrix[0] if vector_id != -1]

Applies to: paddle-pipelines==0.6.1
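To illustrate why that one-line change could matter, here is a minimal sketch that does not use paddle-pipelines or FAISS at all. It assumes the document store keeps documents keyed by plain string vector IDs ("0", "1", ...), and that the original line appends the index name to each ID before the lookup. The `stored_docs` dict, the `lookup` helper, the sample IDs, the index name "document", and the "_" separator are all hypothetical stand-ins, not the library's actual internals.

```python
# Hypothetical stand-in for the store that maps vector IDs to documents.
# Assumption: IDs were saved as plain strings such as "0", "1", "2".
stored_docs = {
    "0": "Doc about the Amazon River",
    "1": "Doc about futures trading fees",
    "2": "Doc about the Wubi input method",
}

# Hypothetical search output: one row of vector IDs per query,
# padded with -1 where fewer neighbours were found.
vector_id_matrix = [[1, 0, -1]]
index = "document"  # assumed index name, for illustration only


def lookup(vector_ids):
    """Return the documents whose stored vector ID matches exactly."""
    return [stored_docs[v] for v in vector_ids if v in stored_docs]


# Behaviour of the original line: the index name is appended to every ID.
suffixed_ids = [str(vector_id) + "_" + index
                for vector_id in vector_id_matrix[0] if vector_id != -1]

# Behaviour of the suggested line: IDs are used as plain strings.
plain_ids = [str(vector_id)
             for vector_id in vector_id_matrix[0] if vector_id != -1]

suffixed_hits = lookup(suffixed_ids)
plain_hits = lookup(plain_ids)

print(suffixed_ids, suffixed_hits)
print(plain_ids, plain_hits)
```

Under these assumptions the suffixed IDs match nothing in the store, so the retriever would hand zero candidates to the ranker, which is consistent with the "Ranking: 0it" line in the question. Whether this is the actual cause in paddle-pipelines 0.6.1 depends on how the IDs were written at indexing time.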