RUC-NLPIR / FlashRAG

⚡FlashRAG: A Python Toolkit for Efficient RAG Research
https://arxiv.org/abs/2405.13576
MIT License
1.37k stars, 113 forks

Error when running simple_pipeline #104

Open jsxyhelu opened 5 days ago

jsxyhelu commented 5 days ago

Generating train split: 15000 examples [00:00, 416735.51 examples/s]
Retrieval process: 0%| | 0/1 [00:00<?, ?it/s]Use query: as retreival instruction
Retrieval process: 0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/FlashRAG/examples/quick_start/simple_pipeline.py", line 40, in <module>
    output_dataset = pipeline.run(test_data, do_eval=True)
  File "/root/FlashRAG/flashrag/pipeline/pipeline.py", line 87, in run
    retrieval_results = self.retriever.batch_search(input_query)
  File "/root/FlashRAG/flashrag/retriever/retriever.py", line 66, in wrapper
    results, scores = func(self, query_list, num, True)
  File "/root/FlashRAG/flashrag/retriever/retriever.py", line 97, in wrapper
    results, scores = func(self, query_list, num, True)
  File "/root/FlashRAG/flashrag/retriever/retriever.py", line 171, in batch_search
    return self._batch_search(*args, **kwargs)
  File "/root/FlashRAG/flashrag/retriever/retriever.py", line 345, in _batch_search
    batch_scores, batch_idxs = self.index.search(batch_emb, k=num)
  File "/root/miniconda3/envs/py310_chat/lib/python3.10/site-packages/faiss/class_wrappers.py", line 329, in replacement_search
    assert d == self.d
AssertionError

From this error alone I can't tell where the problem comes from; it looks related to faiss.

ignorejjj commented 5 days ago

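Are you using a retriever model and index file that match each other? This error occurs because the embedding dimension stored in the index file does not match the embedding dimension produced by the retriever used for search.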
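To illustrate the mismatch, here is a minimal pure-Python sketch. `ToyFlatIndex` is a hypothetical stand-in written for this example, not FlashRAG or faiss code; it only mimics the `assert d == self.d` check visible in the traceback. The dimensions 768 and 1024 are illustrative values, chosen on the assumption that they are the output sizes of e5-base-v2 and bge-large-zh-v1.5 respectively.

```python
from typing import List, Tuple


class ToyFlatIndex:
    """Hypothetical stand-in for a flat inner-product index (not the faiss API)."""

    def __init__(self, d: int) -> None:
        self.d = d  # embedding dimension, fixed when the index is built
        self.vectors: List[List[float]] = []

    def add(self, vecs: List[List[float]]) -> None:
        for v in vecs:
            assert len(v) == self.d
            self.vectors.append(v)

    def search(self, queries: List[List[float]], k: int) -> List[List[Tuple[float, int]]]:
        results = []
        for q in queries:
            d = len(q)
            assert d == self.d  # same kind of check that failed in the traceback
            # Score every stored vector by inner product with the query.
            scored = [(sum(a * b for a, b in zip(q, v)), i)
                      for i, v in enumerate(self.vectors)]
            scored.sort(key=lambda t: t[0], reverse=True)
            results.append(scored[:k])
        return results


index = ToyFlatIndex(d=768)                # index built from 768-dim embeddings
index.add([[0.1] * 768, [0.2] * 768])

ok = index.search([[1.0] * 768], k=1)      # query has the same dimension: works

try:
    index.search([[1.0] * 1024], k=1)      # 1024-dim query against a 768-dim index
    mismatch_raised = False
except AssertionError:
    mismatch_raised = True
```

The second search fails with a bare `AssertionError`, which is why the traceback alone gives no hint about the cause.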

jsxyhelu commented 5 days ago

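> Are you using a retriever model and index file that match each other? This error occurs because the embedding dimension stored in the index file does not match the embedding dimension produced by the retriever used for search.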

python simple_pipeline.py --model_path=/root/autodl-tmp/glm-4-9b-chat --retriever_path=/root/autodl-tmp/AI-ModelScope/bge-large-zh-v1___5

This is the command I ran. Where do I configure the "matching retriever model and index file"? Thanks!

ignorejjj commented 5 days ago

The provided index was built with e5, so to use it you must use e5-base-v2 as the retriever. To use a different retriever, you need to rebuild the index. See:

  1. https://github.com/RUC-NLPIR/FlashRAG/blob/main/docs/introduction_for_beginners_zh.md
  2. https://github.com/RUC-NLPIR/FlashRAG/blob/main/docs/building-index.md
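A guard along these lines could surface the problem with a readable message before any search runs. This is only a sketch: `check_dims` is a hypothetical helper, not part of FlashRAG. In practice the index dimension would presumably come from the loaded faiss index (its `d` attribute, the `self.d` in the traceback) and the embedding dimension from the length of one encoded query.

```python
def check_dims(index_dim: int, embedding_dim: int) -> None:
    """Raise a descriptive error if the retriever and the index disagree on dimension."""
    if index_dim != embedding_dim:
        raise ValueError(
            f"Index was built with {index_dim}-dim embeddings, but the retriever "
            f"produces {embedding_dim}-dim embeddings. Use the retriever the index "
            "was built with, or rebuild the index with the current retriever."
        )


# Illustrative values only (assumed sizes for the two retrievers in this thread).
check_dims(768, 768)       # matching pair: passes silently

try:
    check_dims(768, 1024)  # mismatched pair: raises ValueError
    message = ""
except ValueError as err:
    message = str(err)
```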