chatchat-space / Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent application built on Langchain and language models such as ChatGLM, Qwen, and Llama
Apache License 2.0
31.62k stars 5.51k forks

Error when running python init_database.py --recreate-vs #3694

Closed 1431551850 closed 4 months ago

1431551850 commented 6 months ago

I am using Milvus, with langchain_community at version 0.0.19.

Traceback (most recent call last):
  File "D:\pythonproject\Chatchat0.2.10\init_database.py", line 107, in <module>
    folder2db(kb_names=args.kb_name, mode="recreate_vs", embed_model=args.embed_model)
  File "D:\pythonproject\Chatchat0.2.10\server\knowledge_base\migrate.py", line 128, in folder2db
    files2vs(kb_name, kb_files)
  File "D:\pythonproject\Chatchat0.2.10\server\knowledge_base\migrate.py", line 113, in files2vs
    kb.add_doc(kb_file=kb_file, not_refresh_vs_cache=True)
  File "D:\pythonproject\Chatchat0.2.10\server\knowledge_base\kb_service\base.py", line 131, in add_doc
    doc_infos = self.do_add_doc(docs, **kwargs)
  File "D:\pythonproject\Chatchat0.2.10\server\knowledge_base\kb_service\milvus_kb_service.py", line 83, in do_add_doc
    ids = self.milvus.add_documents(docs)
  File "D:\pythonproject\Chatchat0.2.10\venv\lib\site-packages\langchain_core\vectorstores.py", line 119, in add_documents
    return self.add_texts(texts, metadatas, **kwargs)
  File "D:\pythonproject\Chatchat0.2.10\venv\lib\site-packages\langchain_community\vectorstores\milvus.py", line 520, in add_texts
    assert isinstance(
AssertionError: A list of valid ids are required when auto_id is False.

dosubot[bot] commented 6 months ago

Hey there, @1431551850! Great to see you diving into another challenge. Let's tackle this one together. 🚀

The error you're encountering is due to the auto_id parameter being set to False, which requires a list of valid IDs to be provided for the documents being added. This is indicated by the AssertionError in the add_texts function within the langchain_community\vectorstores\milvus.py file.
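To make the failing check concrete, here is a minimal sketch of the kind of guard the traceback points to. It is a simplified stand-in rather than the library's actual source: the function name `add_texts_sketch` and the placeholder `auto-N` keys are assumptions made for illustration, and only the assertion message is taken from the traceback above.

```python
from typing import List, Optional


def add_texts_sketch(
    texts: List[str],
    ids: Optional[List[str]] = None,
    auto_id: bool = False,
) -> List[str]:
    """Simplified stand-in for the ID check; not the library's real code."""
    if not auto_id:
        # With auto_id=False the caller must supply the primary keys.
        assert isinstance(ids, list), (
            "A list of valid ids are required when auto_id is False."
        )
        return ids
    # With auto_id=True the store would assign keys itself.
    # Placeholder keys are returned here purely for illustration.
    return [f"auto-{i}" for i in range(len(texts))]
```

Calling `add_texts_sketch(["some text"])` with the defaults trips the assertion, which mirrors the situation in the traceback: `add_documents` was called without IDs while `auto_id` was `False`.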

To address this issue, you have a few options:

  1. Adjust the auto_id Setting: If possible, set auto_id = True to allow Milvus to automatically generate IDs for the documents. This adjustment should be made in the initialization of Milvus within your project's codebase, specifically where the Milvus vector store is being configured.

  2. Provide a List of IDs: If you prefer or need to use custom IDs (with auto_id set to False), ensure that your code is correctly generating or retrieving a list of IDs for the documents before they are added to the database.

  3. Check Code Implementation: Review the implementation of the folder2db function, particularly how it handles document IDs when mode="recreate_vs". Ensure it aligns with your ID management strategy (auto-generated vs. custom IDs).

  4. Update Dependencies: Make sure all related dependencies, especially those for Milvus and Langchain, are up to date. Compatibility issues between versions could also lead to this error.

If after trying these steps the issue persists, providing more details about your usage of the folder2db function and the context in which the error occurs could help in diagnosing the problem further.
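Option 2 above could look roughly like the following sketch, which generates one unique string ID per document. The helper `make_ids` is hypothetical, and the commented-out call assumes that the installed `langchain_community` version accepts an `ids` keyword and that the collection uses string primary keys; neither is confirmed in this thread, so check both before relying on it.

```python
import uuid
from typing import List, Sequence


def make_ids(docs: Sequence[object]) -> List[str]:
    """Return one unique string ID per document (hypothetical helper)."""
    return [uuid.uuid4().hex for _ in docs]


# Plain strings stand in for the Document objects built by the project.
docs = ["first chunk", "second chunk", "third chunk"]
ids = make_ids(docs)

# Assumed call site inside do_add_doc (not verified against the project):
# ids = self.milvus.add_documents(docs, ids=ids)
```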

1431551850 commented 6 months ago

KeyError: 'pk'. Downgrade to langchain-community-0.0.19.

SuperTangW commented 5 months ago

When I got the KeyError: 'pk' error, langchain-community was already at version 0.0.19, so what do you mean by downgrading to langchain-community-0.0.19? Did you make a typo?

tanpengxiong commented 4 months ago

KeyError: 'pk'. Downgrade to langchain-community-0.0.19.

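How do I downgrade? I ran into this problem too. I edited the source file langchain_community\vectorstores\milvus.py, changing auto_id = False to auto_id = True. That made the original error go away, but a new one appeared: KeyError: 'pk'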
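On the downgrade question: pinning a package to a specific release is normally done with `pip install langchain-community==0.0.19`. Since the thread is unclear about which version was actually installed when each error appeared, a small check like the sketch below can confirm what the active environment has. The helper name `installed_version` is an assumption for illustration, and nothing here establishes that 0.0.19 resolves the KeyError.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string, or None if the package is not installed.
print(installed_version("langchain-community"))
```

Run it with the same interpreter (for example the project's venv) that runs init_database.py, otherwise it may report a different environment's packages.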

danny-zhu commented 4 months ago

In server.knowledge_base.kb_service.milvus_kb_service.MilvusKBService._load_milvus, pass auto_id=True when initializing Milvus, as follows:

def _load_milvus(self):
    self.milvus = Milvus(
        embedding_function=EmbeddingsFunAdapter(self.embed_model),
        collection_name=self.kb_name,
        connection_args=kbs_config.get("milvus"),
        index_params=kbs_config.get("milvus_kwargs")["index_params"],
        search_params=kbs_config.get("milvus_kwargs")["search_params"],
        auto_id=True,  # let Milvus generate primary keys
    )