chatchat-space / Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent application built with Langchain and language models such as ChatGLM, Qwen, and Llama
Apache License 2.0

Why does python3.8 init_database.py --recreate-vs fail? #3516

Closed HZPHuangZePeng closed 6 months ago

HZPHuangZePeng commented 6 months ago

(py38) $ python3.8 init_database.py --recreate-vs
/home/daas/.local/lib/python3.8/site-packages/torch/cuda/__init__.py:138: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 11070). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
recreating all vector stores
2024-03-26 09:24:30,570 - faiss_cache.py[line:92] - INFO: loading vector store in 'samples/vector_store/bge-large-zh' from disk.
2024-03-26 09:24:31,002 - SentenceTransformer.py[line:66] - INFO: Load pretrained SentenceTransformer: /opt/program/projects/Know/bge-large-zh
2024-03-26 09:24:31,668 - embeddings_api.py[line:39] - ERROR: You seem to have cloned a repository without having git-lfs installed. Please install git-lfs and run git lfs install followed by git lfs pull in the folder you cloned.
AttributeError: 'NoneType' object has no attribute 'conjugate'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "init_database.py", line 107, in <module>
    folder2db(kb_names=args.kb_name, mode="recreate_vs", embed_model=args.embed_model)
  File "/opt/program/projects/Langchain-Chatchat/server/knowledge_base/migrate.py", line 121, in folder2db
    kb.create_kb()
  File "/opt/program/projects/Langchain-Chatchat/server/knowledge_base/kb_service/base.py", line 81, in create_kb
    self.do_create_kb()
  File "/opt/program/projects/Langchain-Chatchat/server/knowledge_base/kb_service/faiss_kb_service.py", line 51, in do_create_kb
    self.load_vector_store()
  File "/opt/program/projects/Langchain-Chatchat/server/knowledge_base/kb_service/faiss_kb_service.py", line 28, in load_vector_store
    return kb_faiss_pool.load_vector_store(kb_name=self.kb_name,
  File "/opt/program/projects/Langchain-Chatchat/server/knowledge_base/kb_cache/faiss_cache.py", line 102, in load_vector_store
    vector_store = self.new_vector_store(embed_model=embed_model, embed_device=embed_device)
  File "/opt/program/projects/Langchain-Chatchat/server/knowledge_base/kb_cache/faiss_cache.py", line 60, in new_vector_store
    vector_store = FAISS.from_documents([doc], embeddings, normalize_L2=True,distance_strategy="METRIC_INNER_PRODUCT")
  File "/home/daas/.local/lib/python3.8/site-packages/langchain_core/vectorstores.py", line 508, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
  File "/home/daas/.local/lib/python3.8/site-packages/langchain_community/vectorstores/faiss.py", line 965, in from_texts
    embeddings = embedding.embed_documents(texts)
  File "/opt/program/projects/Langchain-Chatchat/server/knowledge_base/kb_service/base.py", line 439, in embed_documents
    return normalize(embeddings).tolist()
  File "/opt/program/projects/Langchain-Chatchat/server/knowledge_base/kb_service/base.py", line 37, in normalize
    norm = np.linalg.norm(embeddings, axis=1)
  File "<__array_function__ internals>", line 200, in norm
  File "/home/daas/.local/lib/python3.8/site-packages/numpy/linalg/linalg.py", line 2541, in norm
    s = (x.conj() * x).real
TypeError: loop of ufunc does not support argument 0 of type NoneType which has no callable conjugate method
(py38) $

dosubot[bot] commented 6 months ago

Hey @HZPHuangZePeng, great to see you back! Hope you're doing well.

The error you're encountering is raised inside the embed_documents function in the base.py file. This function is supposed to produce embeddings for a list of documents, but according to the traceback the embeddings it obtained were None instead of a list of vectors. That None value is passed to the normalize function, which calls np.linalg.norm on it and fails, hence the error 'NoneType' object has no attribute 'conjugate'.
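One way to make this failure easier to read is to validate the embeddings before normalizing them. The following is only a minimal sketch under stated assumptions, not the project's code: `normalize_rows` is a hypothetical pure-Python stand-in for the `normalize` helper in base.py, and it raises a descriptive error instead of letting None reach the norm computation.

```python
import math
from typing import List, Optional, Sequence


def normalize_rows(
    embeddings: Optional[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """L2-normalize each row (hypothetical stand-in, not the project's helper)."""
    if embeddings is None:
        # Fail early with a readable message instead of a NumPy TypeError.
        raise ValueError(
            "embedding model returned no vectors; check that the model "
            "files were fully downloaded"
        )
    result = []
    for row in embeddings:
        norm = math.sqrt(sum(v * v for v in row))
        # Leave all-zero rows unchanged to avoid dividing by zero.
        result.append([v / norm for v in row] if norm else list(row))
    return result
```

With a guard like this, the run would stop with a message that points at the model rather than at NumPy internals.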

The log also points to a problem with the bge-large-zh model: its files appear not to have been downloaded or loaded correctly. The message "You seem to have cloned a repository without having git-lfs installed. Please install git-lfs and run git lfs install followed by git lfs pull in the folder you cloned." indicates that the Git Large File Storage (LFS) files, which likely include the model files, were not fetched when you cloned the repository.
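A quick way to confirm this is to look for files that are still Git LFS pointer stubs. A pointer is a small text file whose first line is `version https://git-lfs.github.com/spec/v1`. The sketch below is a hypothetical helper, not part of Langchain-Chatchat; the function name `find_lfs_pointers` is an assumption made for illustration.

```python
from pathlib import Path
from typing import List

# First line of a Git LFS pointer file, as defined by the Git LFS pointer spec.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir: str) -> List[Path]:
    """Return files under model_dir that are still Git LFS pointer stubs."""
    pointers = []
    for path in sorted(Path(model_dir).rglob("*")):
        if not path.is_file():
            continue
        # Read only as many bytes as the prefix; real weights can be huge.
        with path.open("rb") as fh:
            head = fh.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            pointers.append(path)
    return pointers
```

Calling `find_lfs_pointers("/opt/program/projects/Know/bge-large-zh")` (the model path shown in the log) and getting a non-empty list would mean the weights were never pulled.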

Here are some steps you can take to troubleshoot this issue:

  1. Check the embed_documents function: Make sure that this function is correctly generating the embeddings. You can add some debug prints to check the output of this function.

  2. Verify the input to embed_documents: Make sure that the input to this function, texts, is not None or empty. If it is, this could be the reason why the function is returning None.

  3. Check the bge-large-zh model: Make sure that this model is correctly loaded and functioning. The error message suggests that there might be an issue with this model.

  4. Install git-lfs: If you haven't already, install git-lfs and run git lfs install followed by git lfs pull in the folder you cloned. This should download the LFS files, which likely include the model files.

  5. Check the versions of your dependencies: Make sure that the versions of numpy and other dependencies are compatible with Langchain-Chatchat v0.2.10.
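For the dependency check in step 5, the installed versions can be listed with the standard library. This is a minimal sketch: the helper name `installed_versions` is hypothetical, and the package list is only a guess based on the modules that appear in the traceback. Which versions are actually compatible is not stated here, so compare the output against the project's own requirements file.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names assumed from the modules seen in the traceback.
    names = ["numpy", "torch", "langchain-core", "langchain-community",
             "sentence-transformers"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```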

You can also refer to similar issues in the Langchain-Chatchat repository for potential solutions.

I hope this helps! If you have any more information or if there's anything else I can do to assist, please let me know.

zRzRzRzRzRzRzR commented 6 months ago

The model files are incomplete.