yanqiangmiffy / Chinese-LangChain

Chinese LangChain project | 小必应, Q.Talk, 强聊, QiangTalk
2.67k stars, 323 forks

Unable to use the "Load knowledge base" feature. #2

Open xmxoxo opened 1 year ago

xmxoxo commented 1 year ago

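Unable to use the "Load knowledge base" feature

After starting the service, only model inference works.

Clicking "Load knowledge base" (加载知识库) fails with the message "Knowledge base initialization did not load successfully" (初始化知识库未成功加载).

Error output:

Error in faiss::FileIOReader::FileIOReader(const char*) at /project/faiss/faiss/impl/io.cpp:67: Error: 'f' failed: could not open cache/index.faiss for reading: No such file or directory

I checked the source tree and found no cache directory.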
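
The error above says that cache/index.faiss does not exist, so the load step is trying to read an index that was never built. A minimal sketch of a pre-flight check follows. The helper name `missing_index_files` is hypothetical, and the second file name, index.pkl, is an assumption based on LangChain's default FAISS save layout; only index.faiss is confirmed by the error message.

```python
from pathlib import Path
from typing import List

# Assumption: the vector store directory is expected to hold these two files.
# Only index.faiss is named in the error message; index.pkl is assumed from
# LangChain's default FAISS save layout.
EXPECTED_FILES = ("index.faiss", "index.pkl")


def missing_index_files(index_dir: str) -> List[str]:
    """Return the expected index files that are absent from index_dir."""
    root = Path(index_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_index_files("cache")
    if missing:
        print("Index not built yet, missing:", ", ".join(missing))
    else:
        print("Index files found, loading should be possible.")
```

Running a check like this before loading would turn the low-level faiss error into a clear "build the index first" message.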

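File upload feature

The "Upload file to knowledge base" (上传文件到知识库) control in the lower left saves uploaded files to the docs directory, but files with Chinese names end up with garbled names when uploaded to Linux.

After uploading, it is unclear how to use the file; the knowledge base still cannot be loaded.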
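
The thread does not establish where the file names get garbled (the upload path, the process encoding, or only the terminal display). As a first diagnostic, one could check whether the Python process is able to encode a Chinese file name in its filesystem encoding. This is a sketch under that assumption, and the helper name `encodable` is hypothetical.

```python
import sys


def encodable(name: str, encoding: str) -> bool:
    """Return True if name can be encoded without loss in the given encoding."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    fs_encoding = sys.getfilesystemencoding()
    print("filesystem encoding:", fs_encoding)
    # The sample name is taken from the docs directory shown later in the thread.
    print("sample Chinese name encodable:", encodable("姚明.txt", fs_encoding))
```

If the check passes, the name is probably stored correctly and the garbling is more likely a display or transfer issue.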

yanqiangmiffy commented 1 year ago

Fixed.

config = LangChainCFG()
application = LangChainApplication(config)

application.source_service.init_source_vector()

lanybass commented 1 year ago

The online demo at https://huggingface.co/spaces/ChallengeHub/Chinese-LangChain also fails to load.

yandun72 commented 1 year ago

Same here.

nmww commented 1 year ago

python app.py
No compiled kernel found.
Compiling kernels : /root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 /root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.c -shared -o /root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.so
Kernels compiled : /root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.so
Load kernel : /root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.so
Setting CPU quantization kernel threads to 16
Using quantization cache
Applying quantization to glm layers
No sentence-transformers model found with name /root/.cache/torch/sentence_transformers/GanymedeNil_text2vec-large-chinese. Creating a new one with MEAN pooling.
姚明.txt
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /root/Chinese-LangChain/app.py:25 in <module>
│
│   22
│   23 config = LangChainCFG()
│   24 application = LangChainApplication(config)
│ ❱ 25 application.source_service.init_source_vector()
│   26
│   27
│   28 def get_file_list():
│
│ /root/Chinese-LangChain/clc/source_service.py:39 in init_source_vector
│
│   36             if doc.endswith('.txt'):
│   37                 print(doc)
│   38                 loader = UnstructuredFileLoader(f'{self.docs_path}/{doc}', mode="element
│ ❱ 39                 doc = loader.load()
│   40                 docs.extend(doc)
│   41         self.vector_store = FAISS.from_documents(docs, self.embeddings)
│   42         self.vector_store.save_local(self.vector_store_path)
│
│ /root/miniconda3/lib/python3.8/site-packages/langchain/document_loaders/unstructured.py:61 in load
│
│   58
│   59     def load(self) -> List[Document]:
│   60         """Load file."""
│ ❱ 61         elements = self._get_elements()
│   62         if self.mode == "elements":
│   63             docs: List[Document] = list()
│   64             for element in elements:
│
│ /root/miniconda3/lib/python3.8/site-packages/langchain/document_loaders/unstructured.py:93 in _get_elements
│
│   90         super().__init__(mode=mode, **unstructured_kwargs)
│   91
│   92     def _get_elements(self) -> List:
│ ❱ 93         from unstructured.partition.auto import partition
│   94
│   95         return partition(filename=self.file_path, **self.unstructured_kwargs)
│   96
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ModuleNotFoundError: No module named 'unstructured.partition.auto'

root@autodl-container-ebe411a150-49ec68c6:~/Chinese-LangChain# python app.py
No compiled kernel found.
Compiling kernels : /root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 /root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.c -shared -o /root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.so
Kernels compiled : /root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.so
Load kernel : /root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.so
Setting CPU quantization kernel threads to 16
Using quantization cache
Applying quantization to glm layers
No sentence-transformers model found with name /root/.cache/torch/sentence_transformers/GanymedeNil_text2vec-large-chinese. Creating a new one with MEAN pooling.
王治郅.txt
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /root/Chinese-LangChain/app.py:36 in <module>
│
│   33
│   34
│   35 file_list = get_file_list()
│ ❱ 36 application.source_service.init_source_vector()
│   37
│   38 def upload_file(file):
│   39     if not os.path.exists("docs"):
│
│ /root/Chinese-LangChain/clc/source_service.py:39 in init_source_vector
│
│   36             if doc.endswith('.txt'):
│   37                 print(doc)
│   38                 loader = UnstructuredFileLoader(f'{self.docs_path}/{doc}', mode="element
│ ❱ 39                 doc = loader.load()
│   40                 docs.extend(doc)
│   41         self.vector_store = FAISS.from_documents(docs, self.embeddings)
│   42         self.vector_store.save_local(self.vector_store_path)
│
│ /root/miniconda3/lib/python3.8/site-packages/langchain/document_loaders/unstructured.py:61 in load
│
│   58
│   59     def load(self) -> List[Document]:
│   60         """Load file."""
│ ❱ 61         elements = self._get_elements()
│   62         if self.mode == "elements":
│   63             docs: List[Document] = list()
│   64             for element in elements:
│
│ /root/miniconda3/lib/python3.8/site-packages/langchain/document_loaders/unstructured.py:93 in _get_elements
│
│   90         super().__init__(mode=mode, **unstructured_kwargs)
│   91
│   92     def _get_elements(self) -> List:
│ ❱ 93         from unstructured.partition.auto import partition
│   94
│   95         return partition(filename=self.file_path, **self.unstructured_kwargs)
│   96
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ModuleNotFoundError: No module named 'unstructured.partition.auto'
root@autodl-container-ebe411a150-49ec68c6:~/Chinese-LangChain#
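
Both runs stop at the same import: LangChain's loader needs `unstructured.partition.auto`, and Python cannot find it. The log alone does not show whether the `unstructured` package is absent or is installed in a version that lacks that submodule. A small sketch for telling the two cases apart before starting the app follows; the helper name `module_available` is hypothetical.

```python
import importlib.util


def module_available(dotted_name: str) -> bool:
    """Return True if the module can be located without fully importing it.

    find_spec imports parent packages, so a missing parent raises
    ModuleNotFoundError (a subclass of ImportError); that is treated as "not available".
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    for name in ("unstructured", "unstructured.partition.auto"):
        print(name, "->", "found" if module_available(name) else "missing")
```

If `unstructured` is reported missing, installing it is the likely remedy. If only the submodule is missing, the installed version probably does not match what this LangChain release expects. Neither case is confirmed in the thread.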