X-D-Lab / LangChain-ChatGLM-Webui

Automated question answering over local knowledge bases, built on LangChain and LLMs such as ChatGLM-6B
Apache License 2.0
3.13k stars · 471 forks

Local deployment cannot find the model #107

Open Roy202307 opened 1 year ago

Roy202307 commented 1 year ago

(venv) PS D:\python\LangChain-ChatGLM-Webui-master> python app.py
No sentence-transformers model found with name C:\Users\Administrator/.cache\torch\sentence_transformers\GanymedeNil_text2vec-base-chinese. Creating a new one with MEAN pooling.
No sentence-transformers model found with name D:\python\LangChain-ChatGLM-Webui-master\model_cache\GanymedeNil/text2vec-base-chinese\GanymedeNil_text2vec-base-chinese. Creating a new one with MEAN pooling.
Symbol nvrtcGetCUBIN not found in D:\NAVDIA_GPU\Toolkit\CUDA\v11.0\bin\nvrtc64_110_0.dll
Symbol nvrtcGetCUBINSize not found in D:\NAVDIA_GPU\Toolkit\CUDA\v11.0\bin\nvrtc64_110_0.dll
Symbol cudaLaunchKernel not found in C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common\cudart64_65.dll
No compiled kernel found.
Compiling kernels : C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.c -shared -o C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.so
'gcc' is not recognized as an internal or external command, operable program or batch file.
Compile default cpu kernel failed, using default cpu kernel code.
Compiling gcc -O3 -fPIC -std=c99 C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels.c -shared -o C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels.so
'gcc' is not recognized as an internal or external command, operable program or batch file.
Compile default cpu kernel failed.
Failed to load kernel.
Cannot load cpu kernel, don't use quantized model on cpu.
Using quantization cache
Applying quantization to glm layers
The dtype of attention mask (torch.int64) is not bool
Running on local URL: http://0.0.0.0:7860
Running on public URL: https://6d7cff42c16ffdbf72.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
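The "Symbol nvrtcGetCUBIN not found" lines usually mean the installed PyTorch build and the local CUDA toolkit (v11.0 here) do not match, so inference silently falls back to the CPU. A minimal sanity check, assuming nothing beyond a standard PyTorch install:

```python
# Quick check of the PyTorch/CUDA pairing (plain PyTorch, nothing
# project-specific). On a CPU-only wheel, torch.version.cuda is None.
import torch

print("torch version :", torch.__version__)
print("built for CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
```

If "CUDA available" prints False, the quantized int4 model will try to run on the CPU, which is why the gcc compile steps appear in the log above.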

====================================================================

All of the above is what the console prints after running python app.py. Opening the public URL shows: "The model failed to load; please re-select a model and click the 'Reload model' button." Without changing anything, I clicked "Reload model", and the console printed another long batch of messages:

No sentence-transformers model found with name C:\Users\Administrator/.cache\torch\sentence_transformers\GanymedeNil_text2vec-base-chinese. Creating a new one with MEAN pooling.
No sentence-transformers model found with name D:\python\LangChain-ChatGLM-Webui-master\model_cache\GanymedeNil/text2vec-base-chinese\GanymedeNil_text2vec-base-chinese. Creating a new one with MEAN pooling.
No compiled kernel found.
Compiling kernels : C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.c -shared -o C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.so
'gcc' is not recognized as an internal or external command, operable program or batch file.
Compile default cpu kernel failed, using default cpu kernel code.
Compiling gcc -O3 -fPIC -std=c99 C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels.c -shared -o C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels.so
'gcc' is not recognized as an internal or external command, operable program or batch file.
Compile default cpu kernel failed.
Failed to load kernel.
Cannot load cpu kernel, don't use quantized model on cpu.
Using quantization cache
Applying quantization to glm layers

However, the page then reports "The model has been reloaded successfully; you can start the conversation." As soon as I enter a message to start chatting, the page shows ERROR and the console prints:

RuntimeError: Error in __cdecl faiss::FileIOReader::FileIOReader(const char *) at D:\a\faiss-wheels\faiss-wheels\faiss\faiss\impl\io.cpp:68: Error: 'f' failed: could not open faiss_index\index.faiss for reading: No such file or directory

What is causing this?
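For reference, the RuntimeError means the app tried to read faiss_index\index.faiss before any vector index had been written to disk: that file only exists after a knowledge document has been embedded and saved. A minimal sketch of that lifecycle using LangChain's FAISS wrapper (the folder name faiss_index is taken from the traceback and the embedding model name from the logs above; this is not this repo's exact code):

```python
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

embeddings = HuggingFaceEmbeddings(model_name="GanymedeNil/text2vec-base-chinese")

# Build and persist the index once, e.g. after uploading a knowledge file.
texts = ["示例文档内容"]  # placeholder documents
db = FAISS.from_texts(texts, embeddings)
db.save_local("faiss_index")  # writes faiss_index/index.faiss and index.pkl

# Only after save_local() has run can a later load succeed:
db = FAISS.load_local("faiss_index", embeddings)
```

So an ERROR on the very first chat suggests the question was sent before any document was uploaded and vectorized, or the index build failed silently amid the model-loading errors above.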

snuffcn commented 11 months ago

Same situation here.

Yanllan commented 7 months ago

Your error output shows that the model is running on the CPU. To run ChatGLM-6B on the CPU, follow these steps (a code sketch of the final change is below):

1. Install the gcc compiler. Running on the CPU requires both gcc and OpenMP.

2. Modify the configuration. In ChatGLM-6B/quantization.py, comment out `from cpm_kernels.kernels.base import LazyKernelCModule, KernelFunction, round_up`, comment out `kernels = Kernel(…)`, and replace it with `kernels = CPUKernel()`. Then delete the cached files under your .cache directory, e.g. C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels.

3. Finally, in cli_demo.py change `model = AutoModel.from_pretrained("THUDM\ChatGLM-6B", trust_remote_code=True).half().cuda()` to `model = AutoModel.from_pretrained("THUDM\ChatGLM-6B", trust_remote_code=True).float()`.
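As a concrete illustration of step 3, a minimal CPU loading sketch. It assumes gcc and OpenMP are already installed (steps 1–2) and uses the chatglm-6b-int4 checkpoint that appears in the logs, with the Hub id written with forward slashes:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True)

# .float() keeps the weights in FP32 on the CPU; the original
# .half().cuda() requires a working CUDA setup, which is exactly
# what fails in the logs above.
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True).float()
model = model.eval()

response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```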

Roy202307 commented 7 months ago

Thanks, I'll go try that. Much appreciated!
