1. System environment
1.1 OS: CentOS 9
1.2 Python version: 3.10
1.3 Langchain-Chatchat version: 0.3.1.3
1.4 Embedding model: bge-large-zh-v1.5
1.5 LLM: glm-4-9b-chat
2. Xinference
2.1 Created the Xinference virtual environment (python3 -m venv venv_xinference).
2.2 Installed packages (pip list):
Package Version
accelerate 1.1.0 aiofiles 23.2.1 aioprometheus 23.12.0 annotated-types 0.7.0 anyio 4.6.2.post1 async-timeout 5.0.0 bcrypt 4.2.0 certifi 2024.8.30 cffi 1.17.1 charset-normalizer 3.4.0 click 8.1.7 cloudpickle 3.1.0 cryptography 43.0.3 distro 1.9.0 ecdsa 0.19.0 exceptiongroup 1.2.2 fastapi 0.115.4 ffmpy 0.4.0 filelock 3.16.1 fsspec 2024.10.0 gradio 5.4.0 gradio_client 1.4.2 h11 0.14.0 httpcore 1.0.6 httpx 0.27.2 huggingface-hub 0.26.2 idna 3.10 Jinja2 3.1.4 jiter 0.7.0 joblib 1.4.2 markdown-it-py 3.0.0 MarkupSafe 2.1.5 mdurl 0.1.2 modelscope 1.19.2 mpmath 1.3.0 networkx 3.4.2 numpy 2.1.3 nvidia-cublas-cu12 12.4.5.8 nvidia-cuda-cupti-cu12 12.4.127 nvidia-cuda-nvrtc-cu12 12.4.127 nvidia-cuda-runtime-cu12 12.4.127 nvidia-cudnn-cu12 9.1.0.70 nvidia-cufft-cu12 11.2.1.3 nvidia-curand-cu12 10.3.5.147 nvidia-cusolver-cu12 11.6.1.9 nvidia-cusparse-cu12 12.3.1.170 nvidia-ml-py 12.560.30 nvidia-nccl-cu12 2.21.5 nvidia-nvjitlink-cu12 12.4.127 nvidia-nvtx-cu12 12.4.127 openai 1.53.0 orjson 3.10.11 packaging 24.1 pandas 2.2.3 passlib 1.7.4 peft 0.13.2 pillow 11.0.0 pip 24.3.1 psutil 6.1.0 pyasn1 0.6.1 pycparser 2.22 pydantic 2.9.2 pydantic_core 2.23.4 pydub 0.25.1 Pygments 2.18.0 python-dateutil 2.9.0.post0 python-jose 3.3.0 python-multipart 0.0.12 pytz 2024.2 PyYAML 6.0.2 quantile-python 1.1 regex 2024.9.11 requests 2.32.3 rich 13.9.4 rsa 4.9 ruff 0.7.2 safehttpx 0.1.1 safetensors 0.4.5 scikit-learn 1.5.2 scipy 1.14.1 semantic-version 2.10.0 sentence-transformers 3.2.1 setuptools 65.5.0 shellingham 1.5.4 six 1.16.0 sniffio 1.3.1 sse-starlette 2.1.3 starlette 0.41.2 sympy 1.13.1 tabulate 0.9.0 tblib 3.0.0 threadpoolctl 3.5.0 tiktoken 0.8.0 timm 1.0.11 tokenizers 0.20.2 tomlkit 0.12.0 torch 2.5.1 torchvision 0.20.1 tqdm 4.66.6 transformers 4.46.1 triton 3.1.0 typer 0.12.5 typing_extensions 4.12.2 tzdata 2024.2 urllib3 2.2.3 uvicorn 0.32.0 uvloop 0.21.0 websockets 12.0 xinference 0.16.2 xoscar 0.4.0
2.3 Started Xinference: Xinference runs normally, and conversations through Xinference's built-in chat feature work as expected.
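The built-in chat only exercises Xinference's own web UI; Langchain-Chatchat will instead call Xinference through its OpenAI-compatible API at http://127.0.0.1:9997/v1 (the api_base_url configured in 3.3 below). The following is a minimal sketch to confirm the chat and embedding endpoints answer there as well, using the openai client; the model UIDs "glm-4-9b-chat" and "bge-large-zh-v1.5" are assumptions and should match whatever UIDs the models were launched under in Xinference:

```python
from openai import OpenAI

# Xinference's OpenAI-compatible endpoint, as configured in model_settings.yaml.
client = OpenAI(base_url="http://127.0.0.1:9997/v1", api_key="EMPTY")

# List the models Xinference actually serves on this endpoint.
print([m.id for m in client.models.list().data])

# Chat completion against the LLM (model UID assumed to be "glm-4-9b-chat").
resp = client.chat.completions.create(
    model="glm-4-9b-chat",
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)

# Embedding request against the embedding model (UID assumed to be "bge-large-zh-v1.5").
emb = client.embeddings.create(model="bge-large-zh-v1.5", input="ping")
print(len(emb.data[0].embedding))
```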
3. Langchain-Chatchat
3.1 Created the Langchain-Chatchat virtual environment (python3 -m venv venv_Langchain).
3.2 Installed packages (pip list):
Package Version
aiohappyeyeballs 2.4.3 aiohttp 3.10.10 aiosignal 1.3.1 altair 4.2.2 annotated-types 0.7.0 antlr4-python3-runtime 4.9.3 anyio 4.6.2.post1 async-timeout 4.0.3 attrs 24.2.0 backoff 2.2.1 beautifulsoup4 4.12.3 blinker 1.8.2 Brotli 1.1.0 cachetools 5.5.0 certifi 2024.8.30 cffi 1.17.1 chardet 5.2.0 charset-normalizer 3.4.0 click 8.1.7 coloredlogs 15.0.1 contourpy 1.3.0 cryptography 43.0.3 cycler 0.12.1 dataclasses-json 0.6.7 deepdiff 8.0.1 Deprecated 1.2.14 deprecation 2.1.0 distro 1.9.0 effdet 0.4.1 emoji 2.14.0 entrypoints 0.4 et_xmlfile 2.0.0 exceptiongroup 1.2.2 faiss-cpu 1.7.4 faiss-gpu 1.7.2 Faker 30.8.2 fastapi 0.109.2 favicon 0.7.0 filelock 3.16.1 filetype 1.2.0 flatbuffers 24.3.25 fonttools 4.54.1 frozenlist 1.5.0 fsspec 2024.10.0 gitdb 4.0.11 GitPython 3.1.43 greenlet 3.1.1 h11 0.14.0 h2 4.1.0 hpack 4.0.0 htbuilder 0.6.2 httpcore 1.0.6 httpx 0.27.2 huggingface-hub 0.26.2 humanfriendly 10.0 hyperframe 6.0.1 idna 3.10 iopath 0.1.10 jieba 0.42.1 Jinja2 3.1.4 jiter 0.7.0 joblib 1.4.2 jsonpatch 1.33 jsonpath-python 1.0.6 jsonpointer 3.0.0 jsonschema 4.23.0 jsonschema-specifications 2024.10.1 kiwisolver 1.4.7 langchain 0.1.17 langchain-chatchat 0.3.1.3 langchain-community 0.0.36 langchain-core 0.1.53 langchain-experimental 0.0.58 langchain-openai 0.0.6 langchain-text-splitters 0.0.2 langchainhub 0.1.14 langdetect 1.0.9 langsmith 0.1.139 layoutparser 0.3.4 loguru 0.7.2 lxml 5.3.0 Markdown 3.7 markdown-it-py 3.0.0 markdownify 0.13.1 markdownlit 0.0.7 MarkupSafe 3.0.2 marshmallow 3.23.1 matplotlib 3.9.2 mdurl 0.1.2 memoization 0.4.0 more-itertools 10.5.0 mpmath 1.3.0 multidict 6.1.0 mypy-extensions 1.0.0 nest-asyncio 1.6.0 networkx 3.1 nltk 3.8.1 numexpr 2.10.1 numpy 1.24.4 nvidia-cublas-cu12 12.4.5.8 nvidia-cuda-cupti-cu12 12.4.127 nvidia-cuda-nvrtc-cu12 12.4.127 nvidia-cuda-runtime-cu12 12.4.127 nvidia-cudnn-cu12 9.1.0.70 nvidia-cufft-cu12 11.2.1.3 nvidia-curand-cu12 10.3.5.147 nvidia-cusolver-cu12 11.6.1.9 nvidia-cusparse-cu12 12.3.1.170 nvidia-nccl-cu12 2.21.5 nvidia-nvjitlink-cu12 12.4.127 nvidia-nvtx-cu12 12.4.127 omegaconf 2.3.0 onnx 1.17.0 onnxruntime 1.15.1 openai 1.53.0 opencv-python 4.10.0.84 openpyxl 3.1.4 orderly-set 5.2.2 orjson 3.10.11 packaging 23.2 pandas 1.5.3 pathlib 1.0.1 pdf2image 1.17.0 pdfminer.six 20231228 pdfplumber 0.11.4 pikepdf 9.4.0 pillow 10.4.0 pip 24.3.1 portalocker 2.10.1 prometheus_client 0.21.0 propcache 0.2.0 protobuf 4.25.5 pyarrow 18.0.0 pyclipper 1.3.0.post6 pycocotools 2.0.8 pycparser 2.22 pydantic 2.7.4 pydantic_core 2.18.4 pydantic-settings 2.3.4 pydeck 0.9.1 Pygments 2.18.0 PyJWT 2.8.0 pymdown-extensions 10.12 PyMuPDF 1.23.26 PyMuPDFb 1.23.22 PyMySQL 1.1.1 pyparsing 3.2.0 pypdf 5.1.0 pypdfium2 4.30.0 pytesseract 0.3.13 python-dateutil 2.9.0.post0 python-decouple 3.8 python-docx 1.1.2 python-dotenv 1.0.1 python-iso639 2024.10.22 python-magic 0.4.27 python-multipart 0.0.9 pytz 2024.2 PyYAML 6.0.2 rank-bm25 0.2.2 RapidFuzz 3.10.1 rapidocr-onnxruntime 1.3.25 referencing 0.35.1 regex 2024.9.11 requests 2.31.0 requests-toolbelt 1.0.0 rich 13.9.4 rpds-py 0.20.1 ruamel.yaml 0.18.6 ruamel.yaml.clib 0.2.12 safetensors 0.4.5 scipy 1.14.1 setuptools 65.5.0 shapely 2.0.6 simplejson 3.19.3 six 1.16.0 smmap 5.0.1 sniffio 1.3.1 socksio 1.0.0 soupsieve 2.6 SQLAlchemy 2.0.36 sse-starlette 1.8.2 st-annotated-text 4.0.1 starlette 0.36.3 streamlit 1.34.0 streamlit-aggrid 1.0.5 streamlit-antd-components 0.3.1 streamlit-camera-input-live 0.2.0 streamlit-card 1.0.2 streamlit-chatbox 1.1.12.post4 streamlit-embedcode 0.1.2 streamlit-extras 0.4.2 streamlit-faker 0.0.3 
streamlit-feedback 0.1.3 streamlit-image-coordinates 0.1.9 streamlit-keyup 0.2.4 streamlit_modal 0.1.0 streamlit-option-menu 0.3.12 streamlit-paste-button 0.1.2 streamlit-toggle-switch 1.0.2 streamlit-vertical-slider 2.5.5 strsimpy 0.2.1 sympy 1.13.1 tabulate 0.9.0 tenacity 8.5.0 tiktoken 0.8.0 timm 1.0.11 tokenizers 0.20.2 toml 0.10.2 toolz 1.0.0 torch 2.5.1 torchvision 0.20.1 tornado 6.4.1 tqdm 4.66.6 transformers 4.46.1 triton 3.1.0 types-requests 2.32.0.20241016 typing_extensions 4.12.2 typing-inspect 0.9.0 unstructured 0.11.8 unstructured-client 0.25.9 unstructured-inference 0.7.18 unstructured.pytesseract 0.3.13 urllib3 2.2.3 uvicorn 0.32.0 validators 0.34.0 watchdog 6.0.0 websockets 13.1 wrapt 1.16.0 xinference-client 0.13.3 yarl 1.17.1
3.3 model_settings.yaml:

    # Model configuration

    # Default LLM model name
    DEFAULT_LLM_MODEL: autodl-tmp-glm-4-9b-chat-id

    # Default embedding model name
    DEFAULT_EMBEDDING_MODEL: bge-large-zh-v1.5

    # Name of the AgentLM model (optional; once set it is locked in as the model for chains entered from the Agent; if unset, DEFAULT_LLM_MODEL is used)
    Agent_MODEL: ''

    # Default number of history turns
    HISTORY_LEN: 3

    # Maximum length supported by the LLM; if left empty the model's own default maximum is used, otherwise the user-specified value
    MAX_TOKENS:

    # General LLM chat parameter
    TEMPERATURE: 0.7

    # Supported Agent models
    SUPPORT_AGENT_MODELS:

    # LLM model configuration, including initialization parameters for the different modalities.
    # `model`: if left empty, DEFAULT_LLM_MODEL is used automatically
    LLM_MODEL_CONFIG:
      preprocess_model:
        model: ''
        temperature: 0.05
        max_tokens: 4096
        history_len: 10
        prompt_name: default
        callbacks: false
      llm_model:
        model: ''
        temperature: 0.9
        max_tokens: 4096
        history_len: 10
        prompt_name: default
        callbacks: true
      action_model:
        model: ''
        temperature: 0.01
        max_tokens: 4096
        history_len: 10
        prompt_name: ChatGLM3
        callbacks: true
      postprocess_model:
        model: ''
        temperature: 0.01
        max_tokens: 4096
        history_len: 10
        prompt_name: default
        callbacks: true
      image_model:
        model: sd-turbo
        size: 256*256

    # Model-serving platform configuration

    # Platform name
    platform_name: xinference

    # Platform type
    # One of: ['xinference', 'ollama', 'oneapi', 'fastchat', 'openai', 'custom openai']
    platform_type: xinference

    # openai api url
    api_base_url: http://127.0.0.1:9997/v1

    # api key if available
    api_key: EMPTY

    # API proxy
    api_proxy: ''

    # Maximum concurrent requests per model on this platform
    api_concurrencies: 5

    # Whether to fetch the platform's available model list automatically. When set to True, the model lists below are detected automatically
    auto_detect_model: false

    # LLM models available on this platform; detected automatically when auto_detect_model is True
    llm_models: []

    # Embedding models available on this platform; detected automatically when auto_detect_model is True
    embed_models: []

    # Text-to-image models available on this platform; detected automatically when auto_detect_model is True
    text2image_models: []

    # Multimodal (image-to-text) models available on this platform; detected automatically when auto_detect_model is True
    image2text_models: []

    # Rerank models available on this platform; detected automatically when auto_detect_model is True
    rerank_models: []

    # Speech-to-text (STT) models available on this platform; detected automatically when auto_detect_model is True
    speech2text_models: []

    # Text-to-speech (TTS) models available on this platform; detected automatically when auto_detect_model is True
    text2speech_models: []

    MODEL_PLATFORMS:
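Since the pasted configuration is hard to read back and ends at MODEL_PLATFORMS: with no entries shown beneath it, it may help to print what the loaded file actually contains for the keys involved in the errors of 3.5 below. This is a minimal sketch with PyYAML (already in the pip list); the file path is an assumption and should point at the model_settings.yaml that the running instance actually reads:

```python
import yaml

# Path to the model_settings.yaml that the running instance loads
# (assumption: adjust to the actual chatchat data/config directory).
CONFIG_PATH = "model_settings.yaml"

with open(CONFIG_PATH, encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

# Values relevant to the two errors reported in 3.5.
print("DEFAULT_LLM_MODEL      :", cfg.get("DEFAULT_LLM_MODEL"))
print("DEFAULT_EMBEDDING_MODEL:", cfg.get("DEFAULT_EMBEDDING_MODEL"))

# Each platform entry should carry its own api_base_url and model lists;
# an empty or missing MODEL_PLATFORMS list means no platform is registered.
for platform in cfg.get("MODEL_PLATFORMS") or []:
    print(platform.get("platform_name"), platform.get("platform_type"),
          platform.get("api_base_url"),
          "embed_models:", platform.get("embed_models"))
```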
3.4 Startup: normal.
3.5 Usage: errors.
Multi-function chat error:
An error occurred during streaming
2024-11-05 14:22:50.570 | ERROR | chatchat.server.api_server.openai_routes:generator:105 - openai request error: An error occurred during streaming
RAG chat error:
failed to access embed model 'quentinz/bge-large-zh-v1.5': Error raised by inference endpoint: HTTPConnectionPool(host='127.0.0.1', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f31a39f7460>: Failed to establish a new connection: [Errno 111] Connection refused'))
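Two details in the RAG error can be cross-checked against 3.3: the embedding request went to 127.0.0.1:11434 (the default Ollama port) rather than the configured api_base_url on port 9997, and the model id in the error ('quentinz/bge-large-zh-v1.5') differs from the configured DEFAULT_EMBEDDING_MODEL (bge-large-zh-v1.5). Below is a minimal sketch to see which of the two endpoints is actually reachable from this host, using requests (already installed); /v1/models is the OpenAI-compatible model list served by Xinference and /api/tags is Ollama's model list:

```python
import requests

# Endpoint from model_settings.yaml (Xinference) vs. endpoint seen in the traceback (Ollama default).
ENDPOINTS = {
    "xinference (configured)": "http://127.0.0.1:9997/v1/models",
    "ollama (from traceback)": "http://127.0.0.1:11434/api/tags",
}

for name, url in ENDPOINTS.items():
    try:
        r = requests.get(url, timeout=3)
        print(f"{name}: HTTP {r.status_code} at {url}")
    except requests.exceptions.ConnectionError:
        # [Errno 111] Connection refused lands here when nothing is listening on the port.
        print(f"{name}: connection refused at {url}")
```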