xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

Error: Could not instantiate the backend tokenizer #1993

Open LucisBaoshg opened 3 months ago

LucisBaoshg commented 3 months ago

System Info

Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Python 3.11.8

Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
transformers 4.43.3

Package Version


absl-py 2.1.0 accelerate 0.31.0 addict 2.4.0 aiobotocore 2.7.0 aiofiles 23.2.1 aiohttp 3.9.5 aioitertools 0.11.0 aioprometheus 23.12.0 aiosignal 1.3.1 alembic 1.13.2 aliyun-python-sdk-core 2.15.1 aliyun-python-sdk-kms 2.16.3 altair 5.3.0 annotated-types 0.7.0 anthropic 0.28.0 antlr4-python3-runtime 4.9.3 anyio 4.4.0 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 asttokens 2.4.1 async-lru 2.0.4 async-timeout 4.0.3 attrdict 2.0.1 attrs 23.2.0 audioread 3.0.1 auto_gptq 0.7.1 autoawq 0.2.5 autoawq_kernels 0.0.6 autopage 0.5.2 Babel 2.15.0 bcrypt 4.1.3 beautifulsoup4 4.12.3 bibtexparser 2.0.0b7 bitsandbytes 0.43.1 bleach 6.1.0 boto3 1.28.64 botocore 1.31.64 cdifflib 1.2.6 certifi 2024.6.2 cffi 1.16.0 cfgv 3.4.0 charset-normalizer 3.3.2 chatglm-cpp 0.3.2 chattts 0.1.1 click 8.1.7 cliff 4.7.0 clldutils 3.22.2 cloudpickle 3.0.0 cmaes 0.10.0 cmake 3.29.5 cmd2 2.4.3 colorama 0.4.6 coloredlogs 15.0.1 colorlog 6.8.2 comm 0.2.2 conformer 0.3.2 contourpy 1.2.1 controlnet-aux 0.0.7 crcmod 1.7 cryptography 42.0.8 csvw 3.3.0 cycler 0.12.1 Cython 3.0.10 datasets 2.18.0 debugpy 1.8.2 decorator 5.1.1 defusedxml 0.7.1 diffusers 0.25.0 dill 0.3.8 diskcache 5.6.3 distlib 0.3.8 distro 1.9.0 dlinfo 1.2.1 dnspython 2.6.1 ecdsa 0.19.0 editdistance 0.8.1 einops 0.8.0 einx 0.2.2 email_validator 2.1.1 encodec 0.1.1 executing 2.0.1 fastapi 0.110.3 fastapi-cli 0.0.4 fastjsonschema 2.20.0 ffmpy 0.3.2 filelock 3.14.0 FlagEmbedding 1.2.10 flatbuffers 24.3.25 fonttools 4.53.0 fqdn 1.5.1 frozendict 2.4.4 frozenlist 1.4.1 fsspec 2023.10.0 gast 0.5.4 gdown 5.2.0 gekko 1.1.1 gradio 4.26.0 gradio_client 0.15.1 greenlet 3.0.3 grpcio 1.65.1 h11 0.14.0 hf_transfer 0.1.6 hiredis 2.3.2 httpcore 1.0.5 httptools 0.6.1 httpx 0.27.0 huggingface-hub 0.23.3 humanfriendly 10.0 hydra-colorlog 1.2.0 hydra-core 1.3.2 hydra-optuna-sweeper 1.2.0 HyperPyYAML 1.2.2 identify 2.6.0 idna 3.7 imageio 2.34.1 importlib_metadata 7.1.0 importlib_resources 6.4.0 inflect 7.2.1 iniconfig 2.0.0 interegular 0.3.3 ipykernel 
6.29.5 ipython 8.26.0 ipywidgets 8.1.3 isodate 0.6.1 isoduration 20.11.0 jedi 0.19.1 Jinja2 3.1.4 jiter 0.4.1 jmespath 0.10.0 joblib 1.4.2 json5 0.9.25 jsonpointer 3.0.0 jsonschema 4.22.0 jsonschema-specifications 2023.12.1 jupyter_client 8.6.2 jupyter_core 5.7.2 jupyter-events 0.10.0 jupyter-lsp 2.2.5 jupyter_server 2.14.2 jupyter_server_terminals 0.5.3 jupyterlab 4.2.4 jupyterlab_pygments 0.3.0 jupyterlab_server 2.27.3 jupyterlab_widgets 3.0.11 kiwisolver 1.4.5 language-tags 1.2.0 lark 1.1.9 lazy_loader 0.4 libnacl 2.1.0 librosa 0.10.2.post1 lightning 2.3.3 lightning-utilities 0.11.6 litellm 1.40.15 llama_cpp_python 0.2.77 llvmlite 0.42.0 lm-format-enforcer 0.10.1 lxml 5.2.2 Mako 1.3.5 Markdown 3.6 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matcha-tts 0.0.5.1 matplotlib 3.9.0 matplotlib-inline 0.1.7 mdurl 0.1.2 mistune 3.0.2 modelscope 1.15.0 more-itertools 10.2.0 mpmath 1.3.0 msgpack 1.0.8 multidict 6.0.5 multiprocess 0.70.16 nbclient 0.10.0 nbconvert 7.16.4 nbformat 5.10.4 nemo_text_processing 1.0.2 nest-asyncio 1.6.0 networkx 3.3 ninja 1.11.1.1 nodeenv 1.9.1 notebook 7.2.1 notebook_shim 0.2.4 numba 0.59.1 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-ml-py 12.555.43 nvidia-nccl-cu12 2.20.5 nvidia-nvjitlink-cu12 12.5.40 nvidia-nvtx-cu12 12.1.105 omegaconf 2.3.0 onnxruntime 1.16.0 openai 1.33.0 openai-whisper 20231117 opencv-contrib-python 4.10.0.82 opencv-python 4.10.0.82 opencv-python-headless 4.10.0.82 optimum 1.21.2 optuna 2.10.1 orjson 3.10.3 oss2 2.18.5 outlines 0.0.34 overrides 7.7.0 packaging 24.1 pandas 2.2.2 pandocfilters 1.5.1 parso 0.8.4 passlib 1.7.4 pbr 6.0.0 peft 0.11.1 pexpect 4.9.0 phonemizer 3.2.1 pillow 10.3.0 pip 24.2 pip-review 1.3.0 piper-phonemize 1.1.0 platformdirs 4.2.2 pluggy 1.5.0 plumbum 1.8.3 
pooch 1.8.2 pre-commit 3.7.1 prettytable 3.10.2 prometheus_client 0.20.0 prometheus-fastapi-instrumentator 7.0.0 prompt_toolkit 3.0.47 protobuf 4.25.4 psutil 5.9.8 ptyprocess 0.7.0 pure_eval 0.2.3 py-cpuinfo 9.0.0 pyarrow 16.1.0 pyarrow-hotfix 0.6 pyasn1 0.6.0 pybase16384 0.3.7 pycparser 2.22 pycryptodome 3.20.0 pydantic 2.7.3 pydantic_core 2.18.4 pydub 0.25.1 Pygments 2.18.0 pylatexenc 2.10 pynini 2.1.5 pynvml 11.5.0 pyparsing 3.1.2 pyperclip 1.9.0 PySocks 1.7.1 pytest 8.3.2 python-dateutil 2.9.0.post0 python-dotenv 1.0.1 python-jose 3.3.0 python-json-logger 2.0.7 python-multipart 0.0.9 pytorch-lightning 2.3.3 pytz 2024.1 PyYAML 6.0.1 pyzmq 26.0.3 quantile-python 1.1 ray 2.24.0 rdflib 7.0.0 redis 5.0.7 referencing 0.35.1 regex 2024.5.15 requests 2.32.3 rfc3339-validator 0.1.4 rfc3986 1.5.0 rfc3986-validator 0.1.1 rich 13.7.1 rootutils 1.0.7 rouge 1.0.1 rpds-py 0.18.1 rpyc 6.0.0 rsa 4.9 ruamel.yaml 0.18.6 ruamel.yaml.clib 0.2.8 ruff 0.4.8 s3fs 2023.10.0 s3transfer 0.7.0 sacremoses 0.1.1 safetensors 0.4.3 scikit-image 0.23.2 scikit-learn 1.5.0 scipy 1.13.1 seaborn 0.13.2 segments 2.2.1 semantic-version 2.10.0 Send2Trash 1.8.3 sentence-transformers 3.0.1 sentencepiece 0.2.0 setuptools 70.0.0 sglang 0.1.17 shellingham 1.5.4 simplejson 3.19.2 six 1.16.0 sniffio 1.3.1 socksio 1.0.0 sortedcontainers 2.4.0 soundfile 0.12.1 soupsieve 2.5 soxr 0.3.7 SQLAlchemy 2.0.31 sse-starlette 2.1.0 stack-data 0.6.3 starlette 0.37.2 stevedore 5.2.0 sympy 1.12.1 tabulate 0.9.0 tblib 3.0.0 tensorboard 2.17.0 tensorboard-data-server 0.7.2 tensorizer 2.9.0 terminado 0.18.1 threadpoolctl 3.5.0 tifffile 2024.5.22 tiktoken 0.7.0 timm 1.0.3 tinycss2 1.3.0 tokenizers 0.19.1 tomli 2.0.1 tomlkit 0.12.0 toolz 0.12.1 torch 2.3.0 torchaudio 2.3.0 torchmetrics 1.4.0.post0 torchvision 0.18.0 tornado 6.4.1 tqdm 4.66.4 traitlets 5.14.3 transformers 4.43.3 transformers-stream-generator 0.0.5 triton 2.3.0 typeguard 4.3.0 typer 0.11.1 types-python-dateutil 2.9.0.20240316 typing_extensions 4.12.2 tzdata 
2024.1 ujson 5.10.0 Unidecode 1.3.8 uri-template 1.3.0 uritemplate 4.1.1 urllib3 2.0.7 uvicorn 0.30.1 uvloop 0.19.0 vector-quantize-pytorch 1.14.24 virtualenv 20.26.3 vllm 0.4.3 vllm-flash-attn 2.5.8.post2 vocos 0.1.0 watchfiles 0.22.0 wcwidth 0.2.13 webcolors 24.6.0 webencodings 0.5.1 websocket-client 1.8.0 websockets 11.0.3 Werkzeug 3.0.3 WeTextProcessing 1.0.1 wget 3.2 wheel 0.43.0 widgetsnbextension 4.0.11 wrapt 1.16.0 xformers 0.0.26.post1 xinference 0.13.3 xoscar 0.3.0 xxhash 3.4.1 yapf 0.40.2 yarl 1.9.4 zipp 3.19.2 zmq 0.0.0 zstandard 0.22.0

Running Xinference with Docker?

Version info

0.13.3

The command used to start Xinference

xinference-local -H 0.0.0.0

Reproduction


Server error: 400 - [address=0.0.0.0:43423, pid=200222] Couldn't instantiate the backend tokenizer from one of: (1) a 'tokenizers' library serialization file, (2) a slow tokenizer instance to convert or (3) an equivalent slow tokenizer class to instantiate and convert. You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

Expected behavior

The llama-3.1-instruct model starts normally.

qinxuye commented 3 months ago

pip install sentencepiece

LucisBaoshg commented 3 months ago

pip install sentencepiece

Requirement already satisfied: sentencepiece in ./anaconda3/envs/xinference/lib/python3.11/site-packages (0.2.0)
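
Since pip reports sentencepiece as already installed, one thing that may be worth ruling out is that the interpreter running the Xinference server is not the one pip installed into. The following is only a minimal diagnostic sketch, not a confirmed fix for this issue: it prints the active interpreter path and the versions of the packages named in the error message. The helper name `installed_version` is hypothetical and uses only the Python standard library.

```python
import importlib.metadata
import importlib.util
import sys


def installed_version(dist_name, module_name=None):
    """Return the installed version of a package, or None if it cannot be imported here.

    dist_name is the name pip knows the package by; module_name is the
    import name (assumed to be the same unless given).
    """
    module_name = module_name or dist_name
    # find_spec returns None when the top-level module is not importable
    # from the interpreter that is running this script.
    if importlib.util.find_spec(module_name) is None:
        return None
    try:
        return importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        # Importable, but no distribution metadata was found.
        return "unknown"


if __name__ == "__main__":
    # Compare this path with the environment that xinference-local runs in.
    print("interpreter:", sys.executable)
    for name in ("sentencepiece", "tokenizers", "transformers"):
        print(name, "->", installed_version(name))
```

Running this with the same `python` that launches `xinference-local` shows whether sentencepiece is visible to that process. If it prints `None` for sentencepiece, the package is installed in a different environment from the one serving the model.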