xorbitsai / inference

Error: Could not instantiate the backend tokenizer #1993

opened 4 months ago

LucisBaoshg commented 4 months ago

System Info

Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Python 3.11.8

Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

Package       Version
transformers  4.43.3

Running Xinference with Docker?

Version info


The command used to start Xinference

xinference-local -H

Reproduction


Server error: 400 - [address=, pid=200222] Couldn't instantiate the backend tokenizer from one of: (1) a 'tokenizers' library serialization file, (2) a slow tokenizer instance to convert or (3) an equivalent slow tokenizer class to instantiate and convert. You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

Expected behavior


qinxuye commented 4 months ago

pip install sentencepiece

LucisBaoshg commented 3 months ago

pip install sentencepiece

Requirement already satisfied: sentencepiece in ./anaconda3/envs/xinference/lib/python3.11/site-packages (0.2.0)
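
Since pip reports sentencepiece as already installed, one thing worth ruling out is that the Python interpreter running xinference-local is a different one from the environment pip just checked. The following is only a minimal diagnostic sketch, not a confirmed fix for this issue: `check_modules` is a hypothetical helper written for this thread, and the list of module names is an assumption about what the tokenizer conversion needs. Run it with the same interpreter that starts Xinference.

```python
import importlib.util
import sys


def check_modules(names):
    """Map each module name to where the current interpreter finds it.

    The value is None when the module cannot be found at all.
    """
    report = {}
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec can raise for broken or half-initialised packages;
            # treat that the same as "not importable".
            spec = None
        if spec is None:
            report[name] = None
        else:
            # Namespace packages have no single origin file.
            report[name] = spec.origin or "(namespace package)"
    return report


if __name__ == "__main__":
    # The interpreter path shows which environment is actually in use.
    print("interpreter:", sys.executable)
    # Assumed list of relevant packages; adjust as needed.
    modules = ["sentencepiece", "tokenizers", "transformers"]
    for name, location in check_modules(modules).items():
        print(f"{name}: {location or 'NOT FOUND'}")
```

If the printed interpreter path is not under anaconda3/envs/xinference, or sentencepiece shows as NOT FOUND, the server is probably running from a different environment than the one pip checked. If everything is found, the cause lies elsewhere and this check only rules out the environment mismatch.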