When I ran these commands:
model=openchat/openchat-3.5-0106-gemma
volume=$PWD/data
docker run --gpus all --shm-size 1g -p 9090:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:latest --model-id $model
I got this error:
File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 120, in init
raise ValueError(
ValueError: Couldn't instantiate the backend tokenizer from one of:
(1) a tokenizers library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.
The same commands work fine when I use openchat/openchat-3.5-0106 instead.
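For reference, a minimal way to check whether the tokenizer itself loads outside the container would be something like this (a sketch, assuming transformers, and optionally sentencepiece, are installed in the local Python environment):

from transformers import AutoTokenizer

# Try to build the fast tokenizer for the gemma variant directly;
# this should raise the same ValueError if no tokenizer.json is shipped
# with the model and sentencepiece is missing for the slow-to-fast conversion.
tok = AutoTokenizer.from_pretrained("openchat/openchat-3.5-0106-gemma")
print(type(tok))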