bug: AttributeError: can't set attribute 'eos_token'

bentoml / OpenLLM

2023-11-27T18:22:06+0800 [ERROR] [runner:llm-chatglm-runner:1] Application startup failed. Exiting.
Loading checkpoint shards: 0%| | 0/7 [00:00<?, ?it/s]
Loading checkpoint shards: 14%|█▍ | 1/7 [00:01<00:10, 1.81s/it]
Loading checkpoint shards: 29%|██▊ | 2/7 [00:03<00:09, 1.81s/it]
Loading checkpoint shards: 43%|████▎ | 3/7 [00:05<00:07, 1.79s/it]
Loading checkpoint shards: 57%|█████▋ | 4/7 [00:07<00:05, 1.75s/it]
Loading checkpoint shards: 71%|███████▏ | 5/7 [00:08<00:03, 1.75s/it]
Loading checkpoint shards: 86%|████████▌ | 6/7 [00:10<00:01, 1.71s/it]
Loading checkpoint shards: 100%|██████████| 7/7 [00:11<00:00, 1.45s/it]
Loading checkpoint shards: 100%|██████████| 7/7 [00:11<00:00, 1.63s/it]
2023-11-27T18:22:32+0800 [ERROR] [runner:llm-chatglm-runner:1] An exception occurred while instantiating runner 'llm-chatglm-runner', see details below:
2023-11-27T18:22:32+0800 [ERROR] [runner:llm-chatglm-runner:1] Traceback (most recent call last):
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 307, in init_local
    self._set_handle(LocalRunnerRef)
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 150, in _set_handle
    runner_handle = handle_class(self, *args, **kwargs)
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner_handle/local.py", line 27, in __init__
    self._runnable = runner.runnable_class(**runner.runnable_init_params) # type: ignore
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/openllm/_runners.py", line 172, in __init__
    self.llm, self.config, self.model, self.tokenizer = llm, llm.config, llm.model, llm.tokenizer
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/openllm/_llm.py", line 326, in tokenizer
    if self.__llm_tokenizer__ is None: self.__llm_tokenizer__ = openllm.serialisation.load_tokenizer(self, **self.llm_parameters[-1])
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/openllm/serialisation/__init__.py", line 44, in load_tokenizer
    tokenizer = AutoTokenizer.from_pretrained(
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 755, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2024, in from_pretrained
    return cls._from_pretrained(
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2256, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/NAS/user/songjie/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 108, in __init__
    super().__init__(padding_side=padding_side, clean_up_tokenization_spaces=clean_up_tokenization_spaces,
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 363, in __init__
    super().__init__(**kwargs)
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1604, in __init__
    super().__init__(**kwargs)
  File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 861, in __init__
    setattr(self, key, value)
AttributeError: can't set attribute 'eos_token'

  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/starlette/routing.py", line 705, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/root/miniconda3/envs/openllm/lib/python3.10/contextlib.py", line 199, in __aenter__
    return await anext(self.gen)
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/bentoml/_internal/server/base_app.py", line 75, in lifespan
    on_startup()
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 317, in init_local
    raise e
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 307, in init_local
    self._set_handle(LocalRunnerRef)
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 150, in _set_handle
    runner_handle = handle_class(self, *args, **kwargs)
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner_handle/local.py", line 27, in __init__
    self._runnable = runner.runnable_class(**runner.runnable_init_params) # type: ignore
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/openllm/_runners.py", line 172, in __init__
    self.llm, self.config, self.model, self.tokenizer = llm, llm.config, llm.model, llm.tokenizer
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/openllm/_llm.py", line 326, in tokenizer
    if self.__llm_tokenizer__ is None: self.__llm_tokenizer__ = openllm.serialisation.load_tokenizer(self, **self.llm_parameters[-1])
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/openllm/serialisation/__init__.py", line 44, in load_tokenizer
    tokenizer = AutoTokenizer.from_pretrained(
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 755, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2024, in from_pretrained
    return cls._from_pretrained(
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2256, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 108, in __init__
    super().__init__(padding_side=padding_side, clean_up_tokenization_spaces=clean_up_tokenization_spaces,
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 363, in __init__
    super().__init__(**kwargs)
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1604, in __init__
    super().__init__(**kwargs)
  File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 861, in __init__
    setattr(self, key, value)
AttributeError: can't set attribute 'eos_token'
2023-11-29T18:25:36+0800 [ERROR] [runner:llm-chatglm-runner:1] Application startup failed. Exiting.

I have also encountered the same problem.
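
Both tracebacks above end in the same frame: a base tokenizer `__init__` calls `setattr(self, key, value)` with `key` equal to `eos_token`, and Python refuses the assignment. Python raises this kind of `AttributeError` when the attribute is a property with no setter. The sketch below reproduces only that mechanism with two hypothetical stand-in classes (`BaseTokenizer` and `ReadOnlyEosTokenizer` are made up for illustration). Whether `tokenization_chatglm.py` actually defines `eos_token` as a read-only property is an assumption that the logs here do not confirm.

```python
class BaseTokenizer:
    """Hypothetical stand-in for a base class that stores every kwarg as an attribute."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            # Same kind of call as the last frame of the traceback.
            setattr(self, key, value)


class ReadOnlyEosTokenizer(BaseTokenizer):
    """Hypothetical subclass exposing eos_token as a property without a setter."""

    @property
    def eos_token(self):
        return "</s>"


def try_build(**kwargs):
    """Build the tokenizer and report whether attribute assignment failed."""
    try:
        ReadOnlyEosTokenizer(**kwargs)
    except AttributeError as exc:
        # The exact message wording differs between Python versions.
        return f"AttributeError: {exc}"
    return "ok"


# Passing eos_token collides with the read-only property.
print(try_build(eos_token="</s>"))
# A kwarg that is not shadowed by a property is assigned normally.
print(try_build(padding_side="left"))
```

If this is what happens here, the failure would depend on the combination of the model's remote tokenizer code and the installed transformers version rather than on the start command.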

env:

Fri Jan  5 16:28:57 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.223.02   Driver Version: 470.223.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  Off  | 00000000:61:00.0 Off |                    0 |
| N/A   31C    P0    36W / 250W |  12780MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-PCI...  Off  | 00000000:DB:00.0 Off |                    0 |
| N/A   35C    P0    31W / 250W |      3MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    142549      C   ...nda3/envs/chat/bin/python    12777MiB |
+-----------------------------------------------------------------------------+

python: 3.10
openllm: 0.4.41

My start command is:

cmd="env CUDA_VISIBLE_DEVICES=0,1 OPENBLAS_NUM_THREADS=1 TRUST_REMOTE_CODE=True openllm start /data/models/chatglm3-6b  --backend pt -p 3333"
nohup $cmd > chatglm.log 2>&1 &

pip list:

 pip list
Package                                      Version
-------------------------------------------- ---------------
accelerate                                   0.25.0
aiohttp                                      3.9.1
aioprometheus                                23.12.0
aiosignal                                    1.3.1
annotated-types                              0.6.0
anyio                                        4.2.0
appdirs                                      1.4.4
asgiref                                      3.7.2
asttokens                                    2.4.1
async-timeout                                4.0.3
attrs                                        23.2.0
beautifulsoup4                               4.12.2
bentoml                                      1.1.11
bitsandbytes                                 0.41.3.post2
build                                        0.10.0
cattrs                                       23.1.2
certifi                                      2023.11.17
charset-normalizer                           3.3.2
circus                                       0.18.0
click                                        8.1.7
click-option-group                           0.5.6
cloudpickle                                  3.0.0
cmake                                        3.28.1
coloredlogs                                  15.0.1
contextlib2                                  21.6.0
cpm-kernels                                  1.0.11
cuda-python                                  12.3.0
dashscope                                    1.13.6
dataclasses-json                             0.6.3
datasets                                     2.16.1
decorator                                    5.1.1
deepmerge                                    1.1.1
Deprecated                                   1.2.14
dill                                         0.3.7
distlib                                      0.3.8
distro                                       1.9.0
docarray                                     0.40.0
einops                                       0.7.0
exceptiongroup                               1.2.0
executing                                    2.0.1
fastapi                                      0.108.0
fastcore                                     1.5.29
filelock                                     3.13.1
filetype                                     1.2.0
frozenlist                                   1.4.1
fs                                           2.4.16
fsspec                                       2023.10.0
ghapi                                        1.0.4
greenlet                                     3.0.3
grpcio                                       1.60.0
h11                                          0.14.0
html2text                                    2020.1.16
httpcore                                     1.0.2
httptools                                    0.6.1
httpx                                        0.26.0
huggingface-hub                              0.20.1
humanfriendly                                10.0
idna                                         3.6
importlib-metadata                           6.11.0
inflection                                   0.5.1
ipython                                      8.19.0
jedi                                         0.19.1
Jinja2                                       3.1.2
joblib                                       1.3.2
jsonpatch                                    1.33
jsonpointer                                  2.4
jsonschema                                   4.20.0
jsonschema-specifications                    2023.12.1
langchain                                    0.0.354
langchain-community                          0.0.8
langchain-core                               0.1.6
langsmith                                    0.0.77
lit                                          17.0.6
llama-hub                                    0.0.66
llama-index                                  0.9.25
loguru                                       0.7.2
markdown-it-py                               3.0.0
MarkupSafe                                   2.1.3
marshmallow                                  3.20.1
matplotlib-inline                            0.1.6
mdurl                                        0.1.2
mpmath                                       1.3.0
msgpack                                      1.0.7
multidict                                    6.0.4
multiprocess                                 0.70.15
mypy-extensions                              1.0.0
nest-asyncio                                 1.5.8
networkx                                     3.2.1
ninja                                        1.11.1.1
nltk                                         3.8.1
numpy                                        1.26.3
nvidia-cublas-cu12                           12.1.3.1
nvidia-cuda-cupti-cu12                       12.1.105
nvidia-cuda-nvrtc-cu12                       12.1.105
nvidia-cuda-runtime-cu12                     12.1.105
nvidia-cudnn-cu12                            8.9.2.26
nvidia-cufft-cu12                            11.0.2.54
nvidia-curand-cu12                           10.3.2.106
nvidia-cusolver-cu12                         11.4.5.107
nvidia-cusparse-cu12                         12.1.0.106
nvidia-ml-py                                 11.525.150
nvidia-nccl-cu12                             2.18.1
nvidia-nvjitlink-cu12                        12.3.101
nvidia-nvtx-cu12                             12.1.105
openai                                       1.6.1
openllm                                      0.4.41
openllm-client                               0.4.41
openllm-core                                 0.4.41
opentelemetry-api                            1.20.0
opentelemetry-instrumentation                0.41b0
opentelemetry-instrumentation-aiohttp-client 0.41b0
opentelemetry-instrumentation-asgi           0.41b0
opentelemetry-sdk                            1.20.0
opentelemetry-semantic-conventions           0.41b0
opentelemetry-util-http                      0.41b0
optimum                                      1.16.1
orjson                                       3.9.10
packaging                                    23.2
pandas                                       2.1.4
parso                                        0.8.3
pathspec                                     0.12.1
pexpect                                      4.9.0
pillow                                       10.2.0
pip                                          23.3.2
pip-requirements-parser                      32.0.1
pip-tools                                    7.3.0
platformdirs                                 4.1.0
prometheus-client                            0.19.0
prompt-toolkit                               3.0.43
protobuf                                     4.25.1
psutil                                       5.9.7
ptyprocess                                   0.7.0
pure-eval                                    0.2.2
pyaml                                        23.12.0
pyarrow                                      14.0.2
pyarrow-hotfix                               0.6
pydantic                                     1.10.13
pydantic_core                                2.14.6
Pygments                                     2.17.2
pyparsing                                    3.1.1
pyproject_hooks                              1.0.0
python-dateutil                              2.8.2
python-dotenv                                1.0.0
python-json-logger                           2.0.7
python-multipart                             0.0.6
pytz                                         2023.3.post1
PyYAML                                       6.0.1
pyzmq                                        25.1.2
quantile-python                              1.1
ray                                          2.6.0
referencing                                  0.32.0
regex                                        2023.12.25
requests                                     2.31.0
retrying                                     1.3.4
rich                                         13.7.0
rpds-py                                      0.16.2
safetensors                                  0.4.1
schema                                       0.7.5
scipy                                        1.11.4
sentencepiece                                0.1.99
setuptools                                   69.0.3
simple-di                                    0.1.5
six                                          1.16.0
sniffio                                      1.3.0
soupsieve                                    2.5
SQLAlchemy                                   2.0.25
stack-data                                   0.6.3
starlette                                    0.32.0.post1
sympy                                        1.12
tenacity                                     8.2.3
tiktoken                                     0.5.2
tokenizers                                   0.15.0
tomli                                        2.0.1
torch                                        2.0.1+cu117
torchaudio                                   2.0.2+cu117
torchvision                                  0.15.2+cu117
tornado                                      6.4
tqdm                                         4.66.1
traitlets                                    5.14.1
transformers                                 4.36.2
transformers-stream-generator                0.0.4
triton                                       2.0.0
types-requests                               2.31.0.20231231
typing_extensions                            4.9.0
typing-inspect                               0.9.0
tzdata                                       2023.4
urllib3                                      2.1.0
uvicorn                                      0.25.0
uvloop                                       0.19.0
virtualenv                                   20.25.0
vllm                                         0.2.6
watchfiles                                   0.21.0
wcwidth                                      0.2.12
websockets                                   12.0
wheel                                        0.42.0
wrapt                                        1.16.0
xformers                                     0.0.23.post1
xxhash                                       3.4.1
yarl                                         1.9.4
zipp                                         3.17.0

bug: AttributeError: can't set attribute 'eos_token' #736
