bentoml / OpenLLM

Run any open-source LLM, such as Llama 3.1 or Gemma, as an OpenAI-compatible API endpoint in the cloud.
https://bentoml.com
Apache License 2.0

bug: AttributeError: can't set attribute 'eos_token' #736

Closed lovivi closed 2 months ago

lovivi commented 10 months ago

Describe the bug

When I run TRUST_REMOTE_CODE=True openllm start /NAS/user/songjie/software/llm/chatglm3-6b on a P40 card, the model fails to load properly. The build succeeds with both the pt and vllm backends, but an error is reported whenever a request is sent.

The error is shown under Logs below.

To reproduce

TRUST_REMOTE_CODE=True openllm start /NAS/user/songjie/software/llm/chatglm3-6b

Logs

2023-11-27T18:22:06+0800 [ERROR] [runner:llm-chatglm-runner:1] Application startup failed. Exiting.

Loading checkpoint shards:   0%|          | 0/7 [00:00<?, ?it/s]
Loading checkpoint shards:  14%|█▍        | 1/7 [00:01<00:10,  1.81s/it]
Loading checkpoint shards:  29%|██▊       | 2/7 [00:03<00:09,  1.81s/it]
Loading checkpoint shards:  43%|████▎     | 3/7 [00:05<00:07,  1.79s/it]
Loading checkpoint shards:  57%|█████▋    | 4/7 [00:07<00:05,  1.75s/it]
Loading checkpoint shards:  71%|███████▏  | 5/7 [00:08<00:03,  1.75s/it]
Loading checkpoint shards:  86%|████████▌ | 6/7 [00:10<00:01,  1.71s/it]
Loading checkpoint shards: 100%|██████████| 7/7 [00:11<00:00,  1.45s/it]
Loading checkpoint shards: 100%|██████████| 7/7 [00:11<00:00,  1.63s/it]
2023-11-27T18:22:32+0800 [ERROR] [runner:llm-chatglm-runner:1] An exception occurred while instantiating runner 'llm-chatglm-runner', see details below:
2023-11-27T18:22:32+0800 [ERROR] [runner:llm-chatglm-runner:1] Traceback (most recent call last):
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 307, in init_local
self._set_handle(LocalRunnerRef)
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 150, in _set_handle
runner_handle = handle_class(self, *args, **kwargs)
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner_handle/local.py", line 27, in __init__
self._runnable = runner.runnable_class(**runner.runnable_init_params)  # type: ignore
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/openllm/_runners.py", line 172, in __init__
self.llm, self.config, self.model, self.tokenizer = llm, llm.config, llm.model, llm.tokenizer
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/openllm/_llm.py", line 326, in tokenizer
if self.__llm_tokenizer__ is None: self.__llm_tokenizer__ = openllm.serialisation.load_tokenizer(self, **self.llm_parameters[-1])
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/openllm/serialisation/__init__.py", line 44, in load_tokenizer
tokenizer = AutoTokenizer.from_pretrained(
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 755, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2024, in from_pretrained
return cls._from_pretrained(
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2256, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/NAS/user/songjie/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 108, in __init__
super().__init__(padding_side=padding_side, clean_up_tokenization_spaces=clean_up_tokenization_spaces,
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 363, in __init__
super().__init__(**kwargs)
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1604, in __init__
super().__init__(**kwargs)
File "/NAS/user/songjie/.conda/envs/vllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 861, in __init__
setattr(self, key, value)
AttributeError: can't set attribute 'eos_token'
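
The last frame of the traceback is a plain setattr call, and Python raises this kind of AttributeError when the target name is a property that has no setter. The sketch below reproduces only that mechanism. Both class names are hypothetical stand-ins, and it is an assumption (not confirmed in this thread) that the ChatGLM tokenizer code exposes eos_token this way.

```python
class BaseTokenizer:
    """Hypothetical stand-in for a base class whose __init__ calls setattr."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            # Same kind of call as the last frame of the traceback.
            setattr(self, key, value)


class ReadOnlyEosTokenizer(BaseTokenizer):
    """Hypothetical subclass exposing eos_token as a property without a setter."""

    @property
    def eos_token(self):
        return "</s>"


def try_init(**kwargs):
    """Return the AttributeError raised during construction, or None."""
    try:
        ReadOnlyEosTokenizer(**kwargs)
    except AttributeError as exc:
        return exc
    return None


# Passing eos_token as a keyword argument hits the read-only property.
error = try_init(eos_token="</s>")
print(type(error).__name__, error)
```

The exact wording of the message depends on the Python version, so the sketch only shows that an AttributeError is raised. Keyword arguments that do not collide with a read-only property are set normally.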

Environment

Driver Version: 535.129.03
cuda-python 12.3.0 pypi_0 pypi
cudatoolkit 11.8.0 h6a678d5_0
python 3.10.13 h955ad1f_0
openllm 0.4.31 pypi_0 pypi
openllm-client 0.4.31 pypi_0 pypi
openllm-core 0.4.31 pypi_0 pypi

vllm 0.2.2
torch 2.1.0

System information (Optional)

No response

waltcow commented 10 months ago

Running the chatglm3-6b model fails with the following error log:

File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/starlette/routing.py", line 705, in lifespan
async with self.lifespan_context(app) as maybe_state:
File "/root/miniconda3/envs/openllm/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/bentoml/_internal/server/base_app.py", line 75, in lifespan
on_startup()
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 317, in init_local
raise e
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 307, in init_local
self._set_handle(LocalRunnerRef)
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 150, in _set_handle
runner_handle = handle_class(self, *args, **kwargs)
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/bentoml/_internal/runner/runner_handle/local.py", line 27, in __init__
self._runnable = runner.runnable_class(**runner.runnable_init_params)  # type: ignore
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/openllm/_runners.py", line 172, in __init__
self.llm, self.config, self.model, self.tokenizer = llm, llm.config, llm.model, llm.tokenizer
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/openllm/_llm.py", line 326, in tokenizer
if self.__llm_tokenizer__ is None: self.__llm_tokenizer__ = openllm.serialisation.load_tokenizer(self, **self.llm_parameters[-1])
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/openllm/serialisation/__init__.py", line 44, in load_tokenizer
tokenizer = AutoTokenizer.from_pretrained(
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 755, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2024, in from_pretrained
return cls._from_pretrained(
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2256, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 108, in __init__
super().__init__(padding_side=padding_side, clean_up_tokenization_spaces=clean_up_tokenization_spaces,
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 363, in __init__
super().__init__(**kwargs)
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1604, in __init__
super().__init__(**kwargs)
File "/root/miniconda3/envs/openllm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 861, in __init__
setattr(self, key, value)
AttributeError: can't set attribute 'eos_token'

2023-11-29T18:25:36+0800 [ERROR] [runner:llm-chatglm-runner:1] Application startup failed. Exiting.

danerlt commented 9 months ago

I have also encountered the same problem.

env:

Fri Jan  5 16:28:57 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.223.02   Driver Version: 470.223.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  Off  | 00000000:61:00.0 Off |                    0 |
| N/A   31C    P0    36W / 250W |  12780MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-PCI...  Off  | 00000000:DB:00.0 Off |                    0 |
| N/A   35C    P0    31W / 250W |      3MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    142549      C   ...nda3/envs/chat/bin/python    12777MiB |
+-----------------------------------------------------------------------------+

python: 3.10 openllm: 0.4.41

My start command is:

cmd="env CUDA_VISIBLE_DEVICES=0,1 OPENBLAS_NUM_THREADS=1 TRUST_REMOTE_CODE=True openllm start /data/models/chatglm3-6b  --backend pt -p 3333"
nohup $cmd > chatglm.log 2>&1 &

pip list:

 pip list
Package                                      Version
-------------------------------------------- ---------------
accelerate                                   0.25.0
aiohttp                                      3.9.1
aioprometheus                                23.12.0
aiosignal                                    1.3.1
annotated-types                              0.6.0
anyio                                        4.2.0
appdirs                                      1.4.4
asgiref                                      3.7.2
asttokens                                    2.4.1
async-timeout                                4.0.3
attrs                                        23.2.0
beautifulsoup4                               4.12.2
bentoml                                      1.1.11
bitsandbytes                                 0.41.3.post2
build                                        0.10.0
cattrs                                       23.1.2
certifi                                      2023.11.17
charset-normalizer                           3.3.2
circus                                       0.18.0
click                                        8.1.7
click-option-group                           0.5.6
cloudpickle                                  3.0.0
cmake                                        3.28.1
coloredlogs                                  15.0.1
contextlib2                                  21.6.0
cpm-kernels                                  1.0.11
cuda-python                                  12.3.0
dashscope                                    1.13.6
dataclasses-json                             0.6.3
datasets                                     2.16.1
decorator                                    5.1.1
deepmerge                                    1.1.1
Deprecated                                   1.2.14
dill                                         0.3.7
distlib                                      0.3.8
distro                                       1.9.0
docarray                                     0.40.0
einops                                       0.7.0
exceptiongroup                               1.2.0
executing                                    2.0.1
fastapi                                      0.108.0
fastcore                                     1.5.29
filelock                                     3.13.1
filetype                                     1.2.0
frozenlist                                   1.4.1
fs                                           2.4.16
fsspec                                       2023.10.0
ghapi                                        1.0.4
greenlet                                     3.0.3
grpcio                                       1.60.0
h11                                          0.14.0
html2text                                    2020.1.16
httpcore                                     1.0.2
httptools                                    0.6.1
httpx                                        0.26.0
huggingface-hub                              0.20.1
humanfriendly                                10.0
idna                                         3.6
importlib-metadata                           6.11.0
inflection                                   0.5.1
ipython                                      8.19.0
jedi                                         0.19.1
Jinja2                                       3.1.2
joblib                                       1.3.2
jsonpatch                                    1.33
jsonpointer                                  2.4
jsonschema                                   4.20.0
jsonschema-specifications                    2023.12.1
langchain                                    0.0.354
langchain-community                          0.0.8
langchain-core                               0.1.6
langsmith                                    0.0.77
lit                                          17.0.6
llama-hub                                    0.0.66
llama-index                                  0.9.25
loguru                                       0.7.2
markdown-it-py                               3.0.0
MarkupSafe                                   2.1.3
marshmallow                                  3.20.1
matplotlib-inline                            0.1.6
mdurl                                        0.1.2
mpmath                                       1.3.0
msgpack                                      1.0.7
multidict                                    6.0.4
multiprocess                                 0.70.15
mypy-extensions                              1.0.0
nest-asyncio                                 1.5.8
networkx                                     3.2.1
ninja                                        1.11.1.1
nltk                                         3.8.1
numpy                                        1.26.3
nvidia-cublas-cu12                           12.1.3.1
nvidia-cuda-cupti-cu12                       12.1.105
nvidia-cuda-nvrtc-cu12                       12.1.105
nvidia-cuda-runtime-cu12                     12.1.105
nvidia-cudnn-cu12                            8.9.2.26
nvidia-cufft-cu12                            11.0.2.54
nvidia-curand-cu12                           10.3.2.106
nvidia-cusolver-cu12                         11.4.5.107
nvidia-cusparse-cu12                         12.1.0.106
nvidia-ml-py                                 11.525.150
nvidia-nccl-cu12                             2.18.1
nvidia-nvjitlink-cu12                        12.3.101
nvidia-nvtx-cu12                             12.1.105
openai                                       1.6.1
openllm                                      0.4.41
openllm-client                               0.4.41
openllm-core                                 0.4.41
opentelemetry-api                            1.20.0
opentelemetry-instrumentation                0.41b0
opentelemetry-instrumentation-aiohttp-client 0.41b0
opentelemetry-instrumentation-asgi           0.41b0
opentelemetry-sdk                            1.20.0
opentelemetry-semantic-conventions           0.41b0
opentelemetry-util-http                      0.41b0
optimum                                      1.16.1
orjson                                       3.9.10
packaging                                    23.2
pandas                                       2.1.4
parso                                        0.8.3
pathspec                                     0.12.1
pexpect                                      4.9.0
pillow                                       10.2.0
pip                                          23.3.2
pip-requirements-parser                      32.0.1
pip-tools                                    7.3.0
platformdirs                                 4.1.0
prometheus-client                            0.19.0
prompt-toolkit                               3.0.43
protobuf                                     4.25.1
psutil                                       5.9.7
ptyprocess                                   0.7.0
pure-eval                                    0.2.2
pyaml                                        23.12.0
pyarrow                                      14.0.2
pyarrow-hotfix                               0.6
pydantic                                     1.10.13
pydantic_core                                2.14.6
Pygments                                     2.17.2
pyparsing                                    3.1.1
pyproject_hooks                              1.0.0
python-dateutil                              2.8.2
python-dotenv                                1.0.0
python-json-logger                           2.0.7
python-multipart                             0.0.6
pytz                                         2023.3.post1
PyYAML                                       6.0.1
pyzmq                                        25.1.2
quantile-python                              1.1
ray                                          2.6.0
referencing                                  0.32.0
regex                                        2023.12.25
requests                                     2.31.0
retrying                                     1.3.4
rich                                         13.7.0
rpds-py                                      0.16.2
safetensors                                  0.4.1
schema                                       0.7.5
scipy                                        1.11.4
sentencepiece                                0.1.99
setuptools                                   69.0.3
simple-di                                    0.1.5
six                                          1.16.0
sniffio                                      1.3.0
soupsieve                                    2.5
SQLAlchemy                                   2.0.25
stack-data                                   0.6.3
starlette                                    0.32.0.post1
sympy                                        1.12
tenacity                                     8.2.3
tiktoken                                     0.5.2
tokenizers                                   0.15.0
tomli                                        2.0.1
torch                                        2.0.1+cu117
torchaudio                                   2.0.2+cu117
torchvision                                  0.15.2+cu117
tornado                                      6.4
tqdm                                         4.66.1
traitlets                                    5.14.1
transformers                                 4.36.2
transformers-stream-generator                0.0.4
triton                                       2.0.0
types-requests                               2.31.0.20231231
typing_extensions                            4.9.0
typing-inspect                               0.9.0
tzdata                                       2023.4
urllib3                                      2.1.0
uvicorn                                      0.25.0
uvloop                                       0.19.0
virtualenv                                   20.25.0
vllm                                         0.2.6
watchfiles                                   0.21.0
wcwidth                                      0.2.12
websockets                                   12.0
wheel                                        0.42.0
wrapt                                        1.16.0
xformers                                     0.0.23.post1
xxhash                                       3.4.1
yarl                                         1.9.4
zipp                                         3.17.0

bojiang commented 2 months ago

Closing for OpenLLM 0.6.