Open balcklive opened 1 year ago
Can you try downgrading transformers to 4.21.0?
Can you try downgrading transformers to 4.21.0?

I tried, another bug happened:

Traceback (most recent call last):
  File "/home/ubuntu/./ColossalAI/applications/Chat/inference/server.py", line 10, in <module>
    from llama_gptq import load_quant
  File "/home/ubuntu/ColossalAI/applications/Chat/inference/llama_gptq/__init__.py", line 1, in <module>
    from .loader import load_quant
  File "/home/ubuntu/ColossalAI/applications/Chat/inference/llama_gptq/loader.py", line 4, in <module>
    from transformers import LlamaConfig, LlamaForCausalLM
ImportError: cannot import name 'LlamaConfig' from 'transformers' (/opt/conda/envs/pytorch/lib/python3.9/site-packages/transformers/__init__.py)

same problem
see https://github.com/underlines/awesome-marketing-datascience/issues/2
Try pip uninstall transformers, then install the LLaMA branch with pip install git+https://github.com/zphang/transformers.git@llama_push
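To confirm the reinstall actually picked up a transformers build that ships the LLaMA classes (the missing import that caused the error above), a quick generic check can be run before restarting the server. This is a sketch, not project code; the attribute names come from the ImportError in this thread.

```python
import importlib

def module_has_attrs(name, attrs):
    """True if module `name` imports cleanly and exposes every attribute in `attrs`."""
    try:
        mod = importlib.import_module(name)
    except ImportError:
        return False
    return all(hasattr(mod, a) for a in attrs)

# The inference code needs a transformers build with the LLaMA classes:
print("LLaMA support:", module_has_attrs("transformers", ["LlamaConfig", "LlamaForCausalLM"]))
```

If this prints False, the server will fail with the same ImportError as above.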
Has the problem been fixed?
Try pip uninstall transformers and then pip install git+https://github.com/zphang/transformers.git@llama_push
This method fixed the issue.
🐛 Describe the bug
I run my server with this:

python3 ./ColossalAI/applications/Chat/inference/server.py /home/ubuntu/modelpath/llama-7b/llama-7b/ --quant 8bit --http_host 0.0.0.0 --http_port 8080
then I call the API with this:

import requests

data = {"history": [{"instruction": "where is the capital of USA", "response": ""}], "max_new_tokens": 150, "top_k": 30, "top_p": 0.5, "temperature": 0.6}
response = requests.post("http://localhost:8080/generate/stream", json=data)
print(response.text)
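Since the endpoint is a /stream route served via sse-starlette, reading response.text in one go discards the incremental behavior. A sketch of consuming it line by line follows; the "data: ..." SSE framing is an assumption about how the server serializes events, and the URL/payload are the ones from this report.

```python
def parse_sse_line(line):
    """Return the payload of an SSE 'data:' line, or None for comments/blank lines."""
    if line.startswith("data:"):
        return line[len("data:"):].strip()
    return None

def stream_generate(url, payload):
    """Yield each SSE data payload from a streaming generate endpoint."""
    import requests  # deferred import so the parser above works without requests installed

    # stream=True makes requests yield the body incrementally instead of buffering it
    with requests.post(url, json=payload, stream=True) as response:
        for raw in response.iter_lines(decode_unicode=True):
            data = parse_sse_line(raw or "")
            if data is not None:
                yield data

# Example of the framing the parser expects:
print(parse_sse_line('data: {"response": "Washington"}'))
```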
As a result I got this:

Traceback (most recent call last):
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 429, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/cors.py", line 84, in __call__
    await self.app(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/routing.py", line 237, in app
    raw_response = await run_endpoint_function(
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/routing.py", line 165, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/slowapi/extension.py", line 762, in sync_wrapper
    response = func(*args, **kwargs)
  File "/home/ubuntu/./ColossalAI/applications/Chat/inference/server.py", line 118, in generate_no_stream
    output = model.generate(inputs, **data.dict(exclude={'history'}))
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/transformers/generation/utils.py", line 1231, in generate
    self._validate_model_kwargs(model_kwargs.copy())
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/transformers/generation/utils.py", line 1109, in _validate_model_kwargs
    raise ValueError(
ValueError: The following model_kwargs are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)

Does anybody know how to resolve this?
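A common workaround for this particular ValueError, sketched below, is to drop token_type_ids from the tokenizer output before it reaches model.generate(), since LLaMA models do not accept that argument. This assumes server.py passes the raw tokenizer dict into generate(); it is not the project's confirmed fix, and the helper name and sample dict are hypothetical.

```python
def strip_unused_kwargs(encoded, unused=("token_type_ids",)):
    """Return a copy of a tokenizer-output dict without keys the model rejects."""
    return {k: v for k, v in encoded.items() if k not in unused}

# Hypothetical tokenizer output, for illustration only:
encoded = {"input_ids": [[1, 2, 3]], "attention_mask": [[1, 1, 1]], "token_type_ids": [[0, 0, 0]]}
inputs = strip_unused_kwargs(encoded)
print(sorted(inputs))  # ['attention_mask', 'input_ids']
```

The filtered dict can then be passed to generate() without tripping _validate_model_kwargs.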
Environment
My package versions are as below:

Package                   Version
accelerate                0.18.0
anyio                     3.6.2
bitsandbytes              0.37.2
Brotli                    1.0.9
certifi                   2022.12.7
charset-normalizer        3.1.0
click                     8.1.3
cmake                     3.26.1
ConfigArgParse            1.5.3
Deprecated                1.2.13
fastapi                   0.95.0
filelock                  3.11.0
Flask                     2.2.3
Flask-BasicAuth           0.2.0
Flask-Cors                3.0.10
gevent                    22.10.2
geventhttpclient          2.0.9
greenlet                  2.0.2
h11                       0.14.0
huggingface-hub           0.13.4
idna                      3.4
importlib-metadata        6.2.0
importlib-resources       5.12.0
itsdangerous              2.1.2
jieba                     0.42.1
Jinja2                    3.1.2
limits                    3.3.1
lit                       16.0.0
locust                    2.15.1
MarkupSafe                2.1.2
mpmath                    1.3.0
msgpack                   1.0.5
networkx                  3.1
numpy                     1.24.2
nvidia-cublas-cu11        11.10.3.66
nvidia-cuda-cupti-cu11    11.7.101
nvidia-cuda-nvrtc-cu11    11.7.99
nvidia-cuda-runtime-cu11  11.7.99
nvidia-cudnn-cu11         8.5.0.96
nvidia-cufft-cu11         10.9.0.58
nvidia-curand-cu11        10.2.10.91
nvidia-cusolver-cu11      11.4.0.1
nvidia-cusparse-cu11      11.7.4.91
nvidia-nccl-cu11          2.14.3
nvidia-nvtx-cu11          11.7.91
packaging                 23.0
pip                       23.0.1
protobuf                  4.22.1
psutil                    5.9.4
pydantic                  1.10.7
PyYAML                    6.0
pyzmq                     25.0.2
regex                     2023.3.23
requests                  2.28.2
roundrobin                0.0.4
safetensors               0.3.0
sentencepiece             0.1.97
setuptools                67.6.1
six                       1.16.0
slowapi                   0.1.8
sniffio                   1.3.0
sse-starlette             1.3.3
starlette                 0.26.1
sympy                     1.11.1
tokenizers                0.13.3
torch                     2.0.0
tqdm                      4.65.0
transformers              4.28.0.dev0
triton                    2.0.0
typing_extensions         4.5.0
urllib3                   1.26.15
uvicorn                   0.21.1
Werkzeug                  2.2.3
wheel                     0.40.0
wrapt                     1.15.0
zipp                      3.15.0
zope.event                4.6
zope.interface            6.0