Open balcklive opened 1 year ago
Can you try downgrading transformers to 4.21.0?
Can you try downgrading transformers to 4.21.0?

I tried, another bug happened:

Traceback (most recent call last):
  File "/home/ubuntu/./ColossalAI/applications/Chat/inference/server.py", line 10, in <module>
    from llama_gptq import load_quant
  File "/home/ubuntu/ColossalAI/applications/Chat/inference/llama_gptq/__init__.py", line 1, in <module>
    from .loader import load_quant
  File "/home/ubuntu/ColossalAI/applications/Chat/inference/llama_gptq/loader.py", line 4, in <module>
    from transformers import LlamaConfig, LlamaForCausalLM
ImportError: cannot import name 'LlamaConfig' from 'transformers' (/opt/conda/envs/pytorch/lib/python3.9/site-packages/transformers/__init__.py)

same problem
see https://github.com/underlines/awesome-marketing-datascience/issues/2
Try pip uninstall transformers, then install the LLaMA branch with pip install git+https://github.com/zphang/transformers.git@llama_push
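To confirm the reinstall actually picked up a transformers build that ships the LLaMA classes (the missing import that caused the error above), a quick generic check can be run before restarting the server. This is a sketch, not project code; the attribute names come from the ImportError in this thread.

```python
import importlib

def module_has_attrs(name, attrs):
    """True if module `name` imports cleanly and exposes every attribute in `attrs`."""
    try:
        mod = importlib.import_module(name)
    except ImportError:
        return False
    return all(hasattr(mod, a) for a in attrs)

# The inference code needs a transformers build with the LLaMA classes:
print("LLaMA support:", module_has_attrs("transformers", ["LlamaConfig", "LlamaForCausalLM"]))
```

If this prints False, the server will fail with the same ImportError as above.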
Has the problem been fixed?
Try pip uninstall transformers and then pip install git+https://github.com/zphang/transformers.git@llama_push
This method fixed the issue.
🐛 Describe the bug
I run my server with this:

python3 ./ColossalAI/applications/Chat/inference/server.py /home/ubuntu/modelpath/llama-7b/llama-7b/ --quant 8bit --http_host 0.0.0.0 --http_port 8080
then I call the API with this:

import requests

data = {"history": [{"instruction": "where is the capital of USA", "response": ""}], "max_new_tokens": 150, "top_k": 30, "top_p": 0.5, "temperature": 0.6}
response = requests.post("http://localhost:8080/generate/stream", json=data)
print(response.text)
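Since the endpoint is a /stream route served via sse-starlette, reading response.text in one go discards the incremental behavior. A sketch of consuming it line by line follows; the "data: ..." SSE framing is an assumption about how the server serializes events, and the URL/payload are the ones from this report.

```python
def parse_sse_line(line):
    """Return the payload of an SSE 'data:' line, or None for comments/blank lines."""
    if line.startswith("data:"):
        return line[len("data:"):].strip()
    return None

def stream_generate(url, payload):
    """Yield each SSE data payload from a streaming generate endpoint."""
    import requests  # deferred import so the parser above works without requests installed

    # stream=True makes requests yield the body incrementally instead of buffering it
    with requests.post(url, json=payload, stream=True) as response:
        for raw in response.iter_lines(decode_unicode=True):
            data = parse_sse_line(raw or "")
            if data is not None:
                yield data

# Example of the framing the parser expects:
print(parse_sse_line('data: {"response": "Washington"}'))
```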
As a result I got this:

Traceback (most recent call last):
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 429, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/cors.py", line 84, in __call__
    await self.app(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/routing.py", line 237, in app
    raw_response = await run_endpoint_function(
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/routing.py", line 165, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/slowapi/extension.py", line 762, in sync_wrapper
    response = func(*args, **kwargs)
  File "/home/ubuntu/./ColossalAI/applications/Chat/inference/server.py", line 118, in generate_no_stream
    output = model.generate(inputs, **data.dict(exclude={'history'}))
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/transformers/generation/utils.py", line 1231, in generate
    self._validate_model_kwargs(model_kwargs.copy())
  File "/opt/conda/envs/deploy/lib/python3.9/site-packages/transformers/generation/utils.py", line 1109, in _validate_model_kwargs
    raise ValueError(
ValueError: The following model_kwargs are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)

Does anybody know how to resolve this?
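A common workaround for this particular ValueError, sketched below, is to drop token_type_ids from the tokenizer output before it reaches model.generate(), since LLaMA models do not accept that argument. This assumes server.py passes the raw tokenizer dict into generate(); it is not the project's confirmed fix, and the helper name and sample dict are hypothetical.

```python
def strip_unused_kwargs(encoded, unused=("token_type_ids",)):
    """Return a copy of a tokenizer-output dict without keys the model rejects."""
    return {k: v for k, v in encoded.items() if k not in unused}

# Hypothetical tokenizer output, for illustration only:
encoded = {"input_ids": [[1, 2, 3]], "attention_mask": [[1, 1, 1]], "token_type_ids": [[0, 0, 0]]}
inputs = strip_unused_kwargs(encoded)
print(sorted(inputs))  # ['attention_mask', 'input_ids']
```

The filtered dict can then be passed to generate() without tripping _validate_model_kwargs.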
Environment
My package versions are as below:

Package                   Version
accelerate                0.18.0
anyio                     3.6.2
bitsandbytes              0.37.2
Brotli                    1.0.9
certifi                   2022.12.7
charset-normalizer        3.1.0
click                     8.1.3
cmake                     3.26.1
ConfigArgParse            1.5.3
Deprecated                1.2.13
fastapi                   0.95.0
filelock                  3.11.0
Flask                     2.2.3
Flask-BasicAuth           0.2.0
Flask-Cors                3.0.10
gevent                    22.10.2
geventhttpclient          2.0.9
greenlet                  2.0.2
h11                       0.14.0
huggingface-hub           0.13.4
idna                      3.4
importlib-metadata        6.2.0
importlib-resources       5.12.0
itsdangerous              2.1.2
jieba                     0.42.1
Jinja2                    3.1.2
limits                    3.3.1
lit                       16.0.0
locust                    2.15.1
MarkupSafe                2.1.2
mpmath                    1.3.0
msgpack                   1.0.5
networkx                  3.1
numpy                     1.24.2
nvidia-cublas-cu11        11.10.3.66
nvidia-cuda-cupti-cu11    11.7.101
nvidia-cuda-nvrtc-cu11    11.7.99
nvidia-cuda-runtime-cu11  11.7.99
nvidia-cudnn-cu11         8.5.0.96
nvidia-cufft-cu11         10.9.0.58
nvidia-curand-cu11        10.2.10.91
nvidia-cusolver-cu11      11.4.0.1
nvidia-cusparse-cu11      11.7.4.91
nvidia-nccl-cu11          2.14.3
nvidia-nvtx-cu11          11.7.91
packaging                 23.0
pip                       23.0.1
protobuf                  4.22.1
psutil                    5.9.4
pydantic                  1.10.7
PyYAML                    6.0
pyzmq                     25.0.2
regex                     2023.3.23
requests                  2.28.2
roundrobin                0.0.4
safetensors               0.3.0
sentencepiece             0.1.97
setuptools                67.6.1
six                       1.16.0
slowapi                   0.1.8
sniffio                   1.3.0
sse-starlette             1.3.3
starlette                 0.26.1
sympy                     1.11.1
tokenizers                0.13.3
torch                     2.0.0
tqdm                      4.65.0
transformers              4.28.0.dev0
triton                    2.0.0
typing_extensions         4.5.0
urllib3                   1.26.15
uvicorn                   0.21.1
Werkzeug                  2.2.3
wheel                     0.40.0
wrapt                     1.15.0
zipp                      3.15.0
zope.event                4.6
zope.interface            6.0