QwenLM / Qwen-Agent

Agent framework and applications built upon Qwen2, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
https://pypi.org/project/qwen-agent/

AttributeError: 'ChatCompletionResponse' object has no attribute 'model_dump_json' #22

Closed. jmanhype closed this issue 2 months ago.

jmanhype commented 11 months ago

ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app( # type: ignore[func-returns-value]
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/routing.py", line 69, in app
    await response(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/sse_starlette/sse.py", line 233, in __call__
    async with anyio.create_task_group() as task_group:
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 597, in __aexit__
    raise exceptions[0]
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/sse_starlette/sse.py", line 236, in wrap
    await func()
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/sse_starlette/sse.py", line 221, in stream_response
    async for data in self.body_iterator:
  File "/home/batman/dev/test1/Qwen/openai_api.py", line 419, in predict
    yield "{}".format(chunk.model_dump_json(exclude_unset=True))
AttributeError: 'ChatCompletionResponse' object has no attribute 'model_dump_json'

jmanhype commented 11 months ago

Also, this error seems to pop up:

ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app( # type: ignore[func-returns-value]
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/routing.py", line 69, in app
    await response(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/sse_starlette/sse.py", line 233, in __call__
    async with anyio.create_task_group() as task_group:
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 597, in __aexit__
    raise exceptions[0]
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/sse_starlette/sse.py", line 236, in wrap
    await func()
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/sse_starlette/sse.py", line 221, in stream_response
    async for data in self.body_iterator:
  File "/home/batman/dev/test1/Qwen/openai_api.py", line 432, in predict
    for new_response in response_generator:
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 1273, in stream_generator
    for token in self.generate_stream(
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/transformers_stream_generator/main.py", line 931, in sample_stream
    outputs = self(
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 1108, in forward
    transformer_outputs = self.transformer(
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 938, in forward
    outputs = block(
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 639, in forward
    attn_outputs = self.attn(
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 564, in forward
    attn_output, attn_weight = self._attn(
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 326, in _attn
    attn_weights = attn_weights / torch.full(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.11 GiB (GPU 0; 11.73 GiB total capacity; 9.42 GiB already allocated; 819.75 MiB free; 10.68 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

JianxinMa commented 11 months ago

AttributeError: 'ChatCompletionResponse' object has no attribute 'model_dump_json'

Regarding the first error, please check if pip install "pydantic>=2.3.0" helps. Remember to include the double quotes.
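
For background, model_dump_json() is a Pydantic v2 method; Pydantic v1 models expose .json() instead, which is why an older pydantic triggers this AttributeError at the yield in openai_api.py. If upgrading is not an option, a version-agnostic fallback could look roughly like the sketch below (the helper name dump_json is illustrative, not part of the repo):

```python
from pydantic import BaseModel

def dump_json(model: BaseModel) -> str:
    """Serialize a pydantic model to JSON across Pydantic v1 and v2."""
    if hasattr(model, "model_dump_json"):
        # Pydantic v2 API, as used by the current openai_api.py
        return model.model_dump_json(exclude_unset=True)
    # Pydantic v1 fallback
    return model.json(exclude_unset=True)

# Inside the streaming loop, instead of chunk.model_dump_json(exclude_unset=True):
# yield "{}".format(dump_json(chunk))
```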

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.11 GiB (GPU 0; 11.73 GiB total capacity; 9.42 GiB already allocated; 819.75 MiB free; 10.68 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

As for the second issue, Qwen-7B-Chat can consume around 14GB VRAM when handling a sequence of length 8192. Try reducing the sequence length by specifying python run_server.py --max_ref_token 1000.
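
In addition to lowering --max_ref_token, the OOM message itself points at PYTORCH_CUDA_ALLOC_CONF. A minimal sketch of setting it before the model is loaded (the 128 MB split size is only an example value, not a tested recommendation):

```python
import os

# Must be set before CUDA is initialized, i.e. before the model is loaded.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

import torch  # imported only after the allocator config is in place
```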

jmanhype commented 11 months ago

Thank you for the quick response again. Would StreamingLLM help with the memory issue? I understand the ~14 GB requirement, but would this framework benefit from implementing https://github.com/mit-han-lab/streaming-llm?

JianxinMa commented 11 months ago

I believe so. We are working on streaming LLM, though it may take some time. Please stay tuned.
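
For context, the linked StreamingLLM work bounds KV-cache memory by keeping a few initial "attention sink" tokens plus a sliding window of recent tokens and evicting everything in between, so memory stays roughly constant however long the stream runs. A conceptual sketch of that eviction policy (illustrative only, not Qwen-Agent code):

```python
def kept_positions(cache_len: int, num_sinks: int = 4, window: int = 1024) -> list[int]:
    """Positions to keep in the KV cache under a StreamingLLM-style policy."""
    if cache_len <= num_sinks + window:
        return list(range(cache_len))                       # nothing to evict yet
    sinks = list(range(num_sinks))                          # always keep the first tokens
    recent = list(range(cache_len - window, cache_len))     # plus the most recent window
    return sinks + recent
```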

jmanhype commented 11 months ago

Thank you. What about this? https://x.com/arankomatsuzaki/status/1711401381247242683?s=20

jmanhype commented 11 months ago

[Screenshot attached: Screenshot_20231009-111810.png]