THUDM / ChatGLM3

ChatGLM3 series: Open Bilingual Chat LLMs | 开源双语对话语言模型
Apache License 2.0

api_server.py cannot handle concurrent requests with uvicorn.run(app, host='0.0.0.0', port=8000, workers=1) #1163

Closed: caijx168 closed this issue 4 months ago

caijx168 commented 4 months ago

System Info

NVIDIA-SMI 550.67, Driver Version: 550.67, CUDA Version: 12.4
transformers 4.39.1

Who can help?

No response

Information

Reproduction

The stock api_server.py cannot serve concurrent requests; it starts the server with `uvicorn.run(app, host='0.0.0.0', port=8000, workers=1)`. Advice found online says that raising `workers` improves concurrency, so I tried `uvicorn.run("api_server:app", host='0.0.0.0', port=8000, workers=4)`. The server starts normally, but inference then fails as follows:

```
(ChatGLM) root@root1-System-Product-Name:/home/ChatGLM/ChatGLM3/openai_api_demo# python api_server.py
Setting eos_token is not supported, use the default one.
Setting pad_token is not supported, use the default one.
Setting unk_token is not supported, use the default one.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:01<00:00, 5.82it/s]
No sentence-transformers model found with name /home/ChatGLM/M3E-large. Creating a new one with MEAN pooling.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO:     Started parent process [18764]
INFO:     Started server process [18859]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Started server process [18856]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Started server process [18858]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Started server process [18857]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
2024-04-23 09:47:30.801 | DEBUG | api_server:create_chat_completion:239 - ==== request ====
{'messages': [ChatMessage(role='user', content='你好', name=None, function_call=None), ChatMessage(role='assistant', content='您好,我是 HitoGPT,一个由海投打造的人工智能助手。请问有什么我可以帮您的吗?', name=None, function_call=None), ChatMessage(role='user', content='你好', name=None, function_call=None)], 'temperature': 0.01, 'top_p': 0.8, 'max_tokens': 4000, 'echo': False, 'stream': True, 'repetition_penalty': 1.1, 'tools': None}
INFO:     172.18.0.2:33374 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
    return await self.app(scope, receive, send)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/routing.py", line 758, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/routing.py", line 778, in app
    await route.handle(scope, receive, send)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/routing.py", line 299, in handle
    await self.app(scope, receive, send)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/routing.py", line 79, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/starlette/routing.py", line 74, in app
    response = await func(request)
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
  File "/root/anaconda3/envs/ChatGLM/lib/python3.10/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/home/ChatGLM/ChatGLM3/openai_api_demo/api_server.py", line 245, in create_chat_completion
    output = next(predict_stream_generator)
  File "/home/ChatGLM/ChatGLM3/openai_api_demo/api_server.py", line 421, in predict_stream
    for new_response in generate_stream_chatglm3(model, tokenizer, gen_params):
NameError: name 'model' is not defined
```

A second request moments later (from 172.18.0.2:33386) fails with the identical traceback and the same `NameError: name 'model' is not defined`.
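This error is consistent with how multi-worker uvicorn works: given `workers=4` and an import string (`"api_server:app"`), uvicorn spawns fresh worker processes that each re-import the module, so anything created only under `if __name__ == "__main__":` (in the demo, the `model` and `tokenizer` globals) never exists inside the workers, hence the `NameError`. A minimal sketch of one possible fix, not the official demo code, is to load the model in a FastAPI lifespan hook so each worker builds its own copy; `MODEL_PATH` here is a hypothetical placeholder:

```python
# Sketch only, assuming the demo's global `model`/`tokenizer` names.
# MODEL_PATH is a hypothetical placeholder, not from the original demo.
from contextlib import asynccontextmanager

import uvicorn
from fastapi import FastAPI
from transformers import AutoModel, AutoTokenizer

MODEL_PATH = "/path/to/chatglm3-6b"

model = None
tokenizer = None


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Runs once per worker process, so every worker loads its own model.
    global model, tokenizer
    tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
    model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).cuda().eval()
    yield


app = FastAPI(lifespan=lifespan)


if __name__ == "__main__":
    # The import string (not the app object) is required when workers > 1.
    uvicorn.run("api_server:app", host="0.0.0.0", port=8000, workers=4)
```

Note the trade-off: each worker then keeps a full copy of the model weights in GPU memory, so four workers need roughly four times the VRAM of one.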

Expected behavior

I hope the maintainers can publish a demo script that supports concurrent access.

zRzRzRzRzRzRzR commented 4 months ago

Concurrency isn't implemented, because the backend dispatches requests sequentially.
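In practice this means the demo holds one model instance per process and generates from it one request at a time. A minimal sketch, not part of the official demo, of making that queueing explicit so overlapping requests wait instead of interleaving (`generate_serialized` is a hypothetical wrapper, and it buffers the whole stream for simplicity, giving up token streaming):

```python
# Sketch: serialize access to the per-process model with an asyncio.Lock.
# `generate_stream_chatglm3` is the demo's generator; this wrapper is hypothetical.
import asyncio

_generation_lock = asyncio.Lock()


async def generate_serialized(model, tokenizer, gen_params):
    # Only one request generates at a time; others await the lock here.
    async with _generation_lock:
        loop = asyncio.get_running_loop()
        # Run the blocking generator in a worker thread so the event loop
        # stays responsive; list() drains the full stream before returning.
        return await loop.run_in_executor(
            None, lambda: list(generate_stream_chatglm3(model, tokenizer, gen_params))
        )
```

With a guard like this, concurrent clients no longer crash or interleave; they simply queue, which matches the sequential behavior the maintainer describes.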