ymcui / Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models + local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki
Apache License 2.0

Error report #880

Closed: mszjaas closed this issue 5 months ago

mszjaas commented 6 months ago

The following items must be checked before submission

Issue type

Model inference

Base model

Alpaca-7B

Operating system

Linux

Detailed description of the problem

The model was merged without problems; running the command below to start the server raises an error:

python3 /home/mszjaas/lib/Chinese-LLaMA-Alpaca/scripts/openai_server_demo/openai_api_server.py --base_model ~/models/7b_model --only_cpu 
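For context, the 500 in the log below is triggered by a chat-completions request. A minimal reproduction sketch, assuming the server's default port 19327 (from the Uvicorn log) and an OpenAI-style payload; the exact field names are my assumption, not part of the original report:

    # Hypothetical client call reproducing the failing request; assumes the
    # demo server listens on 127.0.0.1:19327 and accepts an OpenAI-style body.
    import requests

    resp = requests.post(
        "http://127.0.0.1:19327/v1/chat/completions",
        json={"messages": [{"role": "user", "content": "你好"}]},
        timeout=300,
    )
    print(resp.status_code)  # 500 once the server hits the KeyError shown below
    print(resp.text)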

Dependencies (must be provided for code-related issues)

# Please paste your dependency information here
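A stdlib-only sketch for collecting the versions that likely matter here; the package list is my guess, since the traceback below implicates torch and transformers:

    # Prints versions of the packages most likely relevant to this bug;
    # the package selection is an assumption, not from the issue template.
    from importlib.metadata import version, PackageNotFoundError

    for pkg in ("torch", "transformers", "peft", "fastapi", "uvicorn", "xformers"):
        try:
            print(pkg, version(pkg))
        except PackageNotFoundError:
            print(pkg, "not installed")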

Run logs or screenshots

[mszjaas@n1 Chinese-LLaMA-Alpaca]$ python3 scripts/openai_server_demo/openai_api_server.py --base_model /home/mszjaas/models/alpaca-combined-hf --only_cpu
Xformers is not installed correctly. If you want to use memory_efficient_attention use the following command to install Xformers
pip install xformers.
USE_MEM_EFF_ATTENTION: False
STORE_KV_BEFORE_ROPE: False
Apply NTK scaling with ALPHA=1.0
/home/mszjaas/.local/lib/python3.11/site-packages/torch/cuda/__init__.py:107: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 10010). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)
  return torch._C._cuda_getDeviceCount() > 0
Loading checkpoint shards: 100%|██████████████████████████████████████| 3/3 [03:19<00:00, 66.37s/it]
Vocab of the base model: 49953
Vocab of the tokenizer: 49953
2024-01-11 20:12:12,399 - INFO - Started server process [29923]
2024-01-11 20:12:12,630 - INFO - Waiting for application startup.
2024-01-11 20:12:12,631 - INFO - Application startup complete.
2024-01-11 20:12:12,632 - INFO - Uvicorn running on http://0.0.0.0:19327 (Press CTRL+C to quit)
2024-01-11 20:15:01,296 - INFO - 127.0.0.1:40768 - "POST /v1/chat/completions HTTP/1.1" 500
2024-01-11 20:15:01,297 - ERROR - Exception in ASGI application
Traceback (most recent call last):
  File "/home/mszjaas/.local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/mszjaas/.local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/applications.py", line 116, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 55, in wrapped_app
    raise exc
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 44, in wrapped_app
    await app(scope, receive, sender)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/routing.py", line 746, in __call__
    await route.handle(scope, receive, send)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/routing.py", line 75, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 55, in wrapped_app
    raise exc
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 44, in wrapped_app
    await app(scope, receive, sender)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/starlette/routing.py", line 70, in app
    response = await func(request)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/fastapi/routing.py", line 299, in app
    raise e
  File "/home/mszjaas/.local/lib/python3.11/site-packages/fastapi/routing.py", line 294, in app
    raw_response = await run_endpoint_function(
  File "/home/mszjaas/.local/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/home/mszjaas/lib/Chinese-LLaMA-Alpaca/scripts/openai_server_demo/openai_api_server.py", line 182, in create_chat_completion
    output = predict(
  File "/home/mszjaas/lib/Chinese-LLaMA-Alpaca/scripts/openai_server_demo/openai_api_server.py", line 136, in predict
    generation_output = model.generate(
  File "/home/mszjaas/.local/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/transformers/generation/utils.py", line 1764, in generate
    return self.sample(
  File "/home/mszjaas/.local/lib/python3.11/site-packages/transformers/generation/utils.py", line 2861, in sample
    outputs = self(
  File "/home/mszjaas/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 1181, in forward
    outputs = self.model(
  File "/home/mszjaas/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 1068, in forward
    layer_outputs = decoder_layer(
  File "/home/mszjaas/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/mszjaas/.local/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 796, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/mszjaas/.local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/mszjaas/lib/Chinese-LLaMA-Alpaca/scripts/openai_server_demo/patches.py", line 49, in xformers_forward
    kv_seq_len += past_key_value[0].shape[-2]
  File "/home/mszjaas/.local/lib/python3.11/site-packages/transformers/cache_utils.py", line 78, in __getitem__
    raise KeyError(f"Cache only has {len(self)} layers, attempted to access layer with index {layer_idx}")
KeyError: 'Cache only has 0 layers, attempted to access layer with index 0'
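
The KeyError is raised by transformers' cache_utils.py, which was introduced in transformers 4.36: since that release, LlamaAttention receives past_key_value as a Cache object, while the repo's xformers patch (patches.py, line 49 in the traceback) still indexes it as a legacy (key, value) tuple, and indexing an empty DynamicCache fails on the very first layer. Two plausible fixes, neither confirmed by the maintainers here: pin transformers below 4.36 (pip install 'transformers<4.36'), or adapt the patch to the new cache API. A minimal sketch of the latter, assuming transformers >= 4.36 and that self.layer_idx is populated on the attention module (untested):

    # Hypothetical replacement for the failing line in
    # scripts/openai_server_demo/patches.py (xformers_forward); a sketch,
    # not the maintainers' official fix.
    from transformers.cache_utils import Cache

    if past_key_value is not None:
        if isinstance(past_key_value, Cache):
            # New-style cache: an empty DynamicCache returns 0 here instead
            # of raising KeyError the way tuple-style indexing does.
            kv_seq_len += past_key_value.get_seq_length(self.layer_idx)
        else:
            # Legacy tuple cache: past_key_value[0] is the cached key tensor,
            # whose second-to-last dimension is the sequence length.
            kv_seq_len += past_key_value[0].shape[-2]

Downgrading avoids touching the patch at all and matches the older transformers versions these demo scripts were written against (worth verifying against the repo's requirements.txt).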
github-actions[bot] commented 6 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.

github-actions[bot] commented 5 months ago

Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.