eosphoros-ai / DB-GPT

AI Native Data App Development framework with AWEL(Agentic Workflow Expression Language) and Agents
http://docs.dbgpt.cn
MIT License

Getting error: KeyError: 'Cache only has 0 layers, attempted to access layer with index 0' #1030

Closed · manishparanjape closed this issue 10 months ago

manishparanjape commented 10 months ago

Discussed in https://github.com/orgs/eosphoros-ai/discussions/1026

Originally posted by **manishparanjape** January 4, 2024

Followed these instructions: https://docs.dbgpt.site/docs/installation/model_service/stand_alone

Mac details: Apple M2 Max (64 GB)

```
dbgpt model list
+-----------------+------------+---------------+------+---------+---------+-----------------+----------------------------+
| Model Name      | Model Type | Host          | Port | Healthy | Enabled | Prompt Template | Last Heartbeat             |
+-----------------+------------+---------------+------+---------+---------+-----------------+----------------------------+
| vicuna-13b-v1.5 | llm        | 192.168.86.29 | 6006 | True    | True    |                 | 2024-01-04T10:13:26.627204 |
| WorkerManager   | service    | 192.168.86.29 | 6006 | True    | True    |                 | 2024-01-04T10:13:26.627320 |
+-----------------+------------+---------------+------+---------+---------+-----------------+----------------------------+

dbgpt model chat --model_name vicuna-13b-v1.5
Chatbot started with model vicuna-13b-v1.5. Type 'exit' to leave the chat.

You: Hi
Bot: **LLMServer Generate Error, Please CheckErrorInfo.**: 'Cache only has 0 layers, attempted to access layer with index 0'
```

```
ERROR: dbgpt.model.cluster.worker.default_worker[3654] ERROR Model inference error, detail: Traceback (most recent call last):
  File "/Users/mpla/Projects/text2sql/DB-GPT/dbgpt/model/cluster/worker/default_worker.py", line 154, in generate_stream
    for output in generate_stream_func(
  File "/Users/mpla/anaconda3/envs/dbgpt_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/Users/mpla/anaconda3/envs/dbgpt_env/lib/python3.10/site-packages/fastchat/serve/inference.py", line 132, in generate_stream
    out = model(input_ids=start_ids, use_cache=True)
  File "/Users/mpla/anaconda3/envs/dbgpt_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/mpla/anaconda3/envs/dbgpt_env/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1181, in forward
    outputs = self.model(
  File "/Users/mpla/anaconda3/envs/dbgpt_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/mpla/anaconda3/envs/dbgpt_env/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1068, in forward
    layer_outputs = decoder_layer(
  File "/Users/mpla/anaconda3/envs/dbgpt_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/mpla/anaconda3/envs/dbgpt_env/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 796, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/Users/mpla/anaconda3/envs/dbgpt_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/mpla/Projects/text2sql/DB-GPT/dbgpt/model/llm/monkey_patch.py", line 58, in forward
    kv_seq_len += past_key_value[0].shape[-2]
  File "/Users/mpla/anaconda3/envs/dbgpt_env/lib/python3.10/site-packages/transformers/cache_utils.py", line 78, in __getitem__
    raise KeyError(f"Cache only has {len(self)} layers, attempted to access layer with index {layer_idx}")
KeyError: 'Cache only has 0 layers, attempted to access layer with index 0'
```
fangyinc commented 10 months ago

Hi @manishparanjape, thanks for your feedback; we will fix this later. For now, you can work around it by downgrading `transformers` to a version below 4.35.0:

```
pip uninstall transformers -y && pip install transformers==4.34.1
```
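For context on why the downgrade helps: the traceback ends in DB-GPT's `monkey_patch.py`, which indexes `past_key_value[0]` assuming the legacy tuple-of-tensors KV cache, while `transformers` 4.36 switched to `Cache` objects that raise `KeyError` when indexed while empty. A minimal sketch of a version-tolerant length check (duck-typed so it runs without `transformers`; the helper name `cached_kv_len` is illustrative, not the actual DB-GPT patch):

```python
# Sketch: compute the cached KV sequence length whether the cache is the
# legacy tuple-of-tensors format or a newer Cache-style object
# (transformers>=4.36 exposes Cache.get_seq_length()).
def cached_kv_len(past_key_value, layer_idx=0):
    if past_key_value is None:
        return 0
    # New-style Cache objects expose get_seq_length(); an empty cache
    # returns 0 instead of raising KeyError on indexing.
    if hasattr(past_key_value, "get_seq_length"):
        return past_key_value.get_seq_length(layer_idx)
    # Legacy format: tuple of (key, value) tensors, key shaped [..., seq, dim].
    return past_key_value[layer_idx].shape[-2]
```

With a check like this, `kv_seq_len += cached_kv_len(past_key_value)` would not blow up on an empty new-style cache.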
manishparanjape commented 10 months ago

@fangyinc Thanks for the quick reply. Your suggestion worked, but now I am getting a different error. (Please see below). Is this related?

```
dbgpt model chat --model_name vicuna-13b-v1.5
Chatbot started with model vicuna-13b-v1.5. Type 'exit' to leave the chat.

You: hi
Bot: **LLMServer Generate Error, Please CheckErrorInfo.**: forward() got an unexpected keyword argument 'padding_mask'
```
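This second error is likely related: `transformers` 4.34.x passes a `padding_mask` keyword to the attention `forward`, which the monkey-patched function does not accept. A common defensive fix is to let the patched signature absorb unknown keywords; the sketch below is illustrative (the name `patched_forward` and the echoed return value are assumptions, not the actual DB-GPT code):

```python
# Sketch: a monkey-patched forward that tolerates keyword arguments added
# by newer transformers callers (e.g. padding_mask in 4.34.x).
def patched_forward(self, hidden_states, attention_mask=None,
                    position_ids=None, past_key_value=None,
                    output_attentions=False, use_cache=False, **kwargs):
    # **kwargs absorbs extra arguments instead of raising
    # "forward() got an unexpected keyword argument 'padding_mask'".
    # ... the real attention computation would go here; the sketch
    # just echoes its input so it can run standalone.
    return hidden_states
```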
fangyinc commented 10 months ago

@manishparanjape I already fixed this in #1033.