tloen / alpaca-lora

Instruct-tune LLaMA on consumer hardware
Apache License 2.0
18.68k stars · 2.22k forks

RuntimeError: shape '[32, 2, 64, 4096]' is invalid for input of size 26214400 #590

Closed · yourtiger closed this issue 1 year ago

yourtiger commented 1 year ago

V100 32G GPU, CUDA 1.8, Python 3.10, Llama 13B

I successfully executed:

BASE_MODEL='/home/model' python3 export_hf_checkpoint.py

Loading checkpoint shards: 100%|██████████████████| 3/3 [00:18<00:00, 6.07s/it]
/usr/local/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:362: UserWarning: do_sample is set to False. However, temperature is set to 0.9 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset temperature. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
  warnings.warn(
/usr/local/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:367: UserWarning: do_sample is set to False. However, top_p is set to 0.6 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset top_p. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
  warnings.warn(
/usr/local/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:508: UserWarning: The generation config instance is invalid -- .validate() throws warnings and/or exceptions. Fix these issues to save the configuration. This warning will be raised to an exception in v4.34.
Thrown during validation: do_sample is set to False. However, temperature is set to 0.9 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset temperature.
  warnings.warn(

(screenshots attached in the original issue)
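These UserWarnings are separate from the crash below: they come from the model's saved generation_config.json, which sets temperature and top_p while do_sample is False. A minimal sketch to clean that up, assuming the config sits alongside the model at /home/model:

```python
from transformers import GenerationConfig

# Load the generation config that triggers the warnings above.
cfg = GenerationConfig.from_pretrained("/home/model")

# Either enable sampling so temperature/top_p actually take effect...
cfg.do_sample = True
# ...or reset the sampling knobs to their defaults instead:
# cfg.temperature = 1.0
# cfg.top_p = 1.0

cfg.save_pretrained("/home/model")  # rewrites generation_config.json
```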

But the following command failed:

root@iZwz9cc5vwcplget5j9bzdZ:/home/alpaca-lora# BASE_MODEL='/home/model' python3 export_state_dict_checkpoint.py
Loading checkpoint shards: 100%|██████████████████| 3/3 [00:17<00:00, 5.95s/it]
(the same do_sample/temperature and do_sample/top_p warnings as above, then:)
Traceback (most recent call last):
  File "/home/alpaca-lora/export_state_dict_checkpoint.py", line 116, in <module>
    new_state_dict[new_k] = unpermute(v)
  File "/home/alpaca-lora/export_state_dict_checkpoint.py", line 67, in unpermute
    w.view(n_heads, 2, dim // n_heads // 2, dim)
RuntimeError: shape '[32, 2, 64, 4096]' is invalid for input of size 26214400
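The arithmetic in the error points at a 7B/13B geometry mismatch: with the script's hardcoded 7B params (n_heads=32, dim=4096), view() expects 32 × 2 × 64 × 4096 = 16,777,216 elements, while a 13B attention projection (dim=5120) holds 5120 × 5120 = 26,214,400. A minimal sketch that reproduces just the size mismatch, no checkpoint required:

```python
import torch

# Geometry the export script hardcodes for 7B models:
n_heads, dim = 32, 4096

# Shape of one attention projection in a Llama 13B checkpoint
# (13B uses dim=5120, n_heads=40):
w = torch.empty(5120, 5120)

print(n_heads * 2 * (dim // n_heads // 2) * dim)  # 16777216: what view() expects
print(w.numel())                                  # 26214400: what the tensor holds

# The exact call that fails in unpermute() at line 67 of the traceback:
try:
    w.view(n_heads, 2, dim // n_heads // 2, dim)
except RuntimeError as e:
    print(e)  # shape '[32, 2, 64, 4096]' is invalid for input of size 26214400
```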

Please help me, thanks.

yourtiger commented 1 year ago

I found the reason: I am using Llama 2's 13B model, so I need to replace the hardcoded params in export_state_dict_checkpoint.py with the values from the 13B model's params.json.
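For anyone hitting the same error, a minimal sketch of that edit, assuming the script still carries the stock 7B dictionary; the values below are the published 13B geometry, so verify them against the params.json bundled with your checkpoint (norm_eps in particular differs between LLaMA 1 and Llama 2):

```python
# In export_state_dict_checkpoint.py, replace the hardcoded 7B params
# with the values from the 13B model's own params.json:
params = {
    "dim": 5120,        # 7B hardcodes 4096
    "multiple_of": 256,
    "n_heads": 40,      # 7B hardcodes 32
    "n_layers": 40,     # 7B hardcodes 32
    "norm_eps": 1e-05,  # Llama 2 13B; original LLaMA 13B ships 1e-06
    "vocab_size": -1,
}
```

With dim=5120 and n_heads=40, the view in unpermute becomes [40, 2, 64, 5120], which is exactly the 26,214,400 elements each 13B projection holds, so the reshape succeeds.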