Open tuanhe opened 6 months ago
I encountered a similar issue while using transformers version 4.38.2. I resolved the problem by downgrading to version 4.32.0.
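A small sketch of a pre-flight check based on the versions reported in this thread (4.32.0 known-good, 4.38.2 broken); the helper names are illustrative, not part of AWQ or transformers:

```python
# Guard against the transformers versions reported to break AWQ in this thread.
# The cutoff (4.32.0 known-good) comes from the reply above; newer releases
# such as 4.38.2 were reported to fail with a tensor-size mismatch.
def parse_version(v: str) -> tuple:
    """Turn '4.38.2' into (4, 38, 2) for comparison."""
    return tuple(int(part) for part in v.split(".")[:3])

KNOWN_GOOD = parse_version("4.32.0")

def needs_downgrade(installed: str) -> bool:
    """True if the installed version is newer than the known-good one."""
    return parse_version(installed) > KNOWN_GOOD

print(needs_downgrade("4.38.2"))  # True  -> pip install transformers==4.32.0
print(needs_downgrade("4.32.0"))  # False -> reported to work
```

You could run this before `awq.entry` and downgrade with `pip install transformers==4.32.0` when it returns True.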
It works, thanks very much
I have run into the same problem in other repositories that use Llama 2. Thank you for your answer.
I want to reproduce the Llama 2 steps from scripts/llama2_example.sh on an RTX 4090. I ran the command:
python -m awq.entry --model_path /data/models/Llama-2-7b-chat-hf --w_bit 4 --q_group_size 128 --run_awq --dump_awq awq_cache/Llama-2-7b-chat-hf-w4-g128.pt
and it reports this error: RuntimeError: The expanded size of the tensor (4608) must match the existing size (4096) at non-singleton dimension 3. Target sizes: [65, 32, 512, 4608]. Tensor sizes: [65, 1, 512, 4096]
Here is the whole log info. Did I miss some steps, or how can I fix it?