Closed: galenyu closed this issue 2 months ago
Hi @galenyu,
Thanks for pointing out this issue and for the info you provided! The problem was introduced by upgrading the Transformers version in a previous commit (11bae09). We have fixed the error in 7e3618b. Please let me know if the issue still exists.
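Since the regression came from the Transformers upgrade, one quick sanity check after pulling the fix is to print the version that is actually installed in the environment. This is only a minimal sketch; the exact version to expect depends on the project's requirements.txt at the fixed commit.

# Illustrative check: confirm which Transformers version is installed,
# since the error was caused by a version change.
import transformers

print(transformers.__version__)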
The Transformers version was indeed causing this issue, and it works now. Thank you!
Hi there, I followed all the steps of this project until I encountered an issue while running the following command.
python model/main.py decapoda-research-llama-7b-hf wikitext2 \
    --wbits 4 --abits 4 --a_sym --w_sym \
    --act_group_size 128 --weight_group_size 128 --weight_channel_group 2 \
    --reorder --act_sort_metric hessian \
    --a_clip_ratio 0.9 --w_clip_ratio 0.85 \
    --keeper 128 --keeper_precision 3 --kv_cache --use_gptq \
    --eval_ppl --eval_common_sense
Env
nvidia/cuda:11.3.1-cudnn8-devel-ubuntu20.04
requirements.txt
Describe the issue
When running loglikelihood requests, the TypeError shown in the screenshot occurred:
I tried making the change suggested in this issue, setting cache_position=None in transformers/models/llama/modeling_llama.py, but it doesn't work either. Any suggestions would be greatly appreciated!
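For context, here is an illustration (not taken from the screenshot, and using hypothetical names): a TypeError of this kind typically shows up when a caller passes a keyword argument such as cache_position to a forward() whose signature does not declare it, which can happen when the installed Transformers version and the patched modeling code disagree.

# Toy illustration with hypothetical names: calling a function with a keyword
# argument it does not declare raises the same kind of TypeError.
def forward(input_ids, attention_mask=None):
    return input_ids

# Passing an unexpected kwarg fails at call time:
forward(input_ids=[1, 2, 3], cache_position=None)
# TypeError: forward() got an unexpected keyword argument 'cache_position'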