Same issue. Any progress here?
Thanks for reporting. Yes, this is a known issue: it was introduced when kv-cache support was added to some model architectures in recent transformers versions, and it affects prefix tuning. We have a long discussion in #869 which also mentions some workarounds.
If this is an option for you, you could also try older transformers versions (e.g. 4.36.0 or older should work).
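As a quick sanity check, something like the sketch below can flag a potentially affected install; the 4.36.0 cutoff is just the version mentioned above as known-good, not a hard rule.

```python
# Sketch: warn if the installed transformers version is newer than the last version
# mentioned in this thread as working with prefix tuning (4.36.0 is an assumption, not a hard cutoff).
import transformers
from packaging import version

if version.parse(transformers.__version__) > version.parse("4.36.0"):
    print(
        f"transformers {transformers.__version__} is installed; prefix tuning may hit the "
        "kv-cache issue. Consider downgrading, e.g. pip install transformers==4.36.0."
    )
```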
At the moment, I'm still figuring out how we can best make these recent transformers changes compatible with prefix-tuning, but unfortunately it's not an easy thing to fix.
Thanks for your quick reply, @BenjaminBossan. The workaround indeed works in my case. However, I found that the loss for prefix-tuning and p-tuning varies a lot on the same model and dataset.
For example, on Qwen2-1.5B and alpaca-cleaned, prefix-tuning yields a loss of ~10, while p-tuning yields ~1. Do you have any ideas about this phenomenon?
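For context, here is a minimal sketch of the two configurations being compared; the hyperparameter values are placeholders, not the exact settings used in this thread.

```python
# Minimal sketch comparing prefix tuning and p-tuning on the same base model.
# num_virtual_tokens and other settings are illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import PrefixTuningConfig, PromptEncoderConfig, TaskType, get_peft_model

model_id = "Qwen/Qwen2-1.5B"
base = AutoModelForCausalLM.from_pretrained(model_id)

# Prefix tuning: trainable key/value vectors are prepended to every attention layer.
prefix_config = PrefixTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=20)

# P-tuning: a small prompt encoder produces virtual token embeddings that are
# prepended to the input embeddings.
ptuning_config = PromptEncoderConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=20)

model = get_peft_model(base, prefix_config)  # swap in ptuning_config to compare
model.print_trainable_parameters()
```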
Sorry, I don't have a lot of practical experience with these prompt tuning methods; maybe others can give some advice. Since the difference is so large, I would not exclude the possibility that there is a bug. Does the training loss decrease? Did you try varying the hyperparameters?
It could also be worth skipping the workaround and instead checking out an older transformers version. If you see much better scores there, it is very likely that there is a bug in the workaround.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
System Info
I'm trying to use PEFT with quantized LLMs. Prompt tuning, LoRA, and IA3 all work. However, when I use prefix tuning on 8-bit codellama-7b-hf, it reports the following error:
Who can help?
@BenjaminBossan @sayakpaul @tmm1
Information
Tasks
An officially supported task in the examples folder
Reproduction
Expected behavior
I want to fine-tune 8-bit codellama-7b with prefix tuning.
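A minimal sketch of this kind of setup is shown below; the model ID and hyperparameter values are assumptions, not the exact script that triggers the error.

```python
# Sketch: load codellama-7b in 8-bit and wrap it with prefix tuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PrefixTuningConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

model_id = "codellama/CodeLlama-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
)
model = prepare_model_for_kbit_training(model)

peft_config = PrefixTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=30)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
```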