Closed: yeontaek closed this issue 11 months ago
Can you share the exact error you get? I didn't test with 70B, could be due to the GQA.
I've experimented with two models, the NousResearch one and Meta's Llama 2 70B. I hit the same error with both, as detailed below. Interestingly, I didn't have any problems with the 7B model. I'm also considering testing other model sizes to see whether the issue persists.
File "/data0/yeontaek-oh/task/qlora/utils/llama_patch.py", line 47, in forward
key_states = self.k_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
RuntimeError: shape '[6, 796, 64, 128]' is invalid for input of size 4890624
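The numbers in the traceback already point at grouped-query attention (GQA): Llama-2-70B has 64 query heads but only 8 key/value heads, so the `k_proj` output is 8x smaller than a `.view(bsz, q_len, num_heads, head_dim)` expects. A quick arithmetic sketch using the values from the error (head counts are the published 70B config; everything else is taken from the traceback):

```python
# Values from the RuntimeError: shape '[6, 796, 64, 128]' vs input of size 4890624
bsz, q_len, head_dim = 6, 796, 128
num_heads = 64            # query heads in Llama-2-70B
num_key_value_heads = 8   # KV heads in Llama-2-70B (GQA)

kv_numel = 4890624        # number of elements coming out of k_proj

# Viewing with num_heads fails: 6 * 796 * 64 * 128 = 39124992 != 4890624
assert bsz * q_len * num_heads * head_dim != kv_numel

# Viewing with num_key_value_heads matches exactly: 6 * 796 * 8 * 128 = 4890624
assert bsz * q_len * num_key_value_heads * head_dim == kv_numel
print("k_proj output matches num_key_value_heads, not num_heads")
```

This is why the 7B and 13B models work: there `num_key_value_heads == num_heads`, so both views are equivalent.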
Does it work without flash attention?
It must be due to the GQA in the bigger models. We would need to make changes to llama_patch.
I confirmed that it works without any problems in the qlora environment without flash attention. I also verified that there are no issues up to 13B.
Yeah, it's due to GQA. LAION already fixed that in this PR: https://github.com/LAION-AI/Open-Assistant/pull/3595/files It would be nice if you could take a look at what they did and open a PR.
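The GQA-aware fix is, roughly, to reshape the key/value projections with `num_key_value_heads` and then repeat each KV head to match the query heads before attention. A minimal sketch of that logic, shown with NumPy for brevity (the actual patch would operate on torch tensors; the dimensions and the `repeat_kv` helper name are illustrative):

```python
import numpy as np

def repeat_kv(x, n_rep):
    """Repeat KV heads along the head axis:
    (bsz, num_kv_heads, q_len, head_dim) -> (bsz, num_kv_heads * n_rep, q_len, head_dim)."""
    if n_rep == 1:
        return x
    return np.repeat(x, n_rep, axis=1)

# Toy dimensions; head counts match the Llama-2-70B config.
bsz, q_len, head_dim = 2, 5, 128
num_heads, num_kv_heads = 64, 8

# Stand-in for the k_proj output: its last dim is num_kv_heads * head_dim, NOT
# num_heads * head_dim, which is exactly why the original .view() call failed.
k_out = np.random.randn(bsz, q_len, num_kv_heads * head_dim)

# Reshape with num_kv_heads, then expand to one KV head per query head.
key_states = k_out.reshape(bsz, q_len, num_kv_heads, head_dim).transpose(0, 2, 1, 3)
key_states = repeat_kv(key_states, num_heads // num_kv_heads)
assert key_states.shape == (bsz, num_heads, q_len, head_dim)
```

The same treatment applies to `v_proj`; `q_proj` keeps using `num_heads` as before.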
Thank you. I'm currently testing the 13b model, and once that's done, I'll take a look and make adjustments.
Can you run this on a MacBook Pro M1 with 16 GB of RAM? I plan to use llama-2-7b-chat-hf. This is my last resort, as nothing else seems to work on my machine because of M1 compatibility issues.
Not sure about M1, but I found this guide: https://gist.github.com/cedrickchee/e8d4cb0c4b1df6cc47ce8b18457ebde0
Hello. The qlora and flash attention code has been very helpful.
I noticed that the code works fine with Llama 2 7B, but there seems to be a shape issue with 70B. Is that correct?