Closed: rohhro closed this issue 1 week ago
You'll have to install flash-attn in both environments - also apologies on the delay.
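A quick way to confirm flash-attn is actually importable in each environment (a minimal sketch; run it once per conda env):

```python
# Minimal sketch: confirm flash-attn is importable in the active conda env.
# The __version__ attribute assumes a recent flash-attn release.
try:
    import flash_attn
    print("flash-attn version:", flash_attn.__version__)
except ImportError:
    print("flash-attn is NOT installed in this environment")
```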
Thanks! No worries!
I have 2 separate conda environments; one has FA2 installed and the other doesn't. This is intentional, because I want to compare VRAM usage with and without FA2.
I have tested both environments with the same Gemma 2 2B training script, and the VRAM usage is the same.
That's why I filed this issue saying FA2 is not working when finetuning Gemma 2.
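A rough sketch of how peak VRAM can be compared between the two environments (hypothetical helper, assuming PyTorch's CUDA memory APIs; the actual training script is not shown here):

```python
# Rough sketch: compare peak VRAM between the two envs by wrapping the
# training/forward call. report_peak_vram is a hypothetical helper.
import torch

def report_peak_vram(fn, *args, **kwargs):
    torch.cuda.reset_peak_memory_stats()           # clear previous peak stats
    result = fn(*args, **kwargs)                   # e.g. trainer.train()
    peak_gib = torch.cuda.max_memory_allocated() / 1024**3
    print(f"peak VRAM: {peak_gib:.2f} GiB")
    return result
```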
@rohhro we did a fix a few months ago. Let us know if you're still encountering the issue!
I have a conda env where "FA2 = True" and another where "FA2 = False" (as displayed in the terminal when running the finetuning script), yet the VRAM usage when tuning the same Gemma 2 model (2B or 9B) is the same in both, even though attn_implementation = "flash_attention_2" is present in the script.
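One way to double-check which attention backend was actually picked up (a minimal sketch assuming the Hugging Face transformers API; the private `_attn_implementation` attribute may differ across transformers versions):

```python
# Minimal sketch (assuming the Hugging Face transformers API): check which
# attention backend the model actually loaded with.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b",                      # example model id
    torch_dtype=torch.bfloat16,               # FA2 needs fp16/bf16 weights
    attn_implementation="flash_attention_2",
)
# Expected value here is "flash_attention_2"; if it falls back to "eager"
# or "sdpa", FA2 is not actually being used for this model.
print(model.config._attn_implementation)
```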