Open HarryK4673 opened 1 month ago
I had this error recently, and what worked for me was uninstalling flash_attn and reinstalling it without using the cached wheel. I also verified the transformers version the Llama model was built against (4.45.0 for the new Llama models), but flash-attention and a mismatch between CUDA and the installed PyTorch seem to be the issue here.
pip uninstall flash-attn
pip install flash-attn --no-cache-dir
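If it helps, here is a minimal sketch I use to sanity-check that the versions line up (it assumes torch, transformers, and flash-attn are already installed in the environment):

```python
# Minimal sanity check: print the versions that have to agree with each other.
import importlib.metadata

import torch
import transformers

print("PyTorch:", torch.__version__)              # e.g. 2.1.2+cu121
print("CUDA PyTorch was built with:", torch.version.cuda)
print("Transformers:", transformers.__version__)  # new Llama models expect 4.45.0+
print("flash-attn:", importlib.metadata.version("flash-attn"))
print("CUDA available:", torch.cuda.is_available())
```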
Thanks. It works now.
Hello everyone!
I'm currently working on a university assignment and need to fine-tune a model. However, the model uses this library to do that. When I run the fine-tuning script, I get an error like this:
I'm using a server with an A10G GPU. The driver version is 535.183.01, the CUDA version is 12.2, and PyTorch is 2.1.2 with cu121. It seems I cannot install CUDA 12.1 with this driver (it needs driver 530, but I cannot install that on the server). Could anyone help me with this? Thanks a lot!
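In case it matters: as far as I understand, the cu121 PyTorch wheels bundle their own CUDA runtime, so they should only need a driver new enough for CUDA 12.1. Here is a quick sketch I can run to check whether this build actually sees the A10G (the expected values in the comments are just what I assume for my setup):

```python
# Quick check that the installed PyTorch build can use the GPU on this driver.
import torch

print(torch.__version__)            # expecting 2.1.2+cu121
print(torch.version.cuda)           # CUDA version this PyTorch build targets (12.1)
print(torch.cuda.is_available())    # True means the driver/runtime combination works
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # should report the A10G
```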