Closed -- llimllib closed this issue 1 year ago
From the error message, it looks like the CUDA drivers are not installed. Can you check whether the commands nvidia-smi and nvcc --version run successfully?
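If those commands aren't available, a quick check from inside Python (assuming PyTorch is already installed) also tells you whether PyTorch can see a CUDA device:

import torch

# prints True only if a CUDA-capable GPU and working drivers are visible
print(torch.cuda.is_available())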
Also, we already list flash_attn among the suggested dependencies -- check the README!
We haven't tested the model on M1/M2 Macs yet, so if you hit further blockers, I recommend running with the default attention implementation in PyTorch.
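For reference, a minimal sketch of pinning the attention implementation explicitly -- this assumes the model exposes an MPT-style attn_config dict (an assumption on my part; the plain load shown later in this thread works too):

from transformers import AutoConfig, AutoModelForCausalLM

# Assumption: the model's custom config exposes an MPT-style attn_config;
# 'torch' selects plain PyTorch attention, which needs neither CUDA nor flash_attn
config = AutoConfig.from_pretrained('replit/replit-code-v1-3b', trust_remote_code=True)
config.attn_config['attn_impl'] = 'torch'
model = AutoModelForCausalLM.from_pretrained('replit/replit-code-v1-3b', config=config, trust_remote_code=True)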
The section I read is:

First of all, you need to install the latest versions of the following dependencies:
einops
sentencepiece
torch
transformers

If flash_attn is listed there, I don't see it.
(An M1 Mac has no NVIDIA card, so I don't think I can install nvcc? Too bad, but I get that some stuff can't run without an NVIDIA card.)
Now I see that you listed it in the model description, but it appears to be necessary for inference as well, so what I mean is that it should be included in that list of required Python packages.
You don't need flash attention for inference -- it's a "nice to have" that makes inference faster, but to my knowledge it works only on NVIDIA GPUs (as you need CUDA). In your case, you should load the model as indicated in the first half of that section:
from transformers import AutoModelForCausalLM
# load model
model = AutoModelForCausalLM.from_pretrained('replit/replit-code-v1-3b', trust_remote_code=True)
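For completeness, a minimal generation sketch (the prompt and sampling parameters here are illustrative, not prescriptive):

from transformers import AutoTokenizer

# load the matching tokenizer and sample a short completion on CPU
tokenizer = AutoTokenizer.from_pretrained('replit/replit-code-v1-3b', trust_remote_code=True)
x = tokenizer.encode('def fibonacci(n): ', return_tensors='pt')
y = model.generate(x, max_length=100, do_sample=True, top_p=0.95, temperature=0.2)
print(tokenizer.decode(y[0], skip_special_tokens=True))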
Hope this helps. Also, make sure to run on the latest version of the Transformers library!
That's exactly what I did -- it's what caused the error in the first place!
Can you run pip install --upgrade transformers and try again?
I will do so tomorrow (I have to re-download the model now), but I was working in a clean virtualenv, which I assume means pip will download the newest version of a library? (Maybe that assumption is false if there's a previously cached version.)
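For what it's worth, one way to confirm which version actually ended up in the environment, independent of pip's cache:

import transformers

# print the version importable in the current environment
print(transformers.__version__)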
I'm unable to reproduce the error now. Sincere apologies for the noise and for wasting your time, and thanks for the model!
No problem! Glad it worked in the end :)
I have a Mac M2 Max with 32 GB. pip install --upgrade transformers has worked perfectly for me, thanks @pirroh
For me it doesn't work, even with pip install --upgrade transformers.
Trying to follow the instructions on an M1 Mac, I get the above error.
Unfortunately, attempting to install flash_attn does not succeed, due to:

RuntimeError: flash_attn was requested, but nvcc was not found.

which may just be an unfortunate aspect of not having an NVIDIA card. Anyway, the point is that you should probably add flash_attn to your list of required modules?