Open jmtayamada opened 1 year ago
I did this on an M1, but I didn't use hf=True. Did you run any test, and did you install PyTorch using the Metal instructions? In my testing, installing ctransformers with MPS support does not install PyTorch by default.
To use the tokenizer, hf has to equal True. Also, I've installed PyTorch with MPS support and have verified it with `print(torch.backends.mps.is_available())`.
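For reference, a minimal availability check like the one mentioned above (plain PyTorch, no ctransformers involved) might look like this; on a non-Apple machine it simply falls back to CPU:

```python
import torch

# On Apple Silicon with a Metal-enabled PyTorch build this selects "mps";
# elsewhere (or without MPS support) it falls back to "cpu".
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print(device.type)
```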
I might have made a mistake, you are right.
Anyway, I checked the loading code for the Hugging Face models, and it doesn't seem to move anything to a device. Perhaps we can wait for an answer from one of the developers.
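For contrast, a plain torch module does move when you call `.to(...)`. A minimal sketch (standalone PyTorch, not the ctransformers hf wrapper, which may behave differently since inference runs in its C backend):

```python
import torch
import torch.nn as nn

# A regular torch module: .to() relocates its parameters to the target device.
model = nn.Linear(4, 2)
target = "mps" if torch.backends.mps.is_available() else "cpu"
model = model.to(target)
print(next(model.parameters()).device.type)
```

If the hf wrapper never performs an equivalent move internally, `model.device` staying at `cpu` would be consistent with what's reported in this thread.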
Update on this?
Same issue. Any Updates?
Hi, I have the same issue. I can build the pipeline, but I can't load the model on 'mps'; it only runs on CPU. I followed these steps:

```python
from ctransformers import AutoModelForCausalLM

MODEL_NAME = "TheBloke/Llama-2-7B-32K-Instruct-GGUF"
llama_model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    model_file="llama-2-7b-32k-instruct.Q5_K_S.gguf",
    model_type="llama",
    hf=True,
    gpu_layers=50,
)
```

But if I check `llama_model.device`, I get `device(type='cpu')`. Even after `llama_model = llama_model.to('mps')`, checking again still gives `device(type='cpu')`.

Any suggestion on how to fix this issue, please? Thank you.
I'm working on a Mac with an M2 chip. I've installed ctransformers with Metal support and am setting up the model like below. However, when I check which device the model is using, it reports cpu.
Am I not setting up the model to use mps properly?