zhuchuji opened 2 months ago
To use MPS (Metal Performance Shaders) on my Mac, I modified

```python
AutoPeftModelForCausalLM.from_pretrained(
    checkpoint_path, device_map="cuda", trust_remote_code=True, fp16=True
).eval()
```

to

```python
AutoPeftModelForCausalLM.from_pretrained(
    checkpoint_path, device_map=torch.device('mps'), trust_remote_code=True, fp16=True
).eval()
```

and the error becomes the following:
```
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████| 10/10 [01:32<00:00, 9.23s/it]
Traceback (most recent call last):
  File "/Users/chockiezhu/practice/CLoT/inference.py", line 9, in <module>
```
Our testing was conducted on Linux; due to a lack of Apple Mac devices, we're unable to address your issue directly. Here is a suggestion: "out of memory" errors usually mean there is not enough GPU memory to hold the model. You can try the big-model loading techniques for transformers
(like those outlined in https://huggingface.co/docs/accelerate/en/usage_guides/big_modeling) to resolve the issue.
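For concreteness, here is a minimal sketch of that approach applied to the call from this thread. `device_map="auto"` and `offload_folder` are standard transformers loading options that accelerate uses to shard and offload the model; the checkpoint path and offload folder names below are placeholders, not values from CLoT.

```python
from peft import AutoPeftModelForCausalLM

checkpoint_path = "path/to/your/checkpoint"  # placeholder for the checkpoint used in this thread

# Sketch only: device_map="auto" lets accelerate place layers across the
# available device(s) and CPU RAM, and offload_folder spills whatever still
# does not fit onto disk, which is the usual fix for out-of-memory at load time.
model = AutoPeftModelForCausalLM.from_pretrained(
    checkpoint_path,
    device_map="auto",
    offload_folder="offload",  # hypothetical folder for disk offload
    trust_remote_code=True,
    fp16=True,                 # kept from the original call in this thread
).eval()
```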
I tried to run the program on my MacBook Pro with an M2 Max chip, and it throws `AssertionError: Torch not compiled with CUDA enabled`.
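That assertion is raised whenever a CUDA device is requested on a PyTorch build without CUDA support, which suggests some part of the script still targets `"cuda"`. Below is a minimal sketch of a defensive device pick; it is illustrative only and not taken from CLoT's actual `inference.py`.

```python
import torch

# Prefer MPS on Apple silicon, then CUDA, then CPU. Illustrative guard;
# the real script may hard-code a CUDA device somewhere else.
if torch.backends.mps.is_available():
    device = torch.device("mps")
elif torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

print(torch.ones(2, 2, device=device).device)  # sanity-check the chosen backend
```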
The detailed log is shown below: