Open wacdev opened 1 year ago
Changed
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
to
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().to("mps")
Traceback (most recent call last):
File "/Users/longkeyy/PycharmProjects/hf_demo/llm.py", line 4, in
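It looks like you are using quantization; at the moment quantization is only supported on CUDA.
Could the official code be adjusted, following stable-diffusion-webui, so that it can run on MPS? I don't know much about machine learning and am not sure how to change it.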
https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/modules/devices.py
import sys

import torch

if sys.platform == "darwin":
    from modules import mac_specific


def has_mps() -> bool:
    if sys.platform != "darwin":
        return False
    else:
        return mac_specific.has_mps


def extract_device_id(args, name):
    for x in range(len(args)):
        if name in args[x]:
            return args[x + 1]

    return None


def get_cuda_device_string():
    from modules import shared

    if shared.cmd_opts.device_id is not None:
        return f"cuda:{shared.cmd_opts.device_id}"

    return "cuda"


def get_optimal_device_name():
    if torch.cuda.is_available():
        return get_cuda_device_string()

    if has_mps():
        return "mps"

    return "cpu"


def get_optimal_device():
    return torch.device(get_optimal_device_name())
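Outside the webui codebase, the same fallback order (CUDA, then MPS, then CPU) can be sketched without the `modules` package. This is only a minimal sketch, not the webui's implementation: `pick_device_name` and `detect_device_name` are hypothetical names, and it assumes a PyTorch build that exposes `torch.backends.mps.is_available()`.

```python
import sys


def pick_device_name(cuda_available: bool, mps_available: bool,
                     platform: str = sys.platform) -> str:
    """Return "cuda", "mps" or "cpu", preferring CUDA, then MPS on macOS."""
    if cuda_available:
        return "cuda"
    # MPS only exists on macOS, so ignore the flag on any other platform.
    if platform == "darwin" and mps_available:
        return "mps"
    return "cpu"


def detect_device_name() -> str:
    """Query the installed PyTorch build (imported lazily) and pick a device."""
    import torch

    # Older builds have no torch.backends.mps attribute at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device_name(torch.cuda.is_available(), mps_ok)
```

Keeping the decision in a pure function makes the fallback order easy to check on a machine that has neither CUDA nor MPS; `detect_device_name()` is the only part that touches PyTorch.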
https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/modules/mac_specific.py
Running python web_demo.py on the CPU fails with: "slow_conv2d_cpu" not implemented for 'Half'
Running on MPS gives:
loc("varianceEps"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/97f6331a-ba75-11ed-a4bc-863efbbaf80d/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)):
error: input types 'tensor<1x257x1xf16>' and 'tensor<1xf32>' are not broadcast compatible
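Both errors above involve float16 tensors, so one possible workaround is to choose the dtype from the device instead of always calling `.half()`. The following is a rough sketch under assumptions, not a confirmed fix: `pick_dtype_name` and `load_visualglm` are hypothetical helpers, float32 on MPS was not tested in this thread and roughly doubles the weight memory, and the checkpoint's own remote code may still reject non-CUDA devices.

```python
def pick_dtype_name(device_name: str, mps_supports_fp16: bool = True) -> str:
    """Return the name of a torch dtype that is plausible for the device."""
    kind = device_name.split(":")[0]  # "cuda:0" -> "cuda"
    if kind == "cpu":
        # The CPU run failed because a Half convolution is not implemented.
        return "float32"
    if kind == "mps" and not mps_supports_fp16:
        # Assumption: fall back to float32 on PyTorch builds where fp16 on MPS fails.
        return "float32"
    return "float16"


def load_visualglm(device_name: str, mps_supports_fp16: bool = True):
    """Load the checkpoint and cast it for the device (imports are lazy)."""
    import torch
    from transformers import AutoModel

    model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
    dtype = getattr(torch, pick_dtype_name(device_name, mps_supports_fp16))
    return model.to(dtype).to(device_name)
```

For example, `load_visualglm("cpu")` would cast the weights to float32 before running, which avoids the Half convolution path at the cost of more memory.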
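After updating torch to 2.1 it does run with fp16 on MPS, but there appears to be a memory leak: after asking a single question, memory usage grows from 18 GB to 28 GB, and once swap kicks in the machine can no longer cope.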
pip list|grep torch
torch 2.1.0.dev20230606
torchaudio 2.1.0.dev20230606
torchvision 0.16.0.dev20230606
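To narrow down whether the growth comes from tensors held on the MPS device, the allocation can be sampled before and after one question. This is a small sketch under assumptions: `measure_growth` and `mps_allocated_bytes` are hypothetical helpers, and it relies on `torch.mps.current_allocated_memory()`, which is available in recent PyTorch releases with MPS support.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def measure_growth(step: Callable[[], T],
                   read_bytes: Callable[[], int]) -> Tuple[T, int]:
    """Run `step` once and return its result plus the change in bytes."""
    before = read_bytes()
    result = step()
    after = read_bytes()
    return result, after - before


def mps_allocated_bytes() -> int:
    """Bytes currently occupied by tensors on the MPS device (lazy import)."""
    import torch

    return torch.mps.current_allocated_memory()
```

Wrapping one chat call in a zero-argument callable and passing it with `mps_allocated_bytes` shows how many bytes remain allocated after the answer. Calling `torch.mps.empty_cache()` between questions releases cached blocks; if the number still climbs afterwards, some tensors are probably still referenced (for example through the chat history), which would point away from the allocator cache.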