Closed: darebfh closed this issue 8 months ago
Hi there,
you could try using `python generate/base.py --prompt "Hello, my name is" --precision 16-true` (I think the default is `--precision bf16-true`, which is maybe why you are getting this error). Let me know if this works; we should probably update the code or documentation accordingly then.
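To make the precision choice concrete, here is a minimal sketch of how one might pick a flag per backend. The helper `pick_precision` is hypothetical (lit-gpt's actual flag handling lives in its CLI); the device checks are real PyTorch APIs.

```python
import torch

def pick_precision() -> str:
    """Hypothetical helper: choose a --precision value based on hardware.

    bf16 is only safe where bfloat16 is natively supported; MPS is not,
    which is what triggers the error discussed in this issue.
    """
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
        return "bf16-true"  # recent NVIDIA GPUs support bfloat16
    if torch.backends.mps.is_available():
        return "16-true"    # MPS: fall back to float16
    return "32-true"        # plain CPU: keep full precision

print(pick_precision())
```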
Oops, my bad 😊 I'll shortly update the code that defines the default precision.
Hey, thanks for the quick reply!
Unfortunately, there's more stuff not implemented (yet?) on MPS.
Upon running `python generate/base.py --prompt "Hello, my name is" --precision 16-true` I get the following error:
NotImplementedError: The operator 'aten::index_copy.out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
@darebfh Can you try rewriting https://github.com/Lightning-AI/lit-gpt/blob/main/lit_gpt/model.py#L237-L238 to call `index_copy(` instead (no trailing underscore)?
If this doesn't work, you can also try setting `PYTORCH_ENABLE_MPS_FALLBACK=1`, as the error message suggests.
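To illustrate the suggested change: the linked lines update a cache tensor in place. The snippet below is a stand-in for that code, not the actual lit-gpt implementation; it only shows the mechanical difference between the in-place `index_copy_` and the out-of-place `index_copy`, which returns a new tensor instead of mutating its receiver.

```python
import torch

# Illustrative stand-in for a kv-cache-style update (not lit-gpt's code).
cache = torch.zeros(8, 4)
pos = torch.tensor([2, 3])   # positions to overwrite
new = torch.ones(2, 4)       # rows to write at those positions

# In-place variant (the op reported as unimplemented on MPS):
# cache.index_copy_(0, pos, new)

# Out-of-place variant suggested above; rebind the result:
cache = cache.index_copy(0, pos, new)
print(cache[2].tolist())  # → [1.0, 1.0, 1.0, 1.0]
```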
I tried on my mps device with a Radeon GPU, and `index_copy` didn't work:
import torch

device = "mps"
x = torch.zeros(5, 3, device=device)
t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.float, device=device)
index = torch.tensor([0, 4, 2], device=device)
x.index_copy(0, index, t)
>> The operator 'aten::index_copy.out' is not currently implemented for the MPS device. ...
Surprisingly, `PYTORCH_ENABLE_MPS_FALLBACK=1` also didn't work. Curious whether it will work on Apple Silicon.
Oh, that's unfortunate. Does `x[index] = t` work?
Yep, it works. Previously there was an issue with index copying on mps devices, but not anymore: https://github.com/pytorch/pytorch/issues/101936
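For the record, the two forms are interchangeable here; a quick CPU check (my own sketch, not from the thread) confirms that indexed assignment places rows exactly like `index_copy` does:

```python
import torch

x1 = torch.zeros(5, 3)
x2 = torch.zeros(5, 3)
t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.float)
index = torch.tensor([0, 4, 2])

x1 = x1.index_copy(0, index, t)  # out-of-place index_copy
x2[index] = t                    # plain indexed assignment

# Both write row i of t into x[index[i]].
assert torch.equal(x1, x2)
```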
> Surprisingly, `PYTORCH_ENABLE_MPS_FALLBACK=1` also didn't work. Curious whether it will work on Apple Silicon.
Yes, it works, at 4.4 tokens/sec.
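One detail worth noting about the fallback (my own sketch, not from the thread): the environment variable is read when torch initializes, so it must be set before `import torch` — either in the shell when launching the script, or at the very top of the entry point:

```python
import os

# Must be set before torch is imported, or the flag is read too late.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch  # noqa: E402  (deliberately imported after setting the env var)

print(os.environ["PYTORCH_ENABLE_MPS_FALLBACK"])  # → 1
```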
Since the issue with automatically applying bf16 precision on MPS devices is fixed in one of the latest commits, I think we can close the issue. The `PYTORCH_ENABLE_MPS_FALLBACK=1` env variable is more of a fix for ops that are not yet supported, so it's a temporary measure. I hesitate to even add info about it to the README, since:
I agree. Thanks Andrei!
I followed the set-up guide to run inference with stablelm-base-alpha-3b.
Running
works like a charm.
However, upon trying to run the model with `python generate/base.py --prompt "Hello, my name is"` I get `TypeError: BFloat16 is not supported on MPS`.
Obviously, I am working on an Apple M1 Max. Going through the tutorials, I did not find any additional requirements for running lit-gpt on Apple Silicon.