CPU could be supported through whisper.cpp/llama.cpp but we are not working on that right now. MPS should work with minimal tweaks (there may be some hardcoded “cuda” settings).
Nice, thanks. Do you know how much work it would take to get WhisperSpeech working with whisper.cpp?
Adding my vote for MPS support: I'd love to use this on Macs and iOS devices.
Not sure if you can run Python on iOS w/o iSH
@fakerybakery You can try and report back how difficult it is :)
I don't have this on my roadmap right now (I am mostly focused on improving quality and language coverage), but if someone needs this, a consulting contract is a very effective way to make sure it happens.
It would be great if someone added MPS support. I can't run this on a Mac, and Macs are quite often used with LLMs now.
CPU could be supported through whisper.cpp/llama.cpp but we are not working on that right now. MPS should work with minimal tweaks (there may be some hardcoded “cuda” settings).
I might take this one on...but first please see my recent issue about pull requests and whether you're open to source code modifications without me using a Jupyter Notebook...unless someone wants to show me how.
Basically, I'd be considering tackling:
1) Ensuring AMD GPU acceleration on Linux via ROCm (unfortunately, PyTorch doesn't support AMD GPUs on Windows). This should involve minimal changes since ROCm reuses the "cuda" device within the PyTorch framework, so it would mostly be a matter of double-checking the code for minor changes.
2) Ensuring MPS support, which, again, involves minor changes (adding "mps" as a viable device within PyTorch).
3) Likely adding source-code-wide changes to use "cuda", "mps", or "cpu" as the default compute device depending on a user's system.
Just left a response on https://github.com/collabora/WhisperSpeech/issues/73 would be great to have MPS support.
@BBC-Esq We are using nbdev. It allows you to edit either the notebooks or the .py files and later synchronize the changes.
I am on holiday next week, but afterwards I am happy to either help you set up nbdev or, if you make a PR, I can merge your changes back into the notebooks.
Modifying WhisperSpeech to run on the torch MPS backend was not so hard: I just replaced .cuda() with .to("mps"), added map_location='mps' to a couple of torch.load calls, and removed the 'with sdp_kernel' lines. But I hit a problem with the vocoder: MPS doesn't have real x complex GEMMs (some assert fires) and complex.out is not implemented for MPS, so I need a little bit of help here.
Here's the pull request I did as well. Want to work together on this? https://github.com/collabora/WhisperSpeech/pull/77 I'm not that familiar with github, but I think there's a way to work together on a pull request?
Did you get it working? I made more changes and still wasn't able to run the inference example notebook. BTW, all those .py files are generated from notebooks, so those need to be modified as well.
No, the pull request was simply to show an example of choosing between "cuda", "mps", or "cpu" based on the get_compute_device function within utils.py. I was hoping to get feedback on that approach in general (a function that dynamically determines the compute device) before modifying the other scripts. Basically, if the developer approves this approach, multiple other scripts will need to be modified to set the appropriate compute device dynamically.
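For reference, a minimal sketch of what such a helper in utils.py could look like (illustrative only, not necessarily the exact code in PR #77):

```python
import torch

def get_compute_device() -> str:
    # Prefer CUDA, then MPS, then fall back to the CPU.
    if torch.cuda.is_available():
        return "cuda"
    if getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
        return "mps"
    return "cpu"

device = get_compute_device()
```

The rest of the codebase would then call `.to(device)` instead of hardcoding `.cuda()` or `.to("mps")`.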
Also, we're now aware of the issue that you raised regarding the vocoder above. I was hoping to get the "go ahead" first, basically. If you want to work on this together, I'm assuming we'd work on the branch I created (the one the pull request came from)? Kind of new to GitHub...
@jpc What did you think of the draft pull request? Am I on the right track, and do you want me to work on modifying the other scripts as well?
Regarding Vocos and MPS, maybe it would be worth raising an issue on their GitHub and seeing what the author says? I was using this model as-is, so I am unfortunately not familiar with its internals.
If this does not help I can try looking into this next week.
The sdp_kernel calls are kind of important for performance on CUDA, so we'd have to figure out how to make them transparent for MPS. Maybe make a new context manager that wraps the one from PyTorch?
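One possible shape for such a wrapper (a sketch; the function name and flags are illustrative):

```python
from contextlib import contextmanager
import torch

@contextmanager
def sdp_kernel(device: str, **flags):
    # On CUDA, defer to PyTorch's own context manager so flash /
    # memory-efficient attention is still selected; elsewhere it is a no-op.
    if device == "cuda":
        with torch.backends.cuda.sdp_kernel(**flags):
            yield
    else:
        yield
```

Call sites could then use `with sdp_kernel(device, enable_flash=True): ...` regardless of backend.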
I'll do what I can on the draft pull request, but others will likely have to help since I don't have macOS to test on... I can at least get the overall framework there in terms of dynamically choosing the compute device across all scripts...
OK, I got it to work on a Mac but had to move the vocoder and encoder to the CPU. MPS lacks support for these: "The operator 'aten::complex.out' is not currently implemented for the MPS device" and "The operator 'aten::_fft_r2c' is not currently implemented for the MPS device".
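(Side note: for those "not currently implemented for the MPS device" errors, PyTorch's own error message suggests a temporary CPU-fallback switch; whether it helps in this case is untested.)

```python
import os
# Must be set before torch is imported; ops missing on MPS then fall back
# to the CPU instead of raising (slower, but it avoids the hard error).
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
import torch
```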
Excellent, so we've whittled it down. Can you send a screenshot of the error you get when you try to run it on MPS anyway? That way I can see what the error says and try to troubleshoot. But with my revised scripts (i.e. the draft pull request), MPS works for everything except the vocoder? Thanks.
I was able to find this: https://qqaatw.dev/pytorch-mps-ops-coverage/. I couldn't find fft_r2c on there, though.
Sorry, I didn't use your pull request, just some hacked-together code (which is quite similar but touches more places). I thought we needed to have something working first. Haven't you tried running on MPS yourself? I posted a couple of requests to https://github.com/pytorch/pytorch/issues/77764
Unfortunately I don't have an Apple computer...nor Linux for that matter. That's an extreme challenge when trying to write code that works with all three platforms for sure. I was able to find these links, however:
https://github.com/pytorch/pytorch/pull/116630 https://developer.apple.com/documentation/metal/metal_sample_code_library/customizing_a_pytorch_operation https://github.com/neuraloperator/neuraloperator
Not sure if they'll help.
My draft pull request has all the basic infrastructure there, though. I suppose we could modify it to exclude the vocoder from being loaded on MPS, but I'd like the repository owner to confirm what you've said so we know for certain, ya know?
I was thinking about writing to the Vocos author since I believe sometimes the offending operations can be changed to something a little bit different that works out of the box on MPS.
Do it! @akorzh do you have the script you used? Might help me troubleshoot.
@jpc A few possible workarounds if we can't find a way to get vocos working on MPS out of the box...
1) Manually implement the GEMMs or specific FFT operations using MPS primitives.
2) Decompose the unsupported operations into smaller supported operations.
3) A context manager (or wrapper) to automatically move operations to the CPU or MPS as appropriate, so that as much as possible still runs on MPS (see the sketch below).
4) Write custom kernels in the metal shading language and invoke them from python with PyObjC.
5) Evaluate how MPS Graph within Core ML might help.
6) Possibly use SYCL and DPC++ to write code that is portable across different GPU architectures, including potentially targeting Metal through an abstraction layer. They are primarily designed for CUDA and OpenCL but could potentially be adapted to generate MSL code that runs on MPS.
7) Using OpenCL/GL instead of MPS as a fallback rather than falling back to the CPU.
Thoughts anyone?
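To make option 3 above concrete, here is a rough sketch of such a wrapper, assuming the vocoder is an ordinary nn.Module (names are illustrative, not from the repo):

```python
import torch

class CPUFallback(torch.nn.Module):
    """Run a submodule on the CPU while the rest of the pipeline stays on MPS."""
    def __init__(self, module: torch.nn.Module, outer_device: str = "mps"):
        super().__init__()
        self.module = module.to("cpu")
        self.outer_device = outer_device

    def forward(self, *args, **kwargs):
        # Move tensor inputs to the CPU, run the wrapped module, move results back.
        args = [a.to("cpu") if torch.is_tensor(a) else a for a in args]
        kwargs = {k: (v.to("cpu") if torch.is_tensor(v) else v) for k, v in kwargs.items()}
        out = self.module(*args, **kwargs)
        return out.to(self.outer_device) if torch.is_tensor(out) else out
```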
Another option might be to use Vulkan. Llama.cpp just implemented a Vulkan backend (one version from GPT4All and another from another contributor; I forget his name). This would also allow GPU acceleration with AMD GPUs on Windows and, according to the following link, on macOS as well:
@jpc and @akorzh I think I may have found a solution: MLX for macOS? Here are the operations it supports:
Here's the website link:
https://ml-explore.github.io/mlx/build/html/python/fft.html https://github.com/ml-explore/mlx
Take it with a grain of salt, but here's what GPT-4 says... so there might already be an option optimized for Apple. I leave it to your expertise:
SEE ALSO HERE FOR MORE DETAIL:
GPT-4 says they're the same... I also ran the PyTorch description here through GPT:
https://pytorch.org/cppdocs/api/function_namespaceat_1aaea819b1367e99c6ef062ac8335edba2.html
Hey crew, I spent a few hours last night and today working on both CPU and MPS updates to this codebase. I also ran into the same results as @akorzh, except that I didn't get it to run. Instead, attempting to keep everything on the CPU, I ran into the "addmm_impl_cpu_" not implemented for 'Half' message inside the MultiHeadAttention.forward call. Perhaps it has to do with my environment running PyTorch version '2.1.1' at the time of testing.
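In case it helps others who hit the same thing: that 'Half' error typically means a float16 tensor reached a CPU matmul, so one hedged workaround is to keep CPU runs in float32 (place_model is a hypothetical helper, not code from the repo):

```python
import torch

def place_model(model: torch.nn.Module, device: str) -> torch.nn.Module:
    # CPU matmuls have no float16 kernel, which is what raises
    # "addmm_impl_cpu_" not implemented for 'Half'; keep CPU runs in
    # float32 and reserve half precision for CUDA/MPS.
    dtype = torch.float32 if device == "cpu" else torch.float16
    return model.to(device=device, dtype=dtype)
```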
I spent time with the [sdp_kernel](https://github.com/collabora/WhisperSpeech/blob/80b268b74900b2f7ca7a36a3c789607a3f4cd912/whisperspeech/s2a_delar_mup_wds_mlang.py#L500) line without a solution yet. To my understanding, PyTorch hasn't implemented Flash Attention for MPS, but there is an implementation at https://github.com/philipturner/metal-flash-attention.
Moving past that, I think if we can use functions from the Vulkan or MLX libraries, like @BBC-Esq pointed out, that would be best. I've not worked with these projects yet, so a lot is unfamiliar.
patch.txt: here is my patch, which works on Mac (runs on MPS, with the CPU for the rest).
@akorzh nice! any plans for a PR?
thanks @akorzh , can confirm those updates worked here too
> patch.txt: here is my patch, which works on Mac (runs on MPS, with the CPU for the rest).
@akorzh Below is a summary of the locations where you changed lines from cuda to mps. With your permission, I'd like to modify my pull request to add these changes, but make them dynamic. In other words, the new function in utils.py would determine the compute_device. If "cuda" is available, all of the locations would continue to use "cuda" dynamically. If "mps" is the available compute device, "mps" would be used everywhere except the locations that require the CPU; specifically:
a2wav.py
self.vocos = Vocos.from_pretrained(repo_id).cuda()
pipeline.py
run_opts={"device": "cuda"})
This would enable dynamically choosing the appropriate compute device for both CUDA and MPS... and we could also add CPU as an option for people if we want. It's my understanding that torch.set_default_device() doesn't accept "cpu" because CPU is already the default under PyTorch... Anyway, here's the outline of the lines I'd focus on:
benchmark.py
Original: - torch.cuda.synchronize()
Modified: + torch.mps.synchronize()
Original: - pipe.t2s.decoder.mask = torch.empty(t2s_ctx_n, t2s_ctx_n).fill_(-torch.inf).triu_(1).cuda()
Modified: + pipe.t2s.decoder.mask = torch.empty(t2s_ctx_n, t2s_ctx_n).fill_(-torch.inf).triu_(1).to("mps")
Original: - pipe.s2a.decoder.mask = torch.empty(s2a_ctx_n, s2a_ctx_n).fill_(-torch.inf).triu_(1).cuda()
Modified: + pipe.s2a.decoder.mask = torch.empty(s2a_ctx_n, s2a_ctx_n).fill_(-torch.inf).triu_(1).to("mps")
extract_acoustic.py
Original: - return _tform(x).cuda().unsqueeze(0)
Modified: + return _tform(x).to("mps").unsqueeze(0)
Original: - model.cuda().eval();
Modified: + model.to("mps").eval();
extract_spk_emb.py
Original: - run_opts={"device": "cuda"})
Modified: + run_opts={"device": "mps"})
extract_stoks.py
Original: - vq_model = vq_stoks.RQBottleneckTransformer.load_model(vq_model).cuda()
Modified: + vq_model = vq_stoks.RQBottleneckTransformer.load_model(vq_model).to("mps")
Original: - run_opts={"device": "cuda"})
Modified: + run_opts={"device": "mps"})
Original: - samples16k = samples16k.cuda().to(torch.float16)
Modified: + samples16k = samples16k.to("mps").to(torch.float16)
pipeline.py
Original: - self.t2s = TSARTransformer.load_model(**args).cuda()
Modified: + self.t2s = TSARTransformer.load_model(**args).to("mps")
Original: - self.s2a = SADelARTransformer.load_model(**args).cuda()
Modified: + self.s2a = SADelARTransformer.load_model(**args).to("mps")
prepare_s2a_atoks.py
Original: - csamples = samples.cuda().unsqueeze(1)
Modified: + csamples = samples.to("mps").unsqueeze(1)
prepare_t2s_txts.py
Original: - model_size, "cuda", compute_type="float16", language=lang,
Modified: + model_size, "mps", compute_type="float16", language=lang,
Original: - csamples = samples.cuda()
Modified: + csamples = samples.to("mps")
s2a_delar_mup_wds_mlang.py
Original: - self.register_buffer('val_true', torch.zeros(self.quantizers).cuda())
Modified: + self.register_buffer('val_true', torch.zeros(self.quantizers).to("mps"))
Original: - self.register_buffer('val_total', torch.zeros(self.quantizers).cuda())
Modified: + self.register_buffer('val_total', torch.zeros(self.quantizers).to("mps"))
Original: - spec = torch.load(local_filename)
Modified: + spec = torch.load(local_filename,map_location='mps')
t2s_up_wds_mlang_enclm.py
Original: - spec = torch.load(local_filename)
Modified: + spec = torch.load(local_filename,map_location='mps')
vad.py
Original: - vad_model = whisperx.vad.load_vad_model('cuda')
Modified: + vad_model = whisperx.vad.load_vad_model('mps')
vq_stoks.py
Original: - self.register_buffer('val_true', torch.zeros(1).cuda())
Modified: + self.register_buffer('val_true', torch.zeros(1).to("mps"))
Original: - self.register_buffer('val_total', torch.zeros(1).cuda())
Modified: + self.register_buffer('val_total', torch.zeros(1).to("mps"))
wh_transcribe.py
Original: - embs = whmodel.encoder(whisper.log_mel_spectrogram(samples).cuda())
Modified: + embs = whmodel.encoder(whisper.log_mel_spectrogram(samples).to("mps"))
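To illustrate what "making them dynamic" would mean for a few of the lines above (a sketch only; the real PR would thread the device through each module):

```python
import torch

# Pick the device once (mirrors the utils.py helper discussed earlier).
device = "cuda" if torch.cuda.is_available() else (
    "mps" if torch.backends.mps.is_available() else "cpu")

# benchmark.py-style mask, parameterized instead of hardcoding .cuda() or .to("mps")
mask = torch.empty(16, 16).fill_(-torch.inf).triu_(1).to(device)

# buffers created directly on the chosen device (s2a / vq_stoks style)
val_true = torch.zeros(4, device=device)

# checkpoint loads would pass map_location=device instead of a hardcoded 'mps', e.g.
# spec = torch.load(local_filename, map_location=device)
```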
Oh yeah, totally, I don't mind. Since I did this as a hack just to see whether it runs at all, I wasn't making it nice enough to be an MR, so you are welcome to use any of it for the good of the community.
@signalprime Just FYI, MLX currently shows some regression on M3 chips, but that's likely due to how new it is; it's constantly improving, so I would be shocked if it's not better than MPS on any and all Apple silicon in the very near future. Also, it requires a model to be converted to MLX, so... If we were to implement it, here's some info. We'll see how this exciting technology develops.
I marked the pull request as ready for review: https://github.com/collabora/WhisperSpeech/pull/77
If one or two people who tested MPS previously could test the pull request, that might help out. A speedy review from @jpc would help as well.
Also, I'm open to learning Jupyter notebooks as @jpc offered, but please don't make me redo this pull request using them... I'll try to use them in the future if it helps people on here. ;-)
I'm just seeing this, seems like it's been pushed already. I'll keep my eyes open in case I can help out
Since #89 (successor to #77) is merged, I'm going to close this issue now
Has this been tested? I just tried it and I had to change some parts to get it to launch. https://github.com/collabora/WhisperSpeech/pull/92
1) Fixed the whisperspeech.utils path when importing.
2) Moved webdataset into requirements, since the code now uses it directly to run core logic: https://github.com/collabora/WhisperSpeech/pull/92/files#diff-f247846b0ab6c196467cd7e8e41027c7f27eb2b74b2e9a33d54c59a6fe9f00b0R39 (otherwise it fails when installing and trying to run in an app).
3) Changed the line that calls the Vocoder's .to() method?
Even after these changes, it fails with a RuntimeError: Placeholder storage has not been allocated on MPS device! on an MPS device. Maybe I'm missing something.
For the record, I tested as if I would be using the package in a regular application: instead of installing whisperspeech from pip, I installed directly from the forked git repo.
Has anyone tested? Let me know if anyone got the current main branch to work on their MPS machine.
To clarify, did you get it to work, just like before the major pull request that I did (based on others' insights about models/tensors), after the modifications in your pull request... or was there still something lacking? Unfortunately, I don't have macOS to test things, which is why I asked for two testers... Glad someone did.
Do you have any log or print statements you can share, personal info redacted if you choose of course?
I never touched the codebase until today. I only checked back when I heard from someone that this is now implemented and merged, so I never got to try the code before. Even if I had, MPS wasn't working anyway, so I couldn't have tested.
Before the changes I made in my PR, the app that was using whisperspeech was failing whenever it tried to use the utils.py module.
Also, I feel like this is not supposed to work (unless you run this in a dev environment) unless we move webdataset from dev_requirements to requirements: https://github.com/collabora/WhisperSpeech/pull/92/files#diff-f247846b0ab6c196467cd7e8e41027c7f27eb2b74b2e9a33d54c59a6fe9f00b0R39 (It wasn't working, which is how I found out).
Here's the full error log (app.py is the app that uses whisperspeech):
$ source /Users/x/pinokio/api/whisperspeech/env/bin/activate /Users/x/pinokio/api/whisperspeech/env && python app.py
/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
[' This is the first demo of Whisper Speech, a fully open source text-to-speech model trained by Collabora and Lion on the Juwels supercomputer. '] ['en']
/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/backends/cuda/__init__.py:342: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see, torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.
warnings.warn(
Traceback (most recent call last):████████████████████████████████████████████████████████████████████| 100.00% [752/752 00:58<00:00]
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/gradio/queueing.py", line 495, in call_prediction
output = await route_utils.call_process_api(
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/gradio/route_utils.py", line 232, in call_process_api
output = await app.get_blocks().process_api(
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/gradio/blocks.py", line 1561, in process_api
result = await self.call_function(
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/gradio/blocks.py", line 1179, in call_function
prediction = await anyio.to_thread.run_sync(
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2134, in run_sync_in_worker_thread
return await future
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/gradio/utils.py", line 678, in wrapper
response = f(*args, **kwargs)
File "/Users/x/pinokio/api/whisperspeech/app.py", line 47, in whisper_speech_demo
audio = generate_audio(pipe, segments, speaker_audio, speaker_url, cps)
File "/Users/x/pinokio/api/whisperspeech/app.py", line 38, in generate_audio
audio = pipe.vocoder.decode(atoks)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/whisperspeech/a2wav.py", line 42, in decode
return self.vocos.decode(features, bandwidth_id=bandwidth_id)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/vocos/pretrained.py", line 112, in decode
x = self.backbone(features_input, **kwargs)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/vocos/models.py", line 82, in forward
x = self.norm(x.transpose(1, 2), cond_embedding_id=bandwidth_id)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/vocos/modules.py", line 82, in forward
scale = self.scale(cond_embedding_id)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 163, in forward
return F.embedding(
File "/Users/x/pinokio/api/whisperspeech/env/lib/python3.10/site-packages/torch/nn/functional.py", line 2264, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Placeholder storage has not been allocated on MPS device!
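For context, that final RuntimeError is what PyTorch raises on MPS when a module's weights and its input tensors are on different devices; here the Vocos embedding appears to receive a bandwidth_id index that is not on MPS. A minimal sketch of the failure and the fix, independent of WhisperSpeech:

```python
import torch

# Minimal reproduction of the failure mode on an MPS machine: the embedding
# weights live on MPS while the index tensor stays on the CPU.
emb = torch.nn.Embedding(4, 8).to("mps")
idx = torch.tensor([0])                   # on the CPU
# emb(idx)  # RuntimeError: Placeholder storage has not been allocated on MPS device!
out = emb(idx.to("mps"))                  # moving the input to MPS fixes it
```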
Thanks, I'll try to take a look and do some research this weekend, especially because this was primarily my pull request. In my own defense, however, I was basing it off what other people said about MPS support on certain devices... and additionally I don't have an Apple computer, so it's very difficult to troubleshoot since I can't test at all.
Do me a favor: run the prior codebase, before the recent pull request, and let me know if you get any errors with it. Again, I don't have macOS, but if I recall it's supposed to fall back to using the CPU for everything.
Can you also verify which version of PyTorch and other libraries you have pip installed? I'll do my best. Thanks!
@BBC-Esq wouldn't the prior code, before your PR, NOT have MPS support, and therefore not run on my Mac?
I've tried both the default Mac installation and the nightly one. Hope this helps.
pip3 install --pre torch torchvision torchaudio
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
> if I recall it's supposed to fall back to using the CPU for everything.
Wait, does this mean this is technically a "CPU support" and not "MPS support"? I did see mentions of MPS in the new code so I just assumed this made use of MPS. Doesn't it?
You are correct. I thought I had edited my message before clicking "Comment"... guess I didn't. Yes, the pre-pull-request version did not support MPS (or attempt to support it). You'd have to double-check the code to see if it even supported "cpu" when "cuda" wasn't available, because I only ever used "cuda", but I don't think CPU was even used as a fallback. Sorry for the confusion.
Which version of Python are you running? I will send you the link to the PyTorch wheels to pip install, just to make sure it's using the right one. I personally had issues recently with PyTorch's index of wheels not giving the right one, so they're on my naughty list...
Here is an example, but I'll find the correct wheels for torch, torchaudio, and torchvision for you:
> Wait, does this mean this is technically a "CPU support" and not "MPS support"? I did see mentions of MPS in the new code so I just assumed this made use of MPS. Doesn't it?
"mps" and "cpu" are different devices in pytorch terminology, and the pull request is supposed to support both compute devices based on a user's setup. If a user has "mps," it should use mps for everything except "vocoder" and "encoder," which should be apparent by looking at the pull request. If a user doesn't have "cuda" or "mps" everything should be placed on "cpu." Sorry for the mundane explanation but hope that clarifies...
MPS is much faster than the CPU, though far behind "cuda", but at least it's an improvement for macOS users...
Also, I noticed that WhisperSpeech automatically installs the three PyTorch libraries through one of its dependencies, "SpeechBrain", which, on my system (again, I use CUDA), installed the CPU version. I had to manually uninstall torch, torchvision, and torchaudio and run the proper pip install commands.
In theory, pip should overwrite the older version when you explicitly pip install another version, but after I get you the specific wheels, please pip uninstall those three libraries first... then pip install the three wheels I give you. Then I can help you as much as I can... At least we will have a baseline for any troubleshooting steps involving print statements and the like...
Thanks again!
> Also, I noticed that WhisperSpeech automatically installs the three PyTorch libraries through one of its dependencies, "SpeechBrain", which, on my system (again, I use CUDA), installed the CPU version. I had to manually uninstall torch, torchvision, and torchaudio and run the proper pip install commands.
Yes, I do exactly that, and have confirmed the correct versions are installed. Also, I install into a venv so everything is isolated; the venv is Python 3.10.
I know I might be missing something since I didn't read through all the code, but just based on the error message, doesn't it look like it's not from the torch install but from the code not explicitly applying MPS somewhere? I am very familiar with MPS errors that come from a torch version mismatch, but I've never seen this kind of error come from a version mismatch.
Also, one important clarification. Can you take a look at this line https://github.com/collabora/WhisperSpeech/pull/89/files#diff-ba9e2bb34cdd77f3f053d7980195bbddbbb742c3c9311ccc5427ccaaf4fc785aR72 and let me know if this change is wrong? I am asking because I am running this codebase right now, and without this change it won't even run (it says the Vocoder doesn't have a .to() method). I am going to assume you had a reason to add that to the code, and since I am operating with code that doesn't have that, maybe that's causing the problem.
I will look at that specific .to() issue next, but for my own sanity, can you please try pip uninstalling all three and then pip installing these three wheels first? That way I can rule it out in my own mind at least... helps me...
pip install https://download.pytorch.org/whl/cpu/torch-2.1.2-cp310-none-macosx_10_9_x86_64.whl#sha256=d9b535cad0df3d13997dbe8bd68ac33e0e3ae5377639c9881948e40794a61403
pip install https://download.pytorch.org/whl/cpu/torchaudio-2.1.2-cp310-cp310-macosx_10_13_x86_64.whl#sha256=06f8c02814e6cdd78626bbf44ad2bb8afa5b39ab650c6af18328a32311461058
pip install https://download.pytorch.org/whl/cpu/torchvision-0.16.2-cp310-cp310-macosx_10_13_x86_64.whl#sha256=bc86f2800cb2c0c1a09c581409cdd6bff66e62f103dc83fc63f73346264c3756
Also, we'll be testing PyTorch 2.1.2, not the latest 2.2.2, i.e. the version that I've tested on my system and that I believe another macOS user has...
Maybe @akorzh can chime in, but if he doesn't: he reportedly got it working on MPS, with the things that needed the CPU moved to the CPU instead of MPS... The goal of my pull request was merely to reflect the changes he made in his modification, the only difference being to "dynamically" choose the appropriate device based on a user's system, so... I feel we're close.
Hey team, I manually applied the changes and confirmed it works; however, not all operations are performed on the "mps" device due to a lack of compatibility with torch. I've tested it with torch version '2.1.1'.
I will review the PR in discussion and edit this message afterwards. EDIT: there were numerous small updates required, therefore I created a new PR with the updates, along with the appropriate credits to @akorzh and @BBC-Esq for their valuable contributions.
@jpc I think it's safe to close this now? If we decide to revisit the macOS issue later, I'm assuming we can open another issue regarding MLX or Vulkan or whatnot...
Hi, do you know if CPU and MPS support is on the roadmap? Thanks!