aedocw / epub2tts

Turn an epub or text file into an audiobook
Apache License 2.0

Add initial support for Metal Performance Shaders backend. #181

Open secYOUre opened 6 months ago

secYOUre commented 6 months ago

It works generally well on Apple Silicon. However, there are PyTorch operators, such as aten::_weight_norm_interface, which are not currently implemented for the MPS device. As a temporary fix, you can set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 to fall back to the CPU for these ops.
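As a quick sketch of the workaround, the fallback can be enabled from the shell before launching epub2tts (the exact epub2tts invocation depends on your setup and is not shown here):

```shell
# Enable the CPU fallback for ops not yet implemented on MPS.
# Export it before Python starts so PyTorch sees it at import time.
export PYTORCH_ENABLE_MPS_FALLBACK=1

# Confirm the variable is visible to child processes:
python3 -c 'import os; print(os.environ["PYTORCH_ENABLE_MPS_FALLBACK"])'  # prints 1
```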

Similarly, when the XTTS engine is selected, MPS does not compose nicely with deepspeed, because features such as redirects (i.e., torch.distributed.elastic.multiprocessing.redirects) are not implemented on the CPU backend for Windows and macOS at the time of writing. Where needed, run with the --no-deepspeed option at the command line.

Everything else stays the same.

Committer: Alfonso De Gregorio adg@secYOUre.com

aedocw commented 6 months ago

This is really interesting, thanks for making this PR. I have only had a few minutes to play with it, and still need to figure out exactly which scenarios require PYTORCH_ENABLE_MPS_FALLBACK=1. Also, it should probably disable deepspeed when using MPS (maybe line 391 becomes if self.no_deepspeed or self.device == "mps":).
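The suggested guard could look something like the sketch below; the class and attribute names here are illustrative, not epub2tts's actual code:

```python
# Minimal sketch of the suggested condition: disable deepspeed when the
# user asked for it, or when running on the MPS backend, where deepspeed
# is not supported at the time of writing.
class EpubToAudiobook:
    def __init__(self, no_deepspeed=False, device="cpu"):
        self.no_deepspeed = no_deepspeed
        self.device = device

    def deepspeed_enabled(self):
        return not (self.no_deepspeed or self.device == "mps")
```

With a guard like this, selecting the MPS device implicitly disables deepspeed, so users would not need to know about --no-deepspeed at all.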

Is there any way to set PYTORCH_ENABLE_MPS_FALLBACK=1 proactively, without the user having to know to do that? Otherwise this could break things for folks currently using Apple Silicon who don't know they need to set the env var.

I'm definitely interested in figuring out how to merge this work though, thanks again!

secYOUre commented 6 months ago

Thanks, Christopher, for epub2tts! Nowadays the quality of TTS is so great that I am using epub2tts to go through the long tail of publications on my bookshelf for which an audiobook edition was never released. It sounds amazing, literally!

As for setting the environment variable proactively: absolutely. After importing the os module with import os, something like os.environ['PYTORCH_ENABLE_MPS_FALLBACK'] = '1' will do the job.
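A minimal sketch of setting the variable from inside the script. Note the ordering caveat: PyTorch reads the variable during initialization, so this assignment has to run before torch is imported.

```python
import os

# Set the CPU fallback before `import torch`; PyTorch reads the variable
# during initialization, so setting it after torch loads may be too late.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# import torch  # only import torch after the variable is in place
print(os.environ["PYTORCH_ENABLE_MPS_FALLBACK"])  # prints 1
```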

With regard to the composability of PyTorch and Deepspeed, and the resulting conditional branches in epub2tts: today, disabling deepspeed when running on the MPS backend is necessary. As you correctly said, this can be done in the code, saving the user the complexity of invoking the command line in the right way.

Looking ahead, however, I expect further progress from Deepspeed in this space, and as a result its composability with PyTorch will improve on backends that are not fully supported today. Hence it is probably worth monitoring the main epub2tts dependencies, so as to adjust how their code is called. A command-line option, conversely, puts a greater burden on the user, but keeps the script ready for the day when PyTorch and Deepspeed compose more fully even on MPS.

If there is anything I can do to help you with this, please don't hesitate to let me know. Best!

aedocw commented 6 months ago

From my brief testing, I think the environment variable needs to be set before Python loads PyTorch. It seems like setting it in the script doesn't work:

        elif torch.backends.mps.is_available():
            self.device = "mps"
            os.environ['PYTORCH_ENABLE_MPS_FALLBACK']='1'

More importantly though, have you seen any advantage to using MPS? I ran some relatively short tests and it was not any faster than running without it (M1 MacBook Pro).

I am definitely interested in adding support for MPS once it works fully and offers speed improvements. So far it doesn't seem to, but if there is a use case where it makes a difference (maybe it's faster over a longer run?) please let me know. Otherwise I think it makes sense to keep tracking torch/MPS progress and add this as a feature when all the needed operators are implemented.