Running with MPS as device (for Apple M1 chips)

Hi!

I've been trying to set this up on a MacBook with an M1 chip. However, simply replacing .cuda() with .to(device), where device is set to "mps", does not work.

As far as I can tell, the issue comes down to the use of torch.cuda.amp (the gradient scaler and autocast) in trainer.py. I tried setting enabled=False by hand, but that does not work either. I don't know the code well enough to work around it. It does work if I set the device to "cpu", though.
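To make the question concrete, here is a minimal sketch of the pattern I have in mind: pick the device at runtime and only turn on the torch.cuda.amp pieces when the device is actually CUDA. This is not the code in trainer.py; the helper names pick_device and amp_helpers are hypothetical, and the tiny linear model is only a stand-in so the snippet runs on its own. I am assuming a PyTorch version recent enough to have torch.autocast and torch.backends.mps.

```python
import contextlib

import torch


def pick_device() -> torch.device:
    """Prefer CUDA, then MPS (Apple silicon), then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.backends.mps is missing on older PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


def amp_helpers(device: torch.device):
    """Return (scaler, autocast) that are no-ops unless the device is CUDA.

    Hypothetical helper: mixed precision is simply switched off on MPS and CPU.
    """
    use_amp = device.type == "cuda"
    # A disabled GradScaler passes the loss through and calls optimizer.step() directly.
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

    def autocast():
        if use_amp:
            return torch.autocast(device_type="cuda")
        return contextlib.nullcontext()

    return scaler, autocast


device = pick_device()
scaler, autocast = amp_helpers(device)

# Stand-in model and data, only here to exercise one training step.
model = torch.nn.Linear(4, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(8, 4, device=device)
y = torch.randn(8, 1, device=device)

with autocast():
    loss = torch.nn.functional.mse_loss(model(x), y)

optimizer.zero_grad()
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```

With this shape the training loop stays the same on every device, and full precision is used whenever CUDA is unavailable. Whether the rest of the repository runs on MPS after such a change is something I have not been able to verify.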

Any pointers on how to do it?

Thanks for the hard work on this implementation. I'm looking forward to getting it running on my M1 MacBook.

lucidrains/DALLE2-pytorch, issue #136