nerdyrodent / VQGAN-CLIP

Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Metal Performance Shaders (MPS) Support #158

Open researcx opened 1 year ago

researcx commented 1 year ago

https://pytorch.org/docs/stable/notes/mps.html

#47702

Install the latest PyTorch with Metal Performance Shaders (MPS) support:

MPS support is included in the stable release, so you probably already have it.

Stable:

pip install torch torchvision torchaudio -f https://download.pytorch.org/whl/torch_stable.html

Nightly:

pip install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html

(Source: https://pytorch.org/tutorials/prototype/ios_gpu_workflow.html)

If for some reason you need to remove them, run pip uninstall torch torchvision torchaudio.

Verifying the existence of MPS support in PyTorch:

(vqgan) sysadmin@codekitty VQGAN-CLIP % python
Python 3.9.12 (main, Jun  1 2022, 06:34:44)
[Clang 12.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.has_mps
True
>>> torch.backends.mps.is_available()
True
>>> torch.backends.mps.is_built()
True

In generate.py, remove or comment out the following check, which forces the device back to the CPU whenever CUDA is unavailable:

if not args.cuda_device == 'cpu' and not torch.cuda.is_available():
    args.cuda_device = 'cpu'
    args.video_fps = 0
    print("Warning: No GPU found! Using the CPU instead. The iterations will be slow.#")
    print("Perhaps CUDA/ROCm or the right pytorch version is not properly installed?")

(for testing purposes only)
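Rather than deleting the check outright, one option would be to extend it so that mps counts as a valid device. The following is only a sketch under that assumption: pick_device and detect_device are hypothetical names that do not exist in generate.py, and the availability flags are passed in explicitly so the decision logic can be exercised without a GPU.

```python
def pick_device(requested, cuda_ok, mps_ok):
    """Return the device string to use, falling back to 'cpu' when the
    requested backend is not available. (Hypothetical helper.)"""
    if requested.startswith("cuda") and cuda_ok:
        return requested
    if requested == "mps" and mps_ok:
        return requested
    return "cpu"


def detect_device(requested):
    """Query PyTorch for backend availability, then defer to pick_device."""
    import torch  # imported lazily so pick_device stays usable without PyTorch

    # torch.backends.mps is absent on older PyTorch builds, hence the getattr.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device(requested, torch.cuda.is_available(), mps_ok)
```

In generate.py this could be used as args.cuda_device = detect_device(args.cuda_device) in place of the removed block, keeping the existing warning for the case where the result is 'cpu'.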

For future reference, you can check for MPS availability, and find out why it is missing, with:

if not torch.backends.mps.is_available():
    if not torch.backends.mps.is_built():
        print("MPS not available because the current PyTorch install was not "
              "built with MPS enabled.")
    else:
        print("MPS not available because the current MacOS version is not 12.3+ "
              "and/or you do not have an MPS-enabled device on this machine.")

Getting random.sh working on Mac:

The script errors out because shuf is missing. We can get shuf by installing coreutils:

brew install coreutils
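If installing coreutils is not an option, the random picking that shuf does can be approximated in Python. This is a sketch only, assuming the script just needs a few random lines without repetition; the shuf function below is a hypothetical stand-in and is not part of the repository.

```python
import random


def shuf(lines, n=None, rng=random):
    """Return up to n randomly chosen lines without repetition,
    roughly like `shuf -n N`. With n=None, shuffle all lines."""
    lines = list(lines)
    count = len(lines) if n is None else min(n, len(lines))
    return rng.sample(lines, count)
```

For example, shuf(["oil painting", "watercolour", "sketch"], 1) returns a list with one randomly chosen entry.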

Running:

python generate.py -p "A painting of an apple in a fruit bowl" -cd mps

Attempt 1:

Output:

(vqgan) sysadmin@codekitty VQGAN-CLIP % python generate.py --cuda_device mps -p "A painting of an apple in a fruit bowl"
Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips/vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Restored from checkpoints/vqgan_imagenet_f16_16384.ckpt
Traceback (most recent call last):
  File "/Users/sysadmin/Development/VQGAN-CLIP/generate.py", line 625, in <module>
    embed = perceptor.encode_text(clip.tokenize(txt).to(device)).float()
  File "/Users/sysadmin/Development/VQGAN-CLIP/CLIP/clip/model.py", line 354, in encode_text
    x = x[torch.arange(x.shape[0]), text.argmax(dim=-1)] @ self.text_projection
NotImplementedError: The operator 'aten::index.Tensor' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
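To find out quickly whether a given PyTorch build still lacks this operator, the indexing pattern from the traceback can be probed in isolation before starting a full run. This is a rough sketch: op_is_supported and clip_style_indexing are hypothetical helpers, and the tensor shapes are made up for illustration rather than taken from CLIP.

```python
def op_is_supported(op):
    """Call op() once and report False if the backend raises
    NotImplementedError, as in the traceback above."""
    try:
        op()
    except NotImplementedError:
        return False
    return True


def clip_style_indexing(device):
    """Build a callable that mimics the indexing in CLIP's encode_text."""
    import torch  # lazy import so op_is_supported works without PyTorch

    def op():
        x = torch.zeros(2, 5, 8, device=device)
        text = torch.tensor([[1, 9, 3, 0, 0], [4, 2, 7, 8, 0]], device=device)
        # Same pattern as x[torch.arange(x.shape[0]), text.argmax(dim=-1)]
        return x[torch.arange(x.shape[0]), text.argmax(dim=-1)]

    return op
```

Calling op_is_supported(clip_style_indexing("mps")) then gives a yes/no answer. Note that with PYTORCH_ENABLE_MPS_FALLBACK=1 set, the op silently runs on the CPU instead of raising, so the probe would report True in that case.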

Attempt 2:

Running with:

export PYTORCH_ENABLE_MPS_FALLBACK=1
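The same variable can also be set from Python instead of the shell. A minimal sketch, assuming (as appears to be the case) that PyTorch reads the variable when it is first loaded, so it has to be set before import torch; enable_mps_fallback is a hypothetical helper name.

```python
import os


def enable_mps_fallback(env=os.environ):
    """Request CPU fallback for ops missing on MPS unless the user has
    already set the variable; returns the value now in effect."""
    return env.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


enable_mps_fallback()
# import torch  # must come after the variable is set
```

Using setdefault means an explicit PYTORCH_ENABLE_MPS_FALLBACK=0 from the shell is left untouched.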

Output:

(vqgan) sysadmin@codekitty VQGAN-CLIP % python generate.py --cuda_device mps -p "A painting of an apple in a fruit bowl"
Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips/vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Restored from checkpoints/vqgan_imagenet_f16_16384.ckpt
Using device: mps
Optimising using: Adam
Using text prompts: ['A painting of an apple in a fruit bowl']
Using seed: 4061991638698407707
0it [00:00, ?it/s]-:27:11: error: invalid input tensor shapes, indices shape and updates shape must be equal
-:27:11: note: see current operation: %25 = "mps.scatter_along_axis"(%23, %arg5, %24, %1) {mode = 6 : i32} : (tensor<3311616xf32>, tensor<224xf32>, tensor<1103872xi32>, tensor<i32>) -> tensor<3311616xf32>
/AppleInternal/Library/BuildRoots/20d6c351-ee94-11ec-bcaf-7247572f23b4/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphExecutable.mm:1267: failed assertion `Error: MLIR pass manager failed'
zsh: abort      python generate.py --cuda_device mps -p
(vqgan) sysadmin@codekitty VQGAN-CLIP % /Users/sysadmin/Development/miniconda3/envs/vqgan/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '