open-mmlab / PIA

[CVPR 2024] PIA, your Personalized Image Animator. Animate your images by text prompt, combined with Dreambooth, achieving stunning videos.
https://pi-animator.github.io/
Apache License 2.0

Here is how to run this on Mac with Apple Silicon #21

Open YAY-3M-TA3 opened 6 months ago

YAY-3M-TA3 commented 6 months ago

Here is how to run this on Mac with Apple Silicon

In a terminal window:

git clone https://github.com/open-mmlab/PIA.git PIA-cpu
cd PIA-cpu
conda create -n pia-cpu python=3.10.13
conda activate pia-cpu

Now install these specific nightly builds of PyTorch, torchvision, and torchaudio, which enable conv3d on Apple Silicon:

pip install https://download.pytorch.org/whl/nightly/cpu/torch-2.3.0.dev20231216-cp310-none-macosx_11_0_arm64.whl
pip install https://download.pytorch.org/whl/nightly/cpu/torchvision-0.18.0.dev20231216-cp310-cp310-macosx_11_0_arm64.whl https://download.pytorch.org/whl/nightly/cpu/torchaudio-2.2.0.dev20231116-cp310-cp310-macosx_11_0_arm64.whl --no-deps
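Before continuing, you can sanity-check the wheels with a short Python snippet (the exact dev version string may differ):

import torch
import torch.nn.functional as F

print(torch.__version__)                 # expect a 2.3.0.dev nightly build
x = torch.randn(1, 3, 8, 32, 32)         # (N, C, D, H, W)
w = torch.randn(4, 3, 3, 3, 3)           # out_ch=4, in_ch=3, 3x3x3 kernel
print(F.conv3d(x, w).shape)              # torch.Size([1, 4, 6, 30, 30])
print(torch.backends.mps.is_available()) # True on Apple Silicon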

For the dependencies, create a new text file called requirements.txt and copy the following into it:

ninja
opencv-python
diffusers==0.24.0
transformers==4.25.1
accelerate
moviepy
imageio==2.27.0
eva-decord
gdown
einops
omegaconf
safetensors
gradio
wandb

Now install the requirements by typing: pip install -r requirements.txt

Next, open the merges.txt link in a browser and save the merges.txt file into the PIA-cpu folder.

Next, you need to do some code modification: in PIA-cpu/animatediff/pipelines/i2v_pipeline.py, change lines 270-273 to the following:

# force CPU execution and full float32 precision
device = torch.device('cpu')
unet_dtype = torch.float32
tenc_dtype = torch.float32
vae_dtype = torch.float32

Finally, you need to make a change to an installed module. To find the environment, type: conda env list, then look for the path listed next to pia-cpu, for example:

pia-cpu    /Users/name/miniforge3/envs/pia-cpu

The file to modify, under that environment path, is lib/python3.10/site-packages/transformers/models/clip/tokenization_clip.py

Add this line at line 303 of tokenization_clip.py and save: merges_file = "merges.txt"
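For context, this override forces the local merges path just before the tokenizer opens the file (a sketch based on transformers 4.25.1; the surrounding code and line numbers may differ in other versions):

# inside CLIPTokenizer.__init__ in tokenization_clip.py
merges_file = "merges.txt"  # added: force the locally saved merges.txt
with open(merges_file, encoding="utf-8") as merges_handle:
    bpe_merges = merges_handle.read().strip().split("\n")[1 : 49152 - 256 - 2 + 1]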

Now, to run the PIA Gradio demo, go to the PIA-cpu folder and run: python app.py

NOTE: This will run on the Mac CPU only (the Lighthouse example takes 42 minutes on a Mac M2 with 24 GB).

I tried to move everything to MPS, but it only rendered a black video... I don't know why.

LeoXing1996 commented 6 months ago

Hey @YAY-3M-TA3, when performing inference with mps, what precision did you use? torch.float16 or torch.float32?

YAY-3M-TA3 commented 6 months ago

Hey @YAY-3M-TA3, when performing inference with mps, what precision did you use? torch.float16 or torch.float32?

To fit in 24 GB, I used torch.float16. I was finally able to solve the black-video problem by upcasting the key to float32 in the motion module's attention function.
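Roughly, the change looks like this (an illustrative sketch, not PIA's exact code; the real function lives in the motion module and these names are hypothetical):

import torch

def _attention(query, key, value, scale):
    # On MPS with float16, the q @ k^T matmul can overflow to NaN, which
    # shows up downstream as an all-black video. Computing the scores in
    # float32 and casting back afterwards avoids it.
    scores = torch.baddbmm(
        torch.empty(query.shape[0], query.shape[1], key.shape[1],
                    dtype=torch.float32, device=query.device),
        query.float(),
        key.float().transpose(-1, -2),
        beta=0,
        alpha=scale,
    )
    probs = scores.softmax(dim=-1).to(value.dtype)  # back to fp16
    return torch.bmm(probs, value)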

So, now I have everything running on MPS.

I can get images at 480x480 and below processing at 7-17 seconds per frame (25 steps, 16-frame animation)... animations complete in 2 to 8 minutes.

My problem now is that I can't process 512x512 images as fast (they take as long as CPU processing because everything doesn't fit in memory, so it swaps). Ideally I want this size, since the model was trained on it.

So, now I'm looking for things I can optimize, memory-wise...

LeoXing1996 commented 6 months ago

Referring to https://huggingface.co/docs/diffusers/optimization/mps, maybe you can use attention slicing in your inference.
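In diffusers this is a one-liner on the pipeline object (assuming the PIA pipeline exposes DiffusionPipeline's standard method):

pipeline.enable_attention_slicing()  # compute attention in slices to cut peak memory
# a smaller slice size saves more memory at some speed cost:
# pipeline.enable_attention_slicing(slice_size=1)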

scottonly2 commented 1 month ago

How do you solve the decord dependency?