Closed: nitishalodia closed this issue 1 year ago
Can you please provide a full stack trace as a snippet here? (preferably formatted with backticks so I can read it).
Also any other information about what function you called, etc.
Pinging this issue since it's gone stale.
```
$ PYTORCH_ENABLE_MPS_FALLBACK=1 python3 makeVid.py
Fetching 16 files: 100%|█████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 12091.69it/s]
Traceback (most recent call last):
  File "/Users/caseker/Projects/Code/makeVid.py", line 10, in <module>
    video_path = pipeline.walk(
  File "/Users/caseker/Projects/Code/stable-diffusion-videos/stable_diffusion_videos/stable_diffusion_pipeline.py", line 840, in walk
    self.make_clip_frames(
  File "/Users/caseker/Projects/Code/stable-diffusion-videos/stable_diffusion_videos/stable_diffusion_pipeline.py", line 623, in make_clip_frames
    for _, embeds_batch, noise_batch in batch_generator:
  File "/Users/caseker/Projects/Code/stable-diffusion-videos/stable_diffusion_videos/stable_diffusion_pipeline.py", line 562, in generate_inputs
    embeds = torch.lerp(embeds_a, embeds_b, t)
RuntimeError: "lerp_kernel_scalar" not implemented for 'Half'
```
The same thing happens when I don't use `PYTORCH_ENABLE_MPS_FALLBACK=1`.
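For context on what's failing: `torch.lerp(embeds_a, embeds_b, t)` is just a linear interpolation between the two prompt embeddings, and the error says the kernel for it isn't implemented for half-precision (`Half`) tensors on this backend. A minimal plain-Python sketch of the same math (a hypothetical helper, not from the library) shows how simple the operation is:

```python
def lerp(a, b, t):
    """Linear interpolation: a + t * (b - a), same math as torch.lerp."""
    return a + t * (b - a)

# a quarter of the way from 0.0 to 10.0
print(lerp(0.0, 10.0, 0.25))  # 2.5
```

The operation itself is trivial; the failure is purely about which dtypes the backend has a kernel for.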
Same issue on the latest version of the code? `pip install --upgrade stable-diffusion-videos`
It still happens, although the error is a bit prettier now. I'm thinking torch on m1 is missing a few screws.
hmm, I wonder if #144 fixes this issue...
You could try it yourself:

```
pip install git+https://github.com/seriousran/stable-diffusion-videos
```
That didn't seem to do anything, BUT I got it to work by using `torch_dtype=torch.float32` in my test code -- so the error really is about half precision. That does make it hover around 13+ GB of RAM usage, but it works! I'm sure float16 will be implemented in torch's MPS backend soon enough. Also, here's the setup I'm using:
```python
from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float32,
    revision="fp16",
).to("mps")

video_path = pipeline.walk(
    prompts=['album cover with no title with colorful smoke and trippy visuals', 'album cover with no title with colorful smoke and trippy visuals'],
    seeds=[737398, 13],
    num_interpolation_steps=120,
    height=512,              # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,               # use multiples of 64 if > 512. Multiples of 8 if < 512.
    output_dir='dreams',     # Where images/videos will be saved
    name='musicVid',         # Subdirectory of output_dir where images/videos will be saved
    guidance_scale=8.0,      # Higher adheres to prompt more, lower lets model take the wheel
    num_inference_steps=55,  # Number of diffusion steps per image generated. 50 is good default
)
```
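The ~2x jump in RAM makes sense: float32 stores each element in 4 bytes instead of float16's 2, trading memory for a wider, more precise format. A quick stdlib-only illustration (no torch needed) of what half precision costs you, using `struct`'s IEEE 754 half-precision format code `'e'`:

```python
import struct

def roundtrip_half(x):
    """Pack a Python float into IEEE 754 half precision and unpack it back."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

print(roundtrip_half(1.0))  # 1.0 -- exactly representable in float16
print(roundtrip_half(0.1))  # ~0.09998 -- precision loss vs float32/float64
```

So running in float32 is a legitimate workaround: same math, more memory, no missing-kernel problem.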
Ahh, I didn't even think about that -- it definitely won't work with half precision. I didn't realize you were using `torch.float16` before. So this is resolved, eh? I think this is the expected behavior on MPS.
We should probably add some info to the readme about this (will include when doing #146).
It's as resolved as it can be, anyway! A real solution will have to come from Apple at this point, I think. It's alright though, I think adding it to the readme should be good.
Using this on a Mac M1 machine and getting this error: `RuntimeError: "lerp_kernel_scalar" not implemented for 'Half'`