I think the above error was because of a missing `.to("cuda")` statement. Note that `torch.compile` only works on CUDA. But if I add a `.to("cuda")` statement, I get a new error:
```
    getattr(self, inst.opname)(inst)
  File "/home/patrick/hf/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 342, in wrapper
    return inner_fn(self, inst)
  File "/home/patrick/hf/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1014, in CALL_FUNCTION_KW
    self.call_function(fn, args, kwargs)
  File "/home/patrick/hf/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 474, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/patrick/hf/lib/python3.10/site-packages/torch/_dynamo/variables/misc.py", line 744, in call_function
    return self.obj.call_method(tx, self.name, args, kwargs).add_options(self)
  File "/home/patrick/hf/lib/python3.10/site-packages/torch/_dynamo/variables/tensor.py", line 424, in call_method
    return wrap_fx_proxy(
  File "/home/patrick/hf/lib/python3.10/site-packages/torch/_dynamo/variables/builder.py", line 754, in wrap_fx_proxy
    return wrap_fx_proxy_cls(
  File "/home/patrick/hf/lib/python3.10/site-packages/torch/_dynamo/variables/builder.py", line 789, in wrap_fx_proxy_cls
    example_value = get_fake_value(proxy.node, tx)
  File "/home/patrick/hf/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 1168, in get_fake_value
    unimplemented(f"dynamic shape operator: {cause.func}")
  File "/home/patrick/hf/lib/python3.10/site-packages/torch/_dynamo/exc.py", line 71, in unimplemented
    raise Unsupported(msg)
torch._dynamo.exc.Unsupported: dynamic shape operator: aten.repeat_interleave.Tensor

from user code:
  File "/home/patrick/python_bin/diffusers/models/unet_3d_condition.py", line 521, in forward
    emb = emb.repeat_interleave(repeats=num_frames, dim=0)

Set torch._dynamo.config.verbose=True for more information

You can suppress this exception and fall back to eager by setting:
    torch._dynamo.config.suppress_errors = True
```
The error can be reproduced by running:
```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained("cerspense/zeroscope_v2_576w", torch_dtype=torch.float16)
pipe.to("cuda")
pipe.enable_vae_slicing()
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

prompt = "Darth Vader is surfing on waves"
video_frames = pipe(prompt, num_inference_steps=40, height=320, width=576, num_frames=36).frames
video_path = export_to_video(video_frames, output_video_path="/home/patrick/videos/video_576_darth_vader_36.mp4")
```
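For context, Dynamo flags `aten.repeat_interleave.Tensor` as a dynamic-shape operator because it cannot prove the output size at trace time. A minimal sketch of a statically shaped equivalent, assuming `num_frames` is available as a plain Python int when tracing (an assumption; this is not necessarily the fix that landed in the linked PR):

```python
import torch

emb = torch.randn(2, 1280)  # stand-in for the conditioning embeddings
num_frames = 16             # assumed: a plain Python int known at trace time

# emb.repeat_interleave(repeats=num_frames, dim=0) repeats each row
# num_frames times consecutively; expand + reshape produces the same
# result with shapes that torch.compile can infer statically.
emb_static = emb[:, None, :].expand(-1, num_frames, -1).reshape(-1, emb.shape[-1])

assert torch.equal(emb_static, emb.repeat_interleave(repeats=num_frames, dim=0))
```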
I'm currently a bit busy with other things. @sayakpaul, do you have some time to look into it by any chance?
There seems to be an existing problem with `repeat_interleave()`, which might have been fixed in the nightlies. Currently trying that out.
@patrickvonplaten let's jam here: https://github.com/huggingface/diffusers/pull/3949.
Hi! I'm trying to use `torch.compile` with the model "damo-vilab/text-to-video-ms-1.7b" (https://huggingface.co/docs/diffusers/api/pipelines/text_to_video), but generation takes a very long time. Testing on a 3090 with `num_inference_steps=25`, generating a video takes about 10s without `torch.compile` but more than 100s with it. What could be the issue, and can you help fix it? Thanks
```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained("damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16, variant="fp16")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
pipe.enable_model_cpu_offload()
pipe.enable_vae_slicing()

prompt = "Darth Vader surfing a wave"
video_frames = pipe(prompt, num_inference_steps=25).frames
video_path = export_to_video(video_frames)
video_path
```
Hi @hnnam0906! The network is compiled at the first inference. Don't count the first inference; evaluate the subsequent ones. See here.
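To make the comparison concrete, here is a minimal timing sketch, assuming the `pipe` and `prompt` from the snippet above: the first call pays the compilation cost, so only later calls reflect steady-state latency.

```python
import time
import torch

# Warm-up: the first call triggers compilation and is expected to be slow.
_ = pipe(prompt, num_inference_steps=25).frames

# Subsequent calls reuse the compiled graphs; time those instead.
torch.cuda.synchronize()
start = time.perf_counter()
video_frames = pipe(prompt, num_inference_steps=25).frames
torch.cuda.synchronize()
print(f"steady-state latency: {time.perf_counter() - start:.1f}s")
```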
@standardAI: Thanks for your info.
Describe the bug
Trying to use `torch.compile` on a text-to-video model doesn't work. If I try to follow the docs and do a `pipe.unet.to(memory_format=torch.channels_last)`, I get an error. If I don't use the `torch.channels_last` format and compile directly, I also get an error (full tracebacks in the logs below).
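For reference, a sketch of the two variants described; the model id and compile options are assumed from the repro snippet earlier in the thread:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("cerspense/zeroscope_v2_576w", torch_dtype=torch.float16)
pipe.to("cuda")

# Variant 1: follow the docs and switch the UNet to channels_last first.
pipe.unet.to(memory_format=torch.channels_last)
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

# Variant 2: skip channels_last and compile directly.
# pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```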
Reproduction
Logs
Full traceback for `pipe.unet.to(memory_format=torch.channels_last)`:
Full traceback for direct `torch.compile` (without `torch.channels_last`):
System Info
diffusers==0.17.1
Who can help?
@patrickvonplaten