Open lsabrinax opened 1 year ago
I believe Imagen uses just convnets for its unet, not a transformer like stable diffusion does. So in that respect, it can't be used like I use it for stable diffusion here. However, if the underlying network has self attention modules or uses a transformer in some way, then it's possible to use it. Unsure how (or if) that would apply to Imagen, though.
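To make the "needs self-attention" point concrete, here is a toy sketch of the token-merging idea (not tomesd's actual implementation, which uses bipartite soft matching on batched tensors): pair up the most similar tokens and average them, so downstream attention layers see a shorter sequence. A purely convolutional UNet has no token sequence to shrink, which is why the patch wouldn't apply there.

```python
import math

def cosine(a, b):
    """Cosine similarity between two plain-list vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def merge_tokens(tokens, ratio=0.5):
    """Greedily merge the most similar token pairs until the
    sequence length has shrunk by `ratio` (toy illustration only)."""
    tokens = [list(t) for t in tokens]
    target = max(1, int(len(tokens) * (1 - ratio)))
    while len(tokens) > target:
        # find the most similar pair of tokens
        i, j = max(
            ((a, b) for a in range(len(tokens)) for b in range(a + 1, len(tokens))),
            key=lambda ij: cosine(tokens[ij[0]], tokens[ij[1]]),
        )
        merged = [(x + y) / 2 for x, y in zip(tokens[i], tokens[j])]
        del tokens[j]            # remove the higher index first
        tokens[i] = merged       # keep the averaged token in place
    return tokens

toks = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(len(merge_tokens(toks, ratio=0.5)))  # 2
```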
Thanks for your reply, I'll try it on Imagen later. I tried it on Stable Diffusion first, running on an A30 GPU. With ratio=0.5, the time went from 1.4s to 0.939s (1.5x) and GPU memory from 17648MB to 15576MB, so the improvement is not as good as reported in the README. And with ratio=0.6, both the time and GPU memory were greater than with ratio=0.5. What could be the reason? How can I reproduce the reported results?
> and when I set ratio=0.6, the cost time and GPU memory are greater than ratio=0.5
That doesn't seem right. What environment are you in and how are you benchmarking this?
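One common source of misleading numbers is including one-time warmup cost (CUDA context creation, kernel compilation, attention-kernel autotuning) in the average, or timing asynchronous GPU work without synchronizing. A minimal, framework-agnostic timing harness sketch (my own illustration, not from tomesd; `fn` stands in for a call like `pipe(prompt)`, and the optional `sync` hook is where something like `torch.cuda.synchronize` would go):

```python
import time

def benchmark(fn, warmup=3, iters=10, sync=None):
    """Average wall time of `fn` after discarding warmup runs.
    `sync` is an optional callable (e.g. torch.cuda.synchronize)
    that flushes queued asynchronous GPU work before stopping the clock."""
    for _ in range(warmup):          # warmup runs pay one-time costs
        fn()
    if sync:
        sync()
    times = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync:
            sync()                   # wait for queued GPU kernels
        times.append(time.perf_counter() - start)
    return sum(times) / len(times)

# Usage with a stand-in CPU workload:
avg = benchmark(lambda: sum(range(100_000)))
print(f"average time: {avg:.6f}s")
```

Comparing runs with and without `tomesd.apply_patch` under the same warmup and iteration counts makes the before/after numbers much more trustworthy.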
I reran the following code on a V100 GPU to evaluate the performance; torch version is 0.12.1, image size is 512x512:
```python
import time

import torch
import tomesd
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Apply ToMe with a 50% merging ratio
tomesd.apply_patch(pipe, ratio=0.5)  # Can also use pipe.unet in place of pipe here

infer_time, count = 0.0, 0
for i in range(200):
    start = time.time()
    image = pipe("a photo of an astronaut riding a horse on mars").images[0]
    infer_time += time.time() - start
    count += 1

image.save("astronaut.png")
print(f'average time: {infer_time / count}')
```
Without tomesd, GPU memory is 6040MB and the average time is 4.055s; with tomesd and ratio=0.5, GPU memory is 5216MB and the average time is 3.5749s. That is not nearly the speedup reported in the table in the README.
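For reference, the measurements above work out to roughly a 1.13x speedup and about a 14% memory reduction, which is indeed well below the README's figures:

```python
# Numbers reported in the run above (V100, fp16, 512x512)
baseline_s, tome_s = 4.055, 3.5749   # average seconds per image
baseline_mb, tome_mb = 6040, 5216    # GPU memory in MB

speedup = baseline_s / tome_s
mem_saving = 1 - tome_mb / baseline_mb
print(f"speedup: {speedup:.2f}x")          # ~1.13x
print(f"memory saved: {mem_saving:.1%}")   # ~13.6%
```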
Thanks for your nice work! I want to know whether tomesd supports only the Stable Diffusion model, or whether it can also support other diffusion models such as Imagen.