dbolya / tomesd

Speed up Stable Diffusion with this one simple trick!
MIT License

support for SDXL #50

Open wzq728 opened 1 year ago

wzq728 commented 1 year ago

Thanks for your nice work! I want to know whether tomesd supports SDXL, and if it does, how to use it.

dbolya commented 1 year ago

I haven't looked into it. How does SDXL differ from normal SD? If it's similar, there's probably a way to get it to work.

theAdamColton commented 1 year ago

I haven't done any detailed tests, but wrapping a huggingface pipeline.unet seems to work without crashing for training and inference, and produces images that are OK.

[Generated image omitted. Prompt: "A bustling Parisian café scene in the 1920s. Jazz musicians, flapper girls, and intellectuals in conversation. Oil painting, canvas and oil paints. Warm, dimly lit ambiance."]

This is with r=0.5 at 672x672.
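For reference, the direct-UNet patch mentioned above is a one-liner; the tomesd README documents that apply_patch accepts either a full model or pipe.unet. A minimal sketch, assuming an already-loaded diffusers SDXL pipeline named pipe:

import tomesd

# Patch only the UNet module; ratio=0.5 merges up to 50% of tokens in each
# patched attention block.
tomesd.apply_patch(pipe.unet, ratio=0.5)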

dbolya commented 1 year ago

Does it speed it up? I think the default behavior of the diffusers implementation is to do nothing when wrapping the wrong thing, so it might not actually be doing anything.
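One way to tell whether the patch actually took effect is to count the replaced blocks. A minimal sketch, assuming the current tomesd implementation, which renames patched transformer blocks to ToMeBlock (an internal detail, so verify against your installed version):

# If this prints 0, apply_patch was effectively a no-op for this pipeline.
patched = sum(1 for m in pipeline.unet.modules()
              if m.__class__.__name__ == "ToMeBlock")
print(f"{patched} transformer blocks patched by tomesd")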

theAdamColton commented 1 year ago
import time

import tomesd
import torch
from diffusers import StableDiffusionXLPipeline

# Load SDXL base in fp16 on the GPU.
pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

batch_size = 4
resolution = 896
trials = 2

# Define the prompt once so both timing runs use identical input.
prompt = (
    "Laundromat Stories: Inside a laundromat on a rainy day. People load clothes "
    "into washing machines and read magazines while waiting. Charcoal drawing, "
    "chiaroscuro, dramatic lighting from overhead fluorescents."
)

# Baseline: unpatched SDXL.
tt = 0
for _ in range(trials):
    st = time.time()
    pipeline(prompt=prompt, num_inference_steps=20, num_images_per_prompt=batch_size,
             width=resolution, height=resolution)
    tt += time.time() - st
print("SDXL no tomesd: avg time", tt / trials)

# Patch the whole pipeline: merge 75% of tokens, in all layers with up to 4x downsampling.
pipeline = tomesd.apply_patch(pipeline, ratio=0.75, max_downsample=4)

# Same workload with ToMe applied.
tt = 0
for _ in range(trials):
    st = time.time()
    pipeline(prompt=prompt, num_inference_steps=20, num_images_per_prompt=batch_size,
             width=resolution, height=resolution)
    tt += time.time() - st
print("SDXL w/ tomesd: avg time", tt / trials)

I get around a 12% speedup on a 3090: 18.9267s without tomesd vs 16.891s with it.
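To explore the speed/quality trade-off further, a small ratio sweep bolts onto the script above. A minimal sketch, reusing pipeline, prompt, batch_size, and resolution from the benchmark; apply_patch and remove_patch are the documented tomesd API, and remove_patch is assumed here to unwrap the pipeline the same way apply_patch does (if not, pass pipeline.unet):

for ratio in (0.25, 0.5, 0.75):
    tomesd.remove_patch(pipeline)  # restore the original transformer blocks
    tomesd.apply_patch(pipeline, ratio=ratio, max_downsample=4)
    st = time.time()
    pipeline(prompt=prompt, num_inference_steps=20, num_images_per_prompt=batch_size,
             width=resolution, height=resolution)
    # Single-run timings are noisy; average over several trials for real numbers.
    print(f"ratio={ratio}: {time.time() - st:.2f}s")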