madebyollin / taesd

Tiny AutoEncoder for Stable Diffusion
MIT License
545 stars, 27 forks

Do the VAEs share a latent space with each other? #13

Closed Xiang-cd closed 7 months ago

Xiang-cd commented 7 months ago

I found that the TAESD decoder could not decode latents encoded by the SD VAE, and the SD VAE decoder could not decode latents encoded by the TAESD encoder. Why?

madebyollin commented 7 months ago

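Yeah, the SD-VAE and TAESD latent spaces are compatible. You just need to make sure that the latents / images are scaled appropriately: as in the code below, SD-VAE latents are multiplied by sdvae.config.scaling_factor after encoding and divided by it before decoding, while TAESD uses the scaled latents directly. Here's some example code using diffusers: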

# Run in a Jupyter/Colab notebook: `!wget` and `display` are IPython features.
!wget -q -nc "https://upload.wikimedia.org/wikipedia/commons/9/9c/Crepe_with_LaFrance_and_strawberries_and_fresh_cream_in_it.jpg" -O sample_image.jpg

from diffusers import AutoencoderTiny, AutoencoderKL
import torchvision.transforms.functional as TF
import torch as th
from PIL import Image

# Load both autoencoders in fp16 on the GPU, in eval mode with gradients disabled.
sdvae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-ema").half().eval().requires_grad_(False).cuda()
taesd = AutoencoderTiny.from_pretrained("madebyollin/taesd").half().eval().requires_grad_(False).cuda()

# Resize and center-crop to 512x512, then map pixel values from [0, 1] to [-1, 1].
im = TF.center_crop(TF.resize(Image.open("sample_image.jpg").convert("RGB"), 512), 512)
im_cuda = TF.to_tensor(im)[None].mul(2).sub(1).cuda().half()

def show(*args):
    # Drop the batch dimension, concatenate side by side, and map [-1, 1] back to [0, 1].
    args = [a[0] if a.shape[0] == 1 else a for a in args]
    display(TF.to_pil_image(th.cat(args, -1).mul(0.5).add(0.5).clamp(0, 1)))

# SD-VAE returns unscaled latents, so multiply by the scaling factor; TAESD latents are already scaled.
latents_sdvae = sdvae.encode(im_cuda).latent_dist.sample().mul(sdvae.config.scaling_factor)
latents_taesd = taesd.encode(im_cuda).latents
print("Encoded Latents (SDVAE, TAESD)")
show(latents_sdvae, latents_taesd)
print("Decoded SDVAE Latents (SDVAE->SDVAE, SDVAE->TAESD)")
# The SD-VAE decoder expects unscaled latents, so divide first; TAESD decodes the scaled latents directly.
dec_sdvae = sdvae.decode(latents_sdvae.div(sdvae.config.scaling_factor)).sample
dec_taesd = taesd.decode(latents_sdvae).sample
show(dec_sdvae, dec_taesd)
print("Decoded TAESD Latents (TAESD->SDVAE, TAESD->TAESD)")
# Same rule for TAESD-encoded latents: divide only before the SD-VAE decoder.
dec_sdvae = sdvae.decode(latents_taesd.div(sdvae.config.scaling_factor)).sample
dec_taesd = taesd.decode(latents_taesd).sample
show(dec_sdvae, dec_taesd)
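
For reference, the scaling conventions used in the snippet above can be written out as plain helper functions. This is only a rough sketch, not part of diffusers or TAESD: the helper names are made up for illustration, and 0.18215 is assumed to be the value of sdvae.config.scaling_factor for this SD 1.x VAE (check the loaded config rather than relying on the constant). The helpers use only arithmetic operators, so they should work on floats as well as tensors.

```python
# Assumption: scaling_factor of the SD 1.x VAE config; read sdvae.config.scaling_factor in real code.
SD_SCALING_FACTOR = 0.18215


def to_model_range(x):
    """Map pixel values from [0, 1] to the [-1, 1] range fed to the encoders above."""
    return x * 2 - 1


def to_pixel_range(x):
    """Map decoder output from [-1, 1] back to [0, 1] (no clamping)."""
    return x * 0.5 + 0.5


def sdvae_raw_to_scaled(z, scaling_factor=SD_SCALING_FACTOR):
    """Scale raw SD-VAE encoder latents into the space that TAESD latents live in."""
    return z * scaling_factor


def scaled_to_sdvae_raw(z, scaling_factor=SD_SCALING_FACTOR):
    """Undo the scaling before handing latents to the SD-VAE decoder."""
    return z / scaling_factor


if __name__ == "__main__":
    # Round trips should return (approximately) the starting value.
    print(to_pixel_range(to_model_range(0.25)))
    print(scaled_to_sdvae_raw(sdvae_raw_to_scaled(2.0)))
```

If a cross-decode produces washed-out or oversaturated images, a missing or doubled application of one of these steps is a likely first thing to check.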
Xiang-cd commented 7 months ago

Thanks, I think it was a bug in my implementation, or my version of sd-vae differs.