phizaz / diffae

Official implementation of Diffusion Autoencoders
https://diff-ae.github.io/
MIT License

Reconstruct image with only z_{sem}, with x_T sampled from N(0, I) #36


wyh2000 commented 1 year ago

Hi, thanks for sharing this nice work.

Could you share some example code showing how to reconstruct images with DiffAE when only z_{sem} is encoded from the original image, while x_T is sampled from N(0, I) for decoding?

It's probably just a small change to autoencoding.ipynb, but I ran into some problems when trying to do it.

Thanks a lot.

lucasrelic99 commented 1 year ago

You can simply add the following line before the call to model.render() in the Decode section: sampled_xT = torch.normal(0, 1, size=xT.shape, device=device)

Then, when rendering the image, use pred = model.render(sampled_xT, cond, T=20) instead of passing the encoded xT.

For clarity, the entire code block should be:

# sample x_T from the standard normal instead of using the encoded x_T
sampled_xT = torch.normal(0, 1, size=xT.shape, device=device)
pred = model.render(sampled_xT, cond, T=20)

# show the original image next to the reconstruction
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ori = (batch + 1) / 2
ax[0].imshow(ori[0].permute(1, 2, 0).cpu())
ax[1].imshow(pred[0].permute(1, 2, 0).cpu())

mapengsen commented 1 year ago

I have a perhaps naive question about x_T being sampled from N(0, I) for decoding: could x_T be sampled from some other distribution instead? For example (0, I0), (5, 17), ....
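A note on this: the DDIM decoder in DiffAE is trained with x_T drawn from a standard normal, so samples from a different Gaussian would be out of distribution for the model, but mechanically nothing stops you from drawing them. A minimal sketch of sampling x_T from a general Gaussian N(mu, sigma^2 I) by scaling and shifting a standard normal (the shape and the mu/sigma values here are illustrative stand-ins, not taken from the repo):

```python
import torch

# illustrative x_T shape (batch, channels, height, width); match your model's image size
shape = (1, 3, 256, 256)

# standard choice: x_T ~ N(0, I)
xT_standard = torch.randn(shape)

# general Gaussian N(mu, sigma^2 I): shift and scale a standard normal sample
mu, sigma = 5.0, 17.0
xT_shifted = mu + sigma * torch.randn(shape)
```

Either tensor could then be passed to model.render() in place of the encoded x_T, though anything other than the standard normal is likely to degrade reconstruction quality.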