zoubohao / DenoisingDiffusionProbabilityModel-ddpm-

This may be the simplest implementation of DDPM. You can run Main.py directly to train the UNet on the CIFAR-10 dataset and see the amazing process of denoising.
MIT License
1.43k stars 156 forks

Conditional embedding #30

Open yangming0724 opened 8 months ago

yangming0724 commented 8 months ago

For conditional generation, the CIFAR10 labels (indexed 0 to 9) need to be embedded into a latent space with the same dimension as the time embedding. While reading your code, I noticed the following line: nn.Embedding(num_embeddings=num_labels + 1, embedding_dim=d_model, padding_idx=0). Why is num_embeddings set to num_labels + 1, and what is the purpose of the padding index?

Thanks in advance.

zoubohao commented 8 months ago

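Please read the paper "Classifier-Free Diffusion Guidance". Label 0 denotes the unconditional case, i.e. the model without a class condition. Labels 1 to 10 refer to the ten CIFAR10 classes (the dataset's labels 0 to 9 shifted up by one), which is why num_embeddings is num_labels + 1.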
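To make the index convention concrete, here is a minimal sketch in plain Python. The names NUM_LABELS, UNCOND_INDEX, NUM_EMBEDDINGS and to_embedding_index are hypothetical, chosen for illustration rather than taken from the repository, and the sketch assumes the dataset labels 0 to 9 are shifted up by one before the embedding lookup.

```python
NUM_LABELS = 10                  # CIFAR10 has ten classes, dataset labels 0..9
UNCOND_INDEX = 0                 # reserved row meaning "no condition" (padding_idx=0)
NUM_EMBEDDINGS = NUM_LABELS + 1  # ten class rows plus the unconditional row


def to_embedding_index(dataset_label: int) -> int:
    """Map a dataset label in 0..9 to an embedding row in 1..10 (assumed shift)."""
    if not 0 <= dataset_label < NUM_LABELS:
        raise ValueError(f"label out of range: {dataset_label}")
    return dataset_label + 1


# Every class lands on rows 1..10, so row 0 stays free for the unconditional case.
indices = [to_embedding_index(y) for y in range(NUM_LABELS)]
```

In PyTorch, nn.Embedding with padding_idx=0 initializes row 0 to zeros and does not update it during training, so index 0 contributes a fixed zero vector to the conditioning.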

The paper says: "Instead of training a separate classifier model, we choose to train an unconditional denoising diffusion model p_θ(z) parameterized through a score estimator ε_θ(z_λ) together with the conditional model p_θ(z|c) parameterized through ε_θ(z_λ, c)."
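One common way to train both models with a single network, as described in that paper, is to replace the class label with the unconditional index for a random fraction of training examples, then combine the two predictions at sampling time as (1 + w) * eps_cond - w * eps_uncond. The sketch below illustrates this on plain Python lists; drop_labels, guided_eps, p_uncond and the value 0.1 are illustrative assumptions, not code or settings confirmed from this repository.

```python
import random

UNCOND_INDEX = 0  # assumed index of the "no condition" embedding row


def drop_labels(indices, p_uncond, rng):
    """Replace each label index with UNCOND_INDEX with probability p_uncond."""
    return [UNCOND_INDEX if rng.random() < p_uncond else i for i in indices]


def guided_eps(eps_cond, eps_uncond, w):
    """Classifier-free guidance mix: (1 + w) * eps_cond - w * eps_uncond."""
    return [(1.0 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


rng = random.Random(0)
batch = [1, 4, 10, 7]                  # labels already shifted to the 1..10 range
mixed = drop_labels(batch, 0.1, rng)   # some entries may become 0 (unconditional)
```

With w = 0 the mix reduces to the purely conditional prediction; larger w pushes samples further toward the requested class.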