crowsonkb / k-diffusion

Karras et al. (2022) diffusion models for PyTorch
MIT License

How to perform a forward process from x_0 to x_t #4

Closed UdonDa closed 2 years ago

UdonDa commented 2 years ago

Hi, @crowsonkb!

I'm confused about the forward process from x_0 to x_t. Could you explain it? I'd like to implement the conditioning augmentation from the Imagen paper for super-resolution. It perturbs x_0 to obtain x_t and t (in your implementation, I guess t corresponds to sigma).

I know that your implementation draws sigma at random and then creates noisy x_t samples with noised_input = c_in * (input + noise * utils.append_dims(sigma, input.ndim)), as shown in Eq. 7. Does noised_input then correspond to the x_t images produced by Karras' forward diffusion process?

crowsonkb commented 2 years ago

To get from x_0 to x_t you just do x_0 + noise * t (t is the same as sigma for the Karras forward process), where you draw noise using torch.randn(). Then to use the preconditioner, so that the input to the model is variance ~1, you multiply by c_in for that sigma. So the line you wrote is correct. :) Then you would feed in log(sigma) / 4 to another set of Fourier Features to condition the model on the noise level of the conditioning too.
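
The steps in the reply above can be sketched roughly as follows. This is a minimal, standalone illustration, not the library's own code: forward_noise is a hypothetical helper name, append_dims is re-implemented locally so the snippet does not depend on k-diffusion, sigma_data=0.5 and the log-normal sigma sampling parameters are assumed values, and c_in = 1 / sqrt(sigma^2 + sigma_data^2) is the input scaling from Karras et al. (2022).

```python
import torch


def append_dims(x, target_dims):
    """Append trailing singleton dims so x broadcasts against a target_dims-dim tensor."""
    dims_to_append = target_dims - x.ndim
    if dims_to_append < 0:
        raise ValueError("x already has more dims than target_dims")
    return x[(...,) + (None,) * dims_to_append]


def forward_noise(x_0, sigma, sigma_data=0.5):
    """Hypothetical helper: noise x_0 to x_t and scale it for the model.

    sigma_data=0.5 is an assumed data standard deviation; use your model's value.
    """
    sigma_b = append_dims(sigma, x_0.ndim)
    noise = torch.randn_like(x_0)
    # Karras forward process: x_t = x_0 + noise * sigma (t is the same as sigma).
    x_t = x_0 + noise * sigma_b
    # Preconditioner: scale so the model input has variance of roughly 1.
    c_in = 1 / (sigma_b ** 2 + sigma_data ** 2) ** 0.5
    # Noise-level conditioning value, to be passed through Fourier features.
    noise_cond = sigma.log() / 4
    return x_t, c_in * x_t, noise_cond


torch.manual_seed(0)
# Stand-in batch whose standard deviation matches the assumed sigma_data.
x_0 = torch.randn(4, 3, 64, 64) * 0.5
# One sigma per sample, drawn log-normally (loc and scale are assumed values).
sigma = torch.exp(torch.randn(4) * 1.2 - 1.2)
x_t, model_input, noise_cond = forward_noise(x_0, sigma)
```

Here x_t is the noised image itself and model_input is c_in * x_t, which is what noised_input holds in the line quoted in the question. For conditioning augmentation, noise_cond would be embedded with a separate set of Fourier features, as the reply describes.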

UdonDa commented 2 years ago

Thanks!! I will try the conditioning!

BrandonKoerner commented 2 years ago

Thank you so much for this guidance. I implemented this for img2img sampling for Stable Diffusion.