murrellb opened 1 year ago
I made a PR #6 for the MNIST example.
"Self-conditioning" feeds the predictions from the previous timestep to the model at the current timestep (https://arxiv.org/pdf/2208.04202.pdf). This can sometimes make a big practical difference. My intuitive understanding is that this gives the model a snapshot of where all the variables are heading to, and helps the model handle conditional dependencies between variables. This needs a modification during training (see eg. https://github.com/lucidrains/denoising-diffusion-pytorch/issues/94), which can be handled by the user, but it also needs a slightly different flow during the reverse diffusion, which we'll need to implement.
To achieve the self-conditioned sampling, we don't need to modify our code if we use a closure trick that looks like this:
```julia
function selfconditioned(x)
    # Captured state: the previous x̂₀ prediction, initialized to zeros.
    x_0 = zero(x)
    function (x_t, t)
        # Feed the previous prediction back in, and remember the new one.
        x_0 = net(x_t, x_0, t)
        return x_0
    end
end

x = randn(10)
samplebackward(selfconditioned(x), process, timesteps, x)
```
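To see why no sampler changes are needed: because the anonymous function closes over `x_0`, each call mutates the captured prediction, so state threads through an ordinary `(x_t, t)` interface. An illustrative (not the actual) reverse loop, where `step_back` stands in for whatever reverse-diffusion update the process defines:

```julia
# Hedged sketch of a reverse-diffusion loop; `step_back` is hypothetical.
function samplebackward_sketch(predict, timesteps, x)
    for t in reverse(timesteps)
        x0_hat = predict(x, t)        # closure carries the previous x̂₀ along
        x = step_back(x0_hat, x, t)   # one reverse step toward t - 1
    end
    return x
end
```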
We need to name our processes consistently. Since this is Diffusions.jl, I suggest:
- `OrnsteinUhlenbeckDiffusion`
- `RotationDiffusion`
- `WrappedBrownianDiffusion`
- `UniformDiscreteDiffusion`
- `IndependentDiscreteDiffusion`

etc.