Open Wazhee opened 8 months ago
Different objective types ('grad', 'noise', 'ysubx') provide different guidance:
'grad': This objective incorporates information from both the encoded source and target images, along with noise and interpolation coefficients, to steer the sampling process in a way that considers both the starting point and the desired target.
'noise': This objective simply uses the noise itself, allowing for a more random exploration of the latent space.
'ysubx': This objective directly uses the difference between the encoded target and source images, guiding the model towards reducing that difference and reaching the source representation.
Thank you for the clear explanation! The default objective type is set to "grad" in both the code and the paper. Have you ever tried training and testing with "noise" or "ysubx"? What different effects would they have? If I want well-aligned image-to-image translation, would "noise" or "ysubx" give better results (for example, higher PSNR and SSIM)?
I don't have enough compute to do the training right now, and the trained model isn't hosted on a site that Chrome treats as secure, so I can't download and fine-tune it. But anyway:
Unlike other diffusion models, BBDM adds noise to the image differently: $q_{BB}(x_t \mid x_0, y) = \mathcal{N}\left(x_t;\ (1 - m_t)\,x_0 + m_t\,y,\ \delta_t I\right)$ (model/BrownianBridge/BrownianBridgeModel.py#L128). It takes both the initial state and the final state into account when computing the noise and removing it (more or less).
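To make the forward step concrete, here is a minimal NumPy sketch of sampling from the distribution above. It is not the repository's code: the function name `q_sample` is borrowed from this thread, but the signature, passing `m_t` and `delta_t` in as plain numbers, and the use of NumPy instead of PyTorch are all assumptions made for illustration.

```python
import numpy as np

def q_sample(x0, y, m_t, delta_t, rng):
    """Draw x_t ~ N((1 - m_t) * x0 + m_t * y, delta_t * I).

    Hypothetical signature: m_t and delta_t are given as scalars here;
    how the real model schedules them is not shown in this thread.
    """
    noise = rng.standard_normal(x0.shape)
    sigma_t = np.sqrt(delta_t)  # standard deviation from the variance delta_t
    x_t = (1.0 - m_t) * x0 + m_t * y + sigma_t * noise
    return x_t, noise

rng = np.random.default_rng(0)
x0 = np.zeros((4, 4))  # stand-in for one end state of the bridge
y = np.ones((4, 4))    # stand-in for the other end state

# With zero variance the sample is just the interpolated mean,
# so m_t = 0 gives x0 and m_t = 1 gives y.
x_start, _ = q_sample(x0, y, m_t=0.0, delta_t=0.0, rng=rng)
x_end, _ = q_sample(x0, y, m_t=1.0, delta_t=0.0, rng=rng)
x_mid, noise_mid = q_sample(x0, y, m_t=0.5, delta_t=0.25, rng=rng)
```

The mean slides linearly from `x0` to `y` as `m_t` goes from 0 to 1, which is what separates this process from a standard diffusion that ends in pure noise.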
objective = m_t * (y - x0) + sigma_t * noise
This is the objective used to remove the noise, based on both the initial and the target domain.
If you use objective = noise, BBDM acts like a normal diffusion model on the latent vectors obtained from the VQGAN.
Or, as I just did, work through the math to get an intuition.
Go through the p_sample and q_sample functions for a better understanding.
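The three objective options discussed in this thread can be sketched side by side as follows. This is an illustration under stated assumptions, not the repository's implementation: `make_objective` is a hypothetical helper, the 'grad' line is copied from the comment above, and the sign of 'ysubx' (`y - x0`) is inferred from the option name.

```python
import numpy as np

def make_objective(kind, x0, y, noise, m_t, sigma_t):
    """Return the regression target for one objective type (hypothetical helper)."""
    if kind == "grad":
        # Formula quoted in this thread.
        return m_t * (y - x0) + sigma_t * noise
    if kind == "noise":
        return noise
    if kind == "ysubx":
        # Assumption: sign taken from the name "y sub x".
        return y - x0
    raise ValueError(f"unknown objective: {kind!r}")

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 4))
y = rng.standard_normal((4, 4))
noise = rng.standard_normal((4, 4))
m_t, sigma_t = 0.3, 0.5  # arbitrary illustrative values

# Forward sample with mean (1 - m_t) * x0 + m_t * y and std sigma_t.
x_t = (1.0 - m_t) * x0 + m_t * y + sigma_t * noise

# Subtracting the 'grad' target from x_t recovers x0 exactly.
recovered = x_t - make_objective("grad", x0, y, noise, m_t, sigma_t)
```

Since `x_t = x0 + m_t * (y - x0) + sigma_t * noise`, the 'grad' target equals `x_t - x0`, so a perfect prediction of it maps the noisy sample straight back to `x0`. The other two targets each carry only one of the two terms.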
I gain much inspiration, thank you so much!
Great Work!
I was curious what the various options for the objective parameter mean. I'm looking at the BBDM.yaml file and see "objective: 'grad' # options {'grad', 'noise', 'ysubx'}" with no indication of what the options refer to. Can you provide some insight or references?