I have some questions about the code and would appreciate your help.
When algo_config.predict_epsilon=True is set, the model outputs the noise directly. However, when the reconstruction loss self.loss_fn(x_recon_selected, x_start_selected) is computed in diffuser.py, the target is x_start_selected, which is the ground-truth trajectory rather than the initial noise. What is the reason for that?
In the function predict_start_from_noise, what formula computes the clean trajectory directly from the predicted noise? Also, sqrt_recipm1_alphas_cumprod will be NaN if it is computed as sqrt_recipm1_alphas_cumprod = torch.sqrt(1. / (alphas_cumprod - 1.)): alphas_cumprod is always less than 1 because it is the cumulative product of numbers less than 1, so the argument of the square root is negative. Have you run into this problem?
Why can the diffusion model predict the clean trajectory when algo_config.predict_epsilon=False is set?
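To make the question about predict_start_from_noise concrete, here is a minimal sketch of what I would expect, based on the standard DDPM forward process. It is not taken from the repo: the linear beta schedule, the step count T, the q_sample helper, and the use of plain Python scalars instead of tensors are all my own assumptions for illustration. It checks the sign of the square-root argument in the form quoted above, and compares it with the "reciprocal minus one" form 1/alphas_cumprod - 1 that the variable name suggests.

```python
import math

# Assumed linear beta schedule (illustrative values, not taken from the repo).
T = 100
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]

# alphas_cumprod[t] is the cumulative product of (1 - beta), so it lies in (0, 1).
alphas_cumprod = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alphas_cumprod.append(running)

# Form quoted in the question: 1 / (a - 1) is negative for every a < 1,
# so torch.sqrt of it would return NaN.
quoted_args = [1.0 / (a - 1.0) for a in alphas_cumprod]

# Form suggested by the name "recipm1" (reciprocal minus one): 1 / a - 1 > 0.
sqrt_recip = [math.sqrt(1.0 / a) for a in alphas_cumprod]
sqrt_recipm1 = [math.sqrt(1.0 / a - 1.0) for a in alphas_cumprod]


def q_sample(x_start, noise, t):
    """Hypothetical helper for the forward process:
    x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * eps."""
    a = alphas_cumprod[t]
    return math.sqrt(a) * x_start + math.sqrt(1.0 - a) * noise


def predict_start_from_noise(x_t, noise, t):
    """Invert the forward process:
    x_0 = sqrt(1 / a_t) * x_t - sqrt(1 / a_t - 1) * eps."""
    return sqrt_recip[t] * x_t - sqrt_recipm1[t] * noise


# Round trip with scalar stand-ins for a trajectory value and its noise.
x_start, noise, t = 0.7, -1.3, 50
x_t = q_sample(x_start, noise, t)
x_recon = predict_start_from_noise(x_t, noise, t)
```

If this reading is right, x_recon recovers x_start whenever the noise is predicted exactly, which might also explain why the loss can compare against x_start_selected even when the model outputs noise. Please correct me if the repo does something different.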
Thanks for such awesome work.