Open raymond-yuan opened 10 months ago
Please give the specific code location
Please give the specific code location
In the GaussianDiffusion's method "training_losses_s2s" one can notice that in the 1622th LoC (gaussian_diffusion.py) predict_x_start
is introduced (and this is the only one occurrence of that hyperparameter in the whole codebase of the AR-Diffusion):
if self.config.predict_x_start:
x_output = model_out_x_start
else:
x_output = x_start
The predict_xstart
is referenced plenty of times in the codebase and its behaviour is, in fact, the same as setting the model_mean_type to ModelMeanType.START_X - that is, AR-Diffusion outputs $x_0$ instead of noise. However, what is more confusing, I found the comment that suggested the AR-Diffusion usually models a noise added in the forward diffusion process instead (in the DDIM sampling-related code).
# Usually our model outputs epsilon, but we re-derive it
# in case we used x_start or x_prev prediction.
This is just an attempt we made. Here we still follow the Diffusion-LM approach, that is, x_output = x_start. That is, the corresponding predict_x_start=False in config.yaml. There is indeed some confusion here with predict_xstart, but you don’t need to worry about predict_x_start. This is code that is convenient for debugging.
Are these supposed to be the same variable? They control different things in the code. Just wanted to clarify behavior here.
In the default configs, one is set to True, and the other is set to False which leads to very confusing behavior.