Closed: dathudeptrai closed this issue 1 year ago
cc @williamberman I think we already support this
@williamberman can you take a look? I don't think we support this yet. I have experimented with many kinds of diffusion models, and it seems clear to me that the variational bound loss contributes a lot to performance. Here is a new SOTA for several datasets (https://github.com/forever208/DDPM-IP/tree/DDPM-IP), which is also built on top of guided diffusion.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
hey @dathudeptrai I haven't read the paper in a while, could you remind me what it contains that we don't support? iirc the last time I read the codebase (not the paper), the two components were learned variance range and timestep skipping, both of which we support in the ddpm scheduler
@williamberman yeah, diffusers supports learned variance, but only for inference. For training, we also need support for the vb loss, as implemented here (https://github.com/openai/improved-diffusion/blob/e94489283bb876ac1477d5dd7709bbbd2d9902ce/improved_diffusion/gaussian_diffusion.py#L722-L728).
Oh, I gotcha, yeah. I recently looked into adding the KL term to the loss function in our finetuning scripts for learned-variance models, but we opted to just train the predicted error and switch the finetuned model to a fixed variance schedule. There's not really a good place in the diffusers scheduler source for the loss function. We code loss functions directly into our training scripts when needed, and we explicitly don't support this one right now.
I think if you want to do some training using the DDPM scheduler for noising, it should be feasible to implement the loss function in your own training script without having to make changes to the diffusers source. Is that sufficient for you?
@williamberman yeah, that is what I am doing right now, adding the loss in the training script. I just think that adding the loss function inside the scheduler is a good design.
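For anyone landing on this later, below is a rough sketch of what that training-script addition can look like for `variance_type="learned_range"`, loosely following the openai code linked above. It is not part of the diffusers API: the helper names (`vlb_loss`, `normal_kl`), the assumption that the UNet predicts noise and outputs 2 * C channels (predicted epsilon plus raw variance values in [-1, 1]), and the 4D image shapes are all illustrative assumptions.

```python
import torch


def normal_kl(mean1, logvar1, mean2, logvar2):
    # KL divergence between two diagonal Gaussians, computed elementwise.
    return 0.5 * (
        -1.0
        + logvar2
        - logvar1
        + torch.exp(logvar1 - logvar2)
        + (mean1 - mean2) ** 2 * torch.exp(-logvar2)
    )


def vlb_loss(scheduler, model_output, x_start, x_t, timesteps):
    """KL(q(x_{t-1} | x_t, x_0) || p_theta(x_{t-1} | x_t)), averaged over the batch.

    `scheduler` is a diffusers DDPMScheduler; only its beta/alpha buffers are used.
    The t == 0 term (a discretized Gaussian log-likelihood in the openai code)
    is omitted for brevity, so sample timesteps >= 1 when using this sketch.
    """
    # Assumed model output layout: predicted noise and raw variance values,
    # concatenated along the channel dimension.
    eps_pred, var_values = torch.chunk(model_output, 2, dim=1)

    def gather(a, t):
        # Per-timestep scalar -> broadcastable (B, 1, 1, 1) tensor (4D images assumed).
        return a.to(x_t.device)[t].float().reshape(-1, 1, 1, 1)

    betas = gather(scheduler.betas, timesteps)
    alphas = gather(scheduler.alphas, timesteps)
    alphas_cumprod = gather(scheduler.alphas_cumprod, timesteps)
    alphas_cumprod_prev = gather(
        torch.cat(
            [torch.ones_like(scheduler.alphas_cumprod[:1]), scheduler.alphas_cumprod[:-1]]
        ),
        timesteps,
    )

    # True posterior q(x_{t-1} | x_t, x_0): closed-form mean and (clipped) log variance.
    posterior_mean = (
        torch.sqrt(alphas_cumprod_prev) * betas / (1.0 - alphas_cumprod) * x_start
        + torch.sqrt(alphas) * (1.0 - alphas_cumprod_prev) / (1.0 - alphas_cumprod) * x_t
    )
    posterior_variance = (1.0 - alphas_cumprod_prev) / (1.0 - alphas_cumprod) * betas
    posterior_log_variance = torch.log(posterior_variance.clamp(min=1e-20))

    # Predicted mean from the epsilon prediction. Detached so the VLB term only
    # trains the variance output, as in the Improved DDPM hybrid objective.
    pred_mean = (
        x_t - betas / torch.sqrt(1.0 - alphas_cumprod) * eps_pred.detach()
    ) / torch.sqrt(alphas)

    # "learned_range": interpolate in log space between beta_t (max) and the
    # posterior variance (min), with the raw output mapped from [-1, 1] to [0, 1].
    frac = (var_values + 1.0) / 2.0
    pred_log_variance = frac * torch.log(betas) + (1.0 - frac) * posterior_log_variance

    kl = normal_kl(posterior_mean, posterior_log_variance, pred_mean, pred_log_variance)
    return kl.mean()
```

The hybrid objective from the Improved DDPM paper would then be roughly `loss = F.mse_loss(eps_pred, noise) + 0.001 * vlb_loss(scheduler, model_output, x_start, x_t, timesteps)`; the openai implementation additionally handles the t == 0 decoder term and reports the KL in bits per dimension.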
nice! Unfortunately I think it's something we're ok with not supporting for now just for maintenance reasons :)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Is there any plan to add the vlb loss for training in the Noise Scheduler classes?
> @williamberman yeah, that is what I am doing right now, adding the loss in the training script. I just think that adding the loss function inside the scheduler is a good design.
Could someone please explain how to write the vb loss in a diffusers training script? Thanks!
What about the cosine noise schedule? So far the code has the schedules below, but I don't think the cosine one is exactly the one in the paper, and it seemed to be important for the performance improvement; a sketch of the paper's formulation follows the snippet.
```python
if trained_betas is not None:
    self.betas = torch.tensor(trained_betas, dtype=torch.float32)
elif beta_schedule == "linear":
    self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32)
elif beta_schedule == "scaled_linear":
    # this schedule is very specific to the latent diffusion model.
    self.betas = torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2
elif beta_schedule == "squaredcos_cap_v2":
    # Glide cosine schedule
    self.betas = betas_for_alpha_bar(num_train_timesteps)
elif beta_schedule == "sigmoid":
    # GeoDiff sigmoid schedule
    betas = torch.linspace(-6, 6, num_train_timesteps)
    self.betas = torch.sigmoid(betas) * (beta_end - beta_start) + beta_start
```
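For reference, the `squaredcos_cap_v2` branch above calls `betas_for_alpha_bar`, which, as far as I can tell, follows the cosine alpha-bar schedule from the Improved DDPM paper (offset s = 0.008, betas clipped at 0.999). A standalone sketch of that construction, with `cosine_betas` being a hypothetical helper name rather than a diffusers function:

```python
import math

import torch


def cosine_betas(num_train_timesteps: int, s: float = 0.008, max_beta: float = 0.999):
    """Betas derived from the cosine alpha-bar schedule of Nichol & Dhariwal (2021)."""

    def alpha_bar(t):
        # Squared-cosine cumulative signal level, with a small offset s near t = 0.
        return math.cos((t + s) / (1 + s) * math.pi / 2) ** 2

    betas = []
    for i in range(num_train_timesteps):
        t1 = i / num_train_timesteps
        t2 = (i + 1) / num_train_timesteps
        # beta_t = 1 - alpha_bar(t) / alpha_bar(t - 1), clipped as in the paper.
        betas.append(min(1 - alpha_bar(t2) / alpha_bar(t1), max_beta))
    return torch.tensor(betas, dtype=torch.float32)
```

Comparing this against `betas_for_alpha_bar(num_train_timesteps)` should show whether the two actually differ.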
Is your feature request related to a problem? Please describe.
Reducing the number of steps needed in the inference process is important for practical deployment.
Describe the solution you'd like
We should support the variational bound loss when variance_type="learned_range".
Additional context
The original code here
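For context, the objective from the Improved DDPM paper that this request refers to is the hybrid loss, in which the VLB term trains the learned variance (with the gradient through the predicted mean stopped) while the usual simple loss trains the noise prediction:

```latex
L_{\text{hybrid}} = L_{\text{simple}} + \lambda \, L_{\text{vlb}}, \qquad \lambda = 0.001
```

Here L_simple is the usual MSE on the predicted noise, and L_vlb sums the per-timestep KL terms between the true posterior q(x_{t-1} | x_t, x_0) and p_theta(x_{t-1} | x_t), plus the decoder negative log-likelihood at t = 0.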