huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Support OpenAI improved DDPM #3287

Closed dathudeptrai closed 1 year ago

dathudeptrai commented 1 year ago

Is your feature request related to a problem? Please describe.

Reducing the number of steps needed at inference is important for practical deployment.

Describe the solution you'd like

We should support the variational bound loss when `variance_type="learned_range"`.

Additional context

The original code here

patrickvonplaten commented 1 year ago

cc @williamberman I think we already support this

dathudeptrai commented 1 year ago

@williamberman can you take a look? I do not think we support this yet. I have experimented with many kinds of diffusion models, and it seems clear to me that the variational bound loss contributes a lot to performance. Here is a new SOTA for several datasets (https://github.com/forever208/DDPM-IP/tree/DDPM-IP), which is also built on guided diffusion.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

williamberman commented 1 year ago

hey @dathudeptrai, I haven't read the paper in a while; could you remind me what it contains that we don't support? IIRC, the last time I read the codebase (not the paper), the two components were the learned variance range and timestep skipping, both of which we support in the DDPM scheduler.

dathudeptrai commented 1 year ago

@williamberman yeah, diffusers supports the learned variance, but only for inference. For training, we need support for the vb loss here (https://github.com/openai/improved-diffusion/blob/e94489283bb876ac1477d5dd7709bbbd2d9902ce/improved_diffusion/gaussian_diffusion.py#L722-L728).

williamberman commented 1 year ago

Oh, gotcha, yeah. I recently looked into adding the KL term to the loss function for our finetuning scripts for learned-variance models, but we opted to just train on the predicted error and switch the finetuned model to a fixed variance schedule. There's not really a good place in the diffusers scheduler source for a loss function; we code loss functions directly into our training scripts when needed, and we explicitly don't support this right now:

https://github.com/huggingface/diffusers/blob/c6ae8837512d0572639b9f57491d4482fdc8948c/examples/dreambooth/train_dreambooth.py#L1287-L1298

I think if you want to do some training using the DDPM scheduler for noising, it should be feasible to implement the loss function in your own training script without making changes to the diffusers source. Is that sufficient for you?
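For reference, a minimal torch-only sketch of what such a training-script VLB term can look like, following the structure of the improved-diffusion code linked above: the true posterior q(x_{t-1} | x_t, x_0) is compared via a Gaussian KL against the model's distribution, whose log-variance is the "learned_range" interpolation between log(beta_t) and the posterior log-variance. The function names and signature here are hypothetical, the model mean is detached so this term only trains the variance head (as in the original), and the t=0 decoder-NLL term is omitted for brevity:

```python
import torch


def normal_kl(mean1, logvar1, mean2, logvar2):
    # Elementwise KL(N(mean1, exp(logvar1)) || N(mean2, exp(logvar2))).
    return 0.5 * (
        -1.0 + logvar2 - logvar1
        + torch.exp(logvar1 - logvar2)
        + (mean1 - mean2) ** 2 * torch.exp(-logvar2)
    )


def vb_loss(x_start, x_t, t, model_eps, model_v, betas):
    # model_eps / model_v are the two halves of a doubled-channel UNet
    # output, each with the same shape as x_t. betas has shape (T,).
    alphas = 1.0 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    alphas_cumprod_prev = torch.cat([torch.ones(1), alphas_cumprod[:-1]])
    posterior_variance = betas * (1 - alphas_cumprod_prev) / (1 - alphas_cumprod)
    posterior_log_variance = torch.log(posterior_variance.clamp(min=1e-20))

    def extract(a, t, shape):
        # Gather per-timestep scalars and reshape for broadcasting.
        out = a.gather(0, t).float()
        return out.reshape(t.shape[0], *((1,) * (len(shape) - 1)))

    # True posterior q(x_{t-1} | x_t, x_0): mean and log-variance.
    coef1 = extract(betas * torch.sqrt(alphas_cumprod_prev) / (1 - alphas_cumprod), t, x_t.shape)
    coef2 = extract(torch.sqrt(alphas) * (1 - alphas_cumprod_prev) / (1 - alphas_cumprod), t, x_t.shape)
    true_mean = coef1 * x_start + coef2 * x_t
    true_log_var = extract(posterior_log_variance, t, x_t.shape)

    # Model mean via the predicted-x0 reparameterization of epsilon.
    sqrt_recip = extract(torch.sqrt(1.0 / alphas_cumprod), t, x_t.shape)
    sqrt_recipm1 = extract(torch.sqrt(1.0 / alphas_cumprod - 1), t, x_t.shape)
    pred_x0 = (sqrt_recip * x_t - sqrt_recipm1 * model_eps).clamp(-1, 1)
    model_mean = coef1 * pred_x0 + coef2 * x_t

    # "learned_range": interpolate log-variance between the posterior
    # log-variance and log(beta_t), with model_v assumed in [-1, 1].
    min_log = true_log_var
    max_log = extract(torch.log(betas), t, x_t.shape)
    frac = (model_v + 1) / 2
    model_log_var = frac * max_log + (1 - frac) * min_log

    # Detach the mean so this term only trains the variance prediction.
    kl = normal_kl(true_mean, true_log_var, model_mean.detach(), model_log_var)
    return kl.mean() / torch.log(torch.tensor(2.0))  # bits per dimension
```

In a training loop this term would typically be added, with a small weight, to the usual epsilon-prediction MSE loss.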

dathudeptrai commented 1 year ago

@williamberman yeah, that is what I am doing right now: adding the loss in the training script. I just think that adding the loss function inside the scheduler would be a good design.

williamberman commented 1 year ago

Nice! Unfortunately, I think it's something we're OK with not supporting for now, just for maintenance reasons :)


hyang0511 commented 10 months ago

Is there any plan to add the VLB loss for training in the noise scheduler classes?

jiangyuhangcn commented 6 months ago

> @williamberman yeah, that is what I am doing right now, adding the loss in the training script. I just think that adding the loss function inside the scheduler is a good design.

Please, how do I write the vb loss in a diffusers training script? Thanks!

arminvburren commented 5 months ago

What about the cosine noise schedule? So far, these are the schedules in the code, but I don't think the cosine one is exactly the one in the paper, and it seemed to be important in getting the performance improvement:

    if trained_betas is not None:
        self.betas = torch.tensor(trained_betas, dtype=torch.float32)
    elif beta_schedule == "linear":
        self.betas = torch.linspace(beta_start, beta_end, num_train_timesteps, dtype=torch.float32)
    elif beta_schedule == "scaled_linear":
        # this schedule is very specific to the latent diffusion model.
        self.betas = torch.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps, dtype=torch.float32) ** 2
    elif beta_schedule == "squaredcos_cap_v2":
        # Glide cosine schedule
        self.betas = betas_for_alpha_bar(num_train_timesteps)
    elif beta_schedule == "sigmoid":
        # GeoDiff sigmoid schedule
        betas = torch.linspace(-6, 6, num_train_timesteps)
        self.betas = torch.sigmoid(betas) * (beta_end - beta_start) + beta_start