lucidrains / denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in Pytorch
MIT License
8.04k stars 1k forks source link

How to train text-guided diffusion? #221

Closed HerocatUED closed 1 year ago

Adversarian commented 1 year ago

I don't think this repository supports text-to-image. You would need to add a text encoder component that injects information from your prompt into your backward diffusion process and use a contrastive loss to measure the similarity of the generated images with the respective prompts. I can refer you to this(CLIP) and this(GLIDE) but I believe HuggingFace already provides a training facility for your desired task.

HerocatUED commented 1 year ago

I see, thank you!