lucidrains / big-sleep

A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN. Technique was originally created by https://twitter.com/advadnoun
MIT License

How did you train this? #90

Open KCGD opened 3 years ago

KCGD commented 3 years ago

Usually AIs train towards a tangible and absolute output but this does the complete opposite. How?

xnghu commented 3 years ago

I might not be completely right, since I'm still working to understand the inner workings, but to my understanding it uses two pre-trained models, BigGAN and CLIP. CLIP has been trained to associate text and images, and BigGAN is trained to generate images. Putting them together you get:

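text -> CLIP -> text encoding. Separately, a latent vector -> BigGAN -> image -> CLIP -> image encoding. The latent vector is then nudged by gradient descent until the image encoding is similar to the text encoding. So neither model is retrained for a prompt; only BigGAN's input is optimized until CLIP judges the resulting image a good fit for the text.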

Someone else could probably explain it better, but that's my understanding at an abstract level.
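
To make the "no training" part concrete, here is a rough toy sketch in plain Python. It is not the big-sleep code and uses neither the real CLIP nor the real BigGAN: `generate` and `encode_image` are hypothetical stand-ins built from small fixed random matrices, the "text encoding" is faked, and the gradient is estimated by finite differences instead of backpropagation. The only point is to show the shape of the loop: both "models" stay frozen, and only the latent vector changes so that the image encoding moves toward the text encoding.

```python
import math
import random

random.seed(0)  # fixed seed so the toy run is repeatable

LATENT_DIM, IMAGE_DIM, EMBED_DIM = 4, 8, 3


def random_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Frozen toy weights (hypothetical stand-ins, never updated below).
GENERATOR = random_matrix(IMAGE_DIM, LATENT_DIM)
IMAGE_ENCODER = random_matrix(EMBED_DIM, IMAGE_DIM)


def generate(latent):
    """Stand-in for BigGAN: latent vector -> 'image' (a short list of pixels)."""
    return [math.tanh(p) for p in matvec(GENERATOR, latent)]


def encode_image(image):
    """Stand-in for CLIP's image encoder: image -> embedding."""
    return matvec(IMAGE_ENCODER, image)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norm + 1e-12)


def score(latent, text_embedding):
    """How well the generated image matches the text, per the toy encoder."""
    return cosine(encode_image(generate(latent)), text_embedding)


def optimize_latent(text_embedding, steps=200, lr=0.5, eps=1e-5):
    """Hill-climb the latent only; the generator and encoder stay fixed."""
    latent = [random.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    history = [score(latent, text_embedding)]
    for _ in range(steps):
        current = history[-1]
        # Finite-difference estimate of d(score)/d(latent).
        grad = []
        for i in range(LATENT_DIM):
            bumped = list(latent)
            bumped[i] += eps
            grad.append((score(bumped, text_embedding) - current) / eps)
        # Halve the step until the move improves the score (or give up).
        step = lr
        while step > 1e-8:
            candidate = [z + step * g for z, g in zip(latent, grad)]
            candidate_score = score(candidate, text_embedding)
            if candidate_score > current:
                latent, current = candidate, candidate_score
                break
            step /= 2
        history.append(current)
    return latent, history


# Fake "text encoding": the encoding of an image from a hidden latent,
# so a good match is reachable. Real CLIP would encode the prompt instead.
hidden_latent = [random.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
text_embedding = encode_image(generate(hidden_latent))

latent, history = optimize_latent(text_embedding)
print(f"similarity before: {history[0]:.3f}, after: {history[-1]:.3f}")
```

In the real tool the same idea would presumably run with automatic differentiation and a proper optimizer rather than this finite-difference hill climb, but the roles are the same: the prompt fixes a target encoding, and the search happens over the generator's input.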