Difference between DALL-E and a CLIP-steered GAN, e.g. big-sleep

lucidrains / big-sleep

A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN. Technique was originally created by https://twitter.com/advadnoun

MIT License

Difference between DALL-E and a CLIP-steered GAN, e.g. big-sleep #22

Open CDitzel opened 3 years ago

CDitzel commented 3 years ago

Can someone explain to me the difference between both approaches?

Both generate image content from a text input, do they not?

And what, then, is the difference between Big Sleep and Deep Daze, for that matter?

DrJKL commented 3 years ago

Big Sleep is BigGAN.
Deep Daze is SIREN.
This comment talks about the differences in what they generate: https://www.reddit.com/r/MachineLearning/comments/kzr4mg/p_the_big_sleep_texttoimage_generation_using/gl9y5rq?utm_source=share&utm_medium=web2x&context=3

SIREN being more dreamlike has also been my experience.
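To make the "CLIP-steered" part concrete, here is a minimal toy sketch of the idea as I understand it, not code from either repo. Everything in it is a hypothetical stand-in: `generator` is a tiny fixed linear map playing the role of a frozen BigGAN, `text_embedding` plays the role of CLIP's encoding of the prompt, and cosine similarity plays the role of the CLIP score. The point is only that nothing is trained on text-image pairs; the latent is nudged, step by step, until the output scores well against the prompt. Deep Daze follows the same loop but, as far as I know, updates the SIREN network's weights instead of a latent vector. DALL-E, by contrast, is a transformer trained on text-image pairs that samples an image directly, with no per-prompt optimization loop.

```python
import math


def generator(weights, latent):
    """Toy stand-in for a frozen generator: a fixed linear map latent -> 'image'."""
    return [sum(w * z for w, z in zip(row, latent)) for row in weights]


def cosine_similarity(a, b):
    """Toy stand-in for the CLIP score between an image and a text embedding."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def steer_latent(weights, text_embedding, latent, steps=500, lr=0.1, eps=1e-4):
    """Gradient ascent on the latent only; the generator weights never change."""
    latent = list(latent)

    def score(z):
        return cosine_similarity(generator(weights, z), text_embedding)

    for _ in range(steps):
        grad = []
        for i in range(len(latent)):
            # Central finite difference, so the sketch needs no autograd library.
            up = list(latent)
            up[i] += eps
            down = list(latent)
            down[i] -= eps
            grad.append((score(up) - score(down)) / (2 * eps))
        latent = [z + lr * g for z, g in zip(latent, grad)]
    return latent, score(latent)


# Hypothetical fixed values chosen purely for illustration.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_embedding = [1.0, 2.0, 3.0]
start_latent = [1.0, -0.5]

start_score = cosine_similarity(generator(weights, start_latent), text_embedding)
final_latent, final_score = steer_latent(weights, text_embedding, start_latent)
print(f"similarity before: {start_score:.3f}, after: {final_score:.3f}")
```

In the real tools the generator and the scorer are large neural networks and the gradient comes from backpropagation, but the shape of the loop is the same: generate, score against the text, adjust, repeat.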

DALL-E is out of reach for most people because it's dependent on GPT-3. But I want it so bad.

mroosen commented 3 years ago

> DALL-E is out of reach for most people because it's dependent on GPT-3. But I want it so bad.

Check out https://github.com/lucidrains/DALLE-pytorch if you haven't seen it yet.