borisdayma / dalle-mini

DALL·E Mini - Generate images from a text prompt
https://www.craiyon.com
Apache License 2.0

Any plans to share a larger DALL-E 2 model? #208

Open auwsom opened 2 years ago

auwsom commented 2 years ago

Hi Boris, thank you so much for this test app. It is very awesome!

I'm wondering what the technical challenges are in sharing a larger model.

I've read that the full model would cost around $300k, so maybe EleutherAI would be interested in assisting?

auwsom commented 2 years ago

I found this repo, though I don't know how it compares yet: https://github.com/saharmor/dalle-playground

It says: "DALL-E Mega is substantially more capable than DALL-E Mini and therefore generates higher fidelity images. If you have the computing power--either through a Google Colab Pro+ subscription or by having a strong local machine, uncomment this line before running the backend."

# DALLE_MODEL = 'dalle-mini/dalle-mini/mega-1:latest'

This still seems to use dalle-mini, just with the mega-1 model instead of the mega1fp16 model (presumably 16-bit floating point rather than fp32). This post seems to support that, and adds that adjusting the dtype is also necessary: https://huggingface.co/spaces/dalle-mini/dalle-mini/discussions/11. It also suggests that the Spaces demo page may sample a few more images depending on traffic: https://huggingface.co/spaces/dalle-mini/dalle-mini
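
On the fp16 versus fp32 point, here is a rough sketch of why the precision matters for running a bigger model locally. It only uses the fact that a float32 value takes 4 bytes and a float16 value takes 2 bytes. The parameter count below is a made-up round number for illustration, not the real size of any dalle-mini checkpoint, and the helper name is my own.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Hypothetical parameter count, chosen only for illustration.
EXAMPLE_PARAMS = 1_000_000_000

for dtype in ("float32", "float16"):
    gib = weight_memory_gib(EXAMPLE_PARAMS, dtype)
    print(f"{dtype}: {gib:.2f} GiB for weights")
```

Whatever the real parameter count is, the half-precision weights need half the memory, which is probably why a separate fp16 variant exists. This ignores activations and any other runtime overhead, so treat it as a lower bound.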

Both of these are "minor" improvements compared to the full model, so I'm still interested in how to create and share a better model. EleutherAI has a repo for training larger models, but it says there are "No pretrained models... Yet.": https://github.com/EleutherAI/DALLE-mtf (see also issue #9, "No pretrained models... yet": https://github.com/EleutherAI/DALLE-mtf/issues/9)