ashawkey / stable-dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Apache License 2.0

Implementation of Imagen Model instead of Stable Diffusion #105

Closed AngryChihuahua04 closed 1 year ago

AngryChihuahua04 commented 1 year ago

I love the current implementation of Stable Diffusion in this project; however, the Imagen model used in the original DreamFusion paper has been stated to yield better results. There is a PyTorch implementation of Imagen here: https://github.com/lucidrains/imagen-pytorch. Would there be any way to use a custom-trained Imagen model, instead of the Stable Diffusion model, to generate the 2D images that guide the 3D reconstruction?

Additional context: a lot of this ML stuff goes way over my head, as I only understand the basics, so if my understanding of how DreamFusion works is incorrect, or what I'm asking for isn't possible, please let me know.

ashawkey commented 1 year ago

@AngryChihuahua04 Hi, the problem is that the Imagen implementation does not come with a model pretrained on a large dataset. Stable Diffusion is currently the only publicly available large-scale text-to-image model.
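For context on why the guidance model is swappable in principle: DreamFusion-style pipelines only use the 2D diffusion model through Score Distillation Sampling (SDS), which asks the model to predict the noise added to a rendered image and backpropagates the (weighted) prediction error into the 3D representation. The sketch below is a minimal, self-contained illustration of that idea, not this repository's actual code; `predict_noise` is a hypothetical stand-in for any trained denoiser (Stable Diffusion's UNet, a custom Imagen, etc.), and the linear alpha-bar schedule is only for the toy example.

```python
import torch

def sds_grad(images, predict_noise, alphas_cumprod, t):
    """Sketch of the SDS gradient w(t) * (eps_hat - eps) for rendered images.

    images:         rendered views, shape (B, C, H, W), requires_grad=True
    predict_noise:  any callable (noisy_images, t) -> predicted noise
                    (hypothetical stand-in for a pretrained diffusion model)
    alphas_cumprod: per-timestep cumulative alpha schedule, shape (T,)
    t:              scalar timestep tensor
    """
    alpha_bar = alphas_cumprod[t]
    eps = torch.randn_like(images)                      # noise we inject
    noisy = alpha_bar.sqrt() * images + (1 - alpha_bar).sqrt() * eps
    eps_hat = predict_noise(noisy, t)                   # model's noise estimate
    w = 1 - alpha_bar                                   # a common weighting choice
    return w * (eps_hat - eps)                          # treated as a gradient, not a loss

# Toy usage: a dummy "model" that predicts zero noise, linear alpha-bar schedule.
alphas_cumprod = torch.linspace(0.999, 0.01, 1000)
images = torch.randn(1, 3, 8, 8, requires_grad=True)
t = torch.tensor(500)
grad = sds_grad(images, lambda x, t: torch.zeros_like(x), alphas_cumprod, t)
# Push the SDS gradient back through the renderer / 3D parameters:
images.backward(grad)
```

So mechanically, any pretrained text-to-image denoiser could be plugged in as `predict_noise`; the blocker raised above is that no large-scale pretrained Imagen checkpoint is publicly available to fill that role.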

neverix commented 1 year ago

dalle-pytorch and laionide have small pretrained pixel-level models.

AngryChihuahua04 commented 1 year ago

OK, that makes sense. Hopefully an open-source one will be released soon. Stable Diffusion is probably better than any custom-trained Imagen model would be anyway.