Hi @AmanKishore, we're doing this too, but we've opted for 2D side-view generation using standard Stable Diffusion (with tags to ensure a white background) that then gets passed in here.
We are also experimenting with 3D Stable Dreamfusion (https://github.com/ashawkey/stable-dreamfusion), but it is much, much slower and produces lower-quality results. I believe that using SD to generate your image and then running it through GET3D will be 10-20x faster and give a higher-quality result, provided you have sufficiently trained the model on similar object classes.
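For anyone curious, the 2D step is straightforward; here's a minimal sketch using the Hugging Face `diffusers` library (the model ID and prompt tags are illustrative, not our exact setup):

```python
# Minimal sketch: generate a white-background side view with Stable Diffusion,
# then hand the image to a downstream 3D pipeline such as GET3D.
# Assumes the Hugging Face `diffusers` library; model ID and prompt are illustrative.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Tags like "side view, white background" keep the object isolated, which makes
# the image easier to feed into an image-conditioned 3D model.
prompt = "a wooden chair, side view, white background, product photo"
image = pipe(prompt).images[0]
image.save("chair_side_view.png")
```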
Hi @AmanKishore,
That's definitely on my plan! Unfortunately, I may not have enough time to release the CLIP-fine-tuned model for text-guided 3D synthesis before CVPR; I expect to release this part after CVPR, when I have more free time to clean up the code.
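In the meantime, the general recipe behind CLIP-guided synthesis is to render views of the generated shape and minimize the negative CLIP similarity to the text prompt. A minimal sketch of that scoring step, using the Hugging Face `transformers` CLIP API (this illustrates the generic technique, not our fine-tuned pipeline):

```python
# Minimal sketch of the generic CLIP-guidance idea: score a rendered view
# against a text prompt. Uses the Hugging Face `transformers` CLIP API; this
# is an illustration of the technique, not the fine-tuned GET3D pipeline.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_score(rendered_image, text):
    """Cosine similarity between a rendered view (a PIL image) and a prompt.

    In a real optimization loop the render must stay a differentiable tensor
    (the PIL-based processor breaks gradients), and -score is used as the loss.
    """
    inputs = processor(text=[text], images=rendered_image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return (img * txt).sum(dim=-1).item()
```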
Hi @lalalune,
It would be awesome to try this direction! Please let me know if you have any cool results on it!
That's great! I'd love to chat, @lalalune! Sounds like an interesting use case! DM'd you on Twitter. @SteveJunGao, quick question: are you planning to release the CLIP model for text-guided 3D synthesis after June (post-CVPR)?
Hi @AmanKishore,
I don't have an exact timeline for this part right now. If you urgently need it, I recommend looking at this codebase; our implementation of text-guided 3D synthesis is based on that code.
Thank you! And was the training data mostly ShapeNet?
Yes, the training data is mostly ShapeNet!
Hi @SteveJunGao, thank you for all your help! When you release the pretrained model (Issue #16), will you also be releasing the weights for the CLIP-fine-tuned model? Excited to continue working on this!