Which "Stable Diffusion Image Variations Model" did you fine-tune?

xxlong0 / Wonder3D

Single Image to 3D using Cross-Domain Diffusion for 3D Generation

https://www.xxlong.site/Wonder3D/

GNU Affero General Public License v3.0

4.49k stars 351 forks

Which "Stable Diffusion Image Variations Model" did you fine-tune? #131

Open BlingHe opened 5 months ago

BlingHe commented 5 months ago

Hi! Thanks to the authors for sharing this great work!

Regarding what you mention in Sec. 5.1 of the paper: may I ask which "Stable Diffusion Image Variations Model" you fine-tuned? Could you provide a link to this pre-trained model?

xxlong0 commented 5 months ago

You may find details here: https://huggingface.co/lambdalabs/sd-image-variations-diffusers

BlingHe commented 5 months ago

> You may find details here: https://huggingface.co/lambdalabs/sd-image-variations-diffusers

I noticed that the "in_channels" of this image variations model is 4, but your UNet needs 8 input channels to take the additional "image_latent". How was your modified UNet trained? Was it trained jointly with the domain switcher and cross-domain attention, or was it pre-trained before the other modules were trained?

Thanks in advance!
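For readers with the same question, here is a minimal sketch of one common way to reconcile a 4-channel pretrained input layer with an 8-channel input: widen the first convolution, copy the pretrained weights into the first 4 input channels, and zero-initialize the rest. This is an assumption for illustration, not a confirmed description of how Wonder3D was trained. The helper name `expand_conv_in`, the stand-in layer, and the tensor shapes are hypothetical.

```python
import torch
import torch.nn as nn


def expand_conv_in(conv: nn.Conv2d, new_in_channels: int) -> nn.Conv2d:
    """Return a copy of `conv` that accepts `new_in_channels` input channels.

    Pretrained weights are copied into the first `conv.in_channels` input
    slices; the weights for the extra channels start at zero.
    """
    if new_in_channels < conv.in_channels:
        raise ValueError("new_in_channels must be >= conv.in_channels")
    new_conv = nn.Conv2d(
        new_in_channels,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        new_conv.weight.zero_()
        new_conv.weight[:, : conv.in_channels] = conv.weight
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


# Hypothetical stand-in for a pretrained UNet's first layer (4 channels in).
torch.manual_seed(0)
pretrained_conv_in = nn.Conv2d(4, 320, kernel_size=3, padding=1)
conv_in_8ch = expand_conv_in(pretrained_conv_in, 8)

# Concatenate the noisy latent with the conditioning image latent on dim 1.
noisy_latent = torch.randn(1, 4, 32, 32)
image_latent = torch.randn(1, 4, 32, 32)
with torch.no_grad():
    out_old = pretrained_conv_in(noisy_latent)
    out_new = conv_in_8ch(torch.cat([noisy_latent, image_latent], dim=1))
```

Because the extra weights start at zero, the widened layer initially reproduces the pretrained layer's output, so fine-tuning starts from the pretrained behavior and learns to use the image latent gradually. With a diffusers UNet, the same idea would presumably be applied to `unet.conv_in`, with the stored `in_channels` config updated to match; whether the new channels are then trained jointly with other modules or in a separate stage is exactly the open question above.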