xxlong0 / Wonder3D

Single Image to 3D using Cross-Domain Diffusion for 3D Generation
https://www.xxlong.site/Wonder3D/
GNU Affero General Public License v3.0
4.71k stars, 373 forks

Training ControlNet parameters instead of finetuning #105

Open mikeymezher opened 9 months ago

mikeymezher commented 9 months ago

Firstly, thanks for this excellent work.

After reading the paper and experimenting with the code, I thought I'd drop a suggestion. Rather than fine-tuning the weights of a pretrained LDM (Stable Diffusion) directly to account for the additional camera pose and domain conditioning, it might be beneficial to train a separate set of UNet parameters, as is done in the ControlNet architecture (https://github.com/lllyasviel/ControlNet), so that the output of the unconditioned model does not deteriorate.
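To make the idea concrete, here is a minimal, dependency-free sketch of the ControlNet-style pattern: the pretrained layer is frozen, a trainable copy receives the extra conditioning, and its output is added back through a zero-initialized projection. Everything in it (`Linear`, `ControlledBlock`, adding the condition to the input) is a hypothetical toy for illustration, not Wonder3D or ControlNet code; a real version would wrap the Stable Diffusion UNet blocks in PyTorch.

```python
import copy


class Linear:
    """Toy dense layer y = W x + b on plain Python lists (stand-in for a UNet block)."""

    def __init__(self, weight, bias):
        self.weight = [row[:] for row in weight]
        self.bias = bias[:]
        self.trainable = True

    def __call__(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weight, self.bias)]


def zero_linear(n_out, n_in):
    """Projection with all-zero weights and bias, so it outputs zeros until trained."""
    return Linear([[0.0] * n_in for _ in range(n_out)], [0.0] * n_out)


class ControlledBlock:
    """Frozen base layer plus a trainable copy joined through a zero projection.

    Assumption: the condition (e.g. a camera-pose/domain embedding) has the same
    size as the input and is simply added to it; real models inject it differently.
    """

    def __init__(self, base):
        self.base = base
        self.base.trainable = False            # pretrained weights stay untouched
        self.control = copy.deepcopy(base)     # trainable copy starts from the same weights
        self.control.trainable = True
        n = len(base.bias)
        self.zero_out = zero_linear(n, n)      # zero-initialized, trainable

    def __call__(self, x, cond):
        base_out = self.base(x)
        ctrl_in = [xi + ci for xi, ci in zip(x, cond)]
        residual = self.zero_out(self.control(ctrl_in))
        return [b + r for b, r in zip(base_out, residual)]

    def trainable_layers(self):
        """Layers an optimizer would update: the copy and the zero projection only."""
        return [layer for layer in (self.base, self.control, self.zero_out)
                if layer.trainable]


base = Linear([[1.0, 2.0], [0.5, -1.0]], [0.1, 0.0])
block = ControlledBlock(base)
x, cond = [1.0, 2.0], [0.3, -0.7]

# Before any training the residual is zero, so the block reproduces the base output.
print(block(x, cond) == base(x))
print(len(block.trainable_layers()))
```

Because the projection starts at zero, the combined model initially behaves exactly like the pretrained one, and only the copied branch moves during training, which is the property that would protect the unconditioned output.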

Apologies for making this suggestion in a GitHub issue - but I didn't see contact info on your site/paper.

flamehaze1115 commented 9 months ago

Hello. Thanks for your suggestion! Indeed, we have not run such experiments. I think your suggestion is worth trying. However, due to limited resources, we don't plan to do this in the near future. We welcome collaboration on this topic.