Training ControlNet parameters instead of finetuning

xxlong0 / Wonder3D

Single Image to 3D using Cross-Domain Diffusion for 3D Generation

GNU Affero General Public License v3.0

4.71k stars, 373 forks

Firstly, thanks for this excellent work.

After reading the paper and experimenting with the code, I thought I'd drop a suggestion. Rather than altering the pretrained LDM (Stable Diffusion) directly and fine-tuning its weights to account for the additional camera pose and domain inputs, it might be beneficial to leave those weights untouched and instead train a separate set of UNet parameters, as is done in the ControlNet architecture (https://github.com/lllyasviel/ControlNet), to prevent deterioration of the unconditioned model output.
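To make the idea concrete, here is a rough PyTorch sketch of the ControlNet-style setup I have in mind. It is not Wonder3D or ControlNet code: `TinyBlock` and `ControlledBlock` are hypothetical toy stand-ins for a single UNet block, and a real version would attach the trainable copy to the encoder blocks of the Stable Diffusion UNet. The pretrained block is frozen, a copy of it is trained on the extra conditioning, and the copy is joined to the frozen path through zero-initialized 1x1 convolutions, so the combined model starts out reproducing the pretrained output exactly.

```python
import copy

import torch
import torch.nn as nn


class TinyBlock(nn.Module):
    """Hypothetical stand-in for one pretrained UNet block (assumption, not Wonder3D code)."""

    def __init__(self, channels: int = 4):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.conv(x))


class ControlledBlock(nn.Module):
    """Frozen base block plus a trainable copy joined by zero-initialized convs."""

    def __init__(self, base: TinyBlock, channels: int = 4, cond_channels: int = 4):
        super().__init__()
        # Copy first, so the copy keeps requires_grad=True after the base is frozen.
        self.control = copy.deepcopy(base)
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)

        # Zero-initialized 1x1 convs on the way in and out of the trainable copy.
        self.zero_in = nn.Conv2d(cond_channels, channels, kernel_size=1)
        self.zero_out = nn.Conv2d(channels, channels, kernel_size=1)
        for conv in (self.zero_in, self.zero_out):
            nn.init.zeros_(conv.weight)
            nn.init.zeros_(conv.bias)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # cond stands in for an encoded camera-pose / domain signal (assumption).
        residual = self.zero_out(self.control(x + self.zero_in(cond)))
        return self.base(x) + residual


model = ControlledBlock(TinyBlock())
# Only the copy and the zero convs are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```

Because `zero_out` starts at zero, the first forward pass equals the frozen block's output, and the base weights never receive gradients, so the original model can always be recovered by dropping the control branch.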

Apologies for making this suggestion in a GitHub issue, but I didn't see any contact info on your site or in the paper.

Training ControlNet parameters instead of finetuning #105