ashawkey / stable-dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Apache License 2.0
8.21k stars, 721 forks

What can I do to alleviate the multi-face Janus problem? #37

Open liwei0826 opened 1 year ago

liwei0826 commented 1 year ago

[image]

thuanz123 commented 1 year ago

It seems like Stable Diffusion has a bias toward showing a face from every angle, even in an overhead view (like the image below), so I think a better diffusion model or a better prompt is the only solution.

[image]

lalalune commented 1 year ago

This has been considered in other work for direct generation: https://3d-diffusion.github.io. It may be possible to apply some of the concepts to Dreamfusion. I think that we could fine-tune SD on synthetic images of people, faces and common objects matching the labels that are currently used ("side view", "back view", "top view", etc.), and this fine-tune could help with Janus issues. We may also be able to use negative prompt weights for things like faces, front, etc. on the other projected views.
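To make the negative-prompt idea concrete, here is a minimal sketch in plain Python. The helper name `view_prompts`, the table `NEGATIVE_BY_VIEW`, and the exact prompt wording are all assumptions made up for illustration; this is not the repository's code, and how the negative text gets passed to the diffusion model is left out.

```python
from typing import Tuple

# Hypothetical table: which concepts to push away from for each projected view.
# The wording is an illustrative guess, not something tested against SD.
NEGATIVE_BY_VIEW = {
    "front": "",
    "side": "face, front view",
    "back": "face, front view",
    "overhead": "face, front view",
}


def view_prompts(base_prompt: str, view: str) -> Tuple[str, str]:
    """Return (positive, negative) prompt text for one projected view."""
    if view not in NEGATIVE_BY_VIEW:
        raise ValueError(f"unknown view: {view!r}")
    positive = f"{base_prompt}, {view} view"
    negative = NEGATIVE_BY_VIEW[view]
    return positive, negative


if __name__ == "__main__":
    for view in NEGATIVE_BY_VIEW:
        print(view, view_prompts("a corgi", view))
```

The front view keeps an empty negative prompt so that the face is only discouraged where it should not appear.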

blindcrone commented 1 year ago

One thing that jumps out at me is that the two "side" views currently being used in the cube share the same prompt. Even though the distinction between e.g. "view from the left/right side" or "viewed from the east/west" might be arbitrary, distinct prompts will probably do better at not duplicating features than having both "sides" of the cube generated by "side view".
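A rough sketch of what distinct side prompts could look like, in plain Python. The function name `direction_label`, the angle convention (0 degrees is the front, angles grow toward the object's left) and the 90-degree sectors are assumptions for illustration only, not how the repository picks its view text.

```python
def direction_label(azimuth_deg: float) -> str:
    """Map a camera azimuth in degrees to a view label.

    Assumed (hypothetical) convention: 0 is the front, angles grow toward
    the object's left side, and each label covers a 90-degree sector.
    """
    a = azimuth_deg % 360.0  # wrap negative or large angles into [0, 360)
    if a < 45.0 or a >= 315.0:
        return "front view"
    if a < 135.0:
        return "view from the left side"
    if a < 225.0:
        return "back view"
    return "view from the right side"


if __name__ == "__main__":
    for angle in (0, 90, 180, 270):
        print(angle, direction_label(angle))
```

The point is only that the left and right sectors no longer collapse into one "side view" string; whether SD actually tells left from right is a separate question.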

Along the same lines of improving the consistency of SD input, it might be worth making the cubic batch of images in each "training step" generate from the same "seed" in SD, to improve the consistency of what is generated.
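As a toy illustration of the shared-seed idea, the sketch below uses Python's standard `random` module as a stand-in for the noise sampler; in a real pipeline this would presumably be a seeded `torch.Generator` instead. The function name `shared_seed_noise` and its parameters are hypothetical.

```python
import random
from typing import List


def shared_seed_noise(seed: int, n_views: int, n_values: int) -> List[List[float]]:
    """Draw one Gaussian noise vector per view, all from the same seed."""
    noise = []
    for _ in range(n_views):
        # Re-seed for every view, so each view draws identical values.
        rng = random.Random(seed)
        noise.append([rng.gauss(0.0, 1.0) for _ in range(n_values)])
    return noise


if __name__ == "__main__":
    views = shared_seed_noise(seed=1234, n_views=4, n_values=3)
    # Check that every view in the batch received the same noise.
    print(all(v == views[0] for v in views))
```

Using a new seed per training step while sharing it within the batch would keep the views of one step consistent without freezing the noise for the whole run; whether that helps with the Janus problem would need to be tested.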