pals-ttic / sjc

Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation (CVPR 2023)
https://pals.ttic.edu/p/score-jacobian-chaining

Different angles in SD #10

Open FarggrossenOskar opened 1 year ago

FarggrossenOskar commented 1 year ago

I'm curious how you get the same image for each angle. If I were to write "chair front view", "chair side view", "chair back view", etc. in SD, it would give me an entirely different chair in each image I generate. So how does this system generate a chair that looks the same from different angles in each reference image?

w-hc commented 1 year ago

The underlying 3D representation (NeRF) constrains the system to be view-consistent. It is true that at each iteration, the guidance provided by the 2D diffusion model is pretty random. But over many iterations and viewpoints, those conflicting signals are merged and resolved in the 3D body. The ultimate goal of the 2D diffusion model is to make things look realistic, and towards that goal it will play along with whatever is rendered and presented to it.
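To make the "merged and resolved in the 3D body" point concrete, here is a minimal toy sketch in NumPy. It is not the SJC code. The voxel grid stands in for the NeRF, `render` is a simple axis-aligned average rather than volume rendering, and `noisy_guidance` is a hypothetical stand-in for the 2D diffusion guidance: a pull toward a plausible image that is buried in noise on any single call. All names and constants here are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8  # toy grid resolution (assumption)

# Hidden "plausible" object the toy guidance pulls toward: a solid cube.
truth = np.zeros((N, N, N))
truth[2:6, 2:6, 2:6] = 1.0


def render(volume, axis):
    """Toy orthographic render: average the density along one viewing axis."""
    return volume.mean(axis=axis)


def noisy_guidance(image, axis, noise=1.0):
    """Hypothetical stand-in for 2D guidance on one rendered view.

    Returns a direction toward a plausible image, corrupted by noise that is
    larger than the signal itself, so a single call is close to random.
    """
    target = render(truth, axis)
    return (target - image) + noise * rng.standard_normal(image.shape)


def view_error(volume):
    """Mean absolute render error, averaged over the three views."""
    return float(np.mean([np.abs(render(volume, a) - render(truth, a)).mean()
                          for a in range(3)]))


volume = np.zeros((N, N, N))  # the single shared 3D state
initial_err = view_error(volume)

lr = 0.01
for step in range(20000):
    axis = int(rng.integers(3))           # pick a random viewpoint
    image = render(volume, axis)          # render the current 3D state
    g2d = noisy_guidance(image, axis)     # noisy 2D signal for that view
    # Push the 2D signal back through the renderer: the Jacobian of a mean
    # over N voxels spreads each pixel's value along the ray with weight 1/N.
    volume += lr * np.expand_dims(g2d, axis) / N

final_err = view_error(volume)
print(f"render error before: {initial_err:.3f}, after: {final_err:.3f}")
```

Every update lands in the same `volume`, so the per-step noise averages out and only a shape that agrees with all viewpoints survives. With independent 2D images per prompt there is no such shared state, which is why plain SD gives a different chair every time.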

The analogy I like most is SDEdit by Meng et al. (I provided links and discussion on the project website). It shows how cooperative and intervention-friendly diffusion models can be.