How to control the output image to be in those six viewpoints without showing the input camera position?

SUDO-AI-3D / zero123plus

Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.

Apache License 2.0


How to control the output image to be in those six viewpoints without showing the input camera position? #43

Open sandydf opened 7 months ago

sandydf commented 7 months ago

Thanks for releasing the code! I would like to know how to control the output image to be in those six viewpoints without showing the input camera position.

eliphatfs commented 7 months ago

Sorry, but I didn't quite get your question. Could you give an example or elaborate? Thanks.

sandydf commented 7 months ago

Thank you for your reply! I want to know how you constrain the output to the fixed six viewing angles, so that no images are generated outside those six angles. In another issue I saw that you do not explicitly use any camera pose input during training or inference, so I'm not sure how you control the synthesis of the novel views from those six specific angles.

sandydf commented 7 months ago

I have another question: why does the decoding of latents directly result in one large image that includes six novel views? I don’t quite understand how this is achieved. I look forward to your reply, thank you!

eliphatfs commented 7 months ago

Because the model is trained that way, and it is able to infer the fixed novel-view angles from the given input image.
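To illustrate what this could look like in practice: the Zero123++ paper describes tiling the six views in a 3 × 2 layout into a single frame, so the diffusion target is already one large image and decoding the latents yields the whole grid at once. Below is a minimal sketch of that tiling and of cutting the grid back into views. It is not the repository's code: the function names `tile_views` and `split_views`, the row-major ordering, and the representation of an image as a list of pixel rows are all assumptions made for illustration.

```python
ROWS, COLS = 3, 2  # 3 x 2 layout; row-major view order is an assumption


def tile_views(views):
    """Tile six equally sized views (each a list of pixel rows) into one grid image."""
    if len(views) != ROWS * COLS:
        raise ValueError("expected exactly six views")
    grid = []
    for r in range(ROWS):
        row_views = views[r * COLS:(r + 1) * COLS]
        # Join the matching pixel row of each view in this grid row, left to right.
        for pixel_rows in zip(*row_views):
            grid.append([p for row in pixel_rows for p in row])
    return grid


def split_views(grid):
    """Cut the grid image back into six views, in the same row-major order."""
    h = len(grid) // ROWS
    w = len(grid[0]) // COLS
    return [
        [line[c * w:(c + 1) * w] for line in grid[r * h:(r + 1) * h]]
        for r in range(ROWS)
        for c in range(COLS)
    ]
```

If every training target is built this way, with the same relative camera pose always rendered into the same tile, the pose never needs to be passed as an input: the position in the grid implies it. With NumPy or PIL images the same idea reduces to array slicing or cropping.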

sandydf commented 7 months ago

Thank you for your reply. Could you tell me how to train the model so that it has this ability?