facebookresearch / ViewDiff

ViewDiff generates high-quality, multi-view consistent images of a real-world 3D object in authentic surroundings (CVPR 2024).

pipeline question / paper question #16

Closed: sararoma95 closed this issue 5 months ago

sararoma95 commented 5 months ago
[Attached image: Screenshot 2024-05-07 at 14 12 57]

Hello,

I’m reviewing the network proposed in your paper for rendering multiple images from one or two input images. In the pipeline diagram, the input and output images appear to be swapped between the top and bottom parts. Is this arrangement intentional, or is it an error in the diagram?

Could you clarify:

  1. Whether the image arrangement in the diagram is intentional.
  2. The reasoning behind this configuration, if it's intentional.

Thank you for your help and for the interesting paper.

lukasHoel commented 5 months ago

Hi, thanks for the question. It's a slight inaccuracy in the pipeline figure. We compare the predicted and sampled noise for the same image.
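
To illustrate the point above, here is a minimal sketch of a standard noise-prediction (epsilon-prediction) diffusion loss in plain Python. It is not code from the ViewDiff repository: the function names (`add_noise`, `noise_prediction_loss`), the `predict_noise` callback standing in for the network, and the flat-list image representation are all assumptions made for illustration. The only thing it is meant to show is that the predicted noise for view `i` is compared with the noise that was sampled for that same view `i`, not with the noise of another view.

```python
import math
import random

def add_noise(image, noise, alpha_bar):
    """Forward diffusion: x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    return [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
            for x, e in zip(image, noise)]

def mse(a, b):
    """Mean squared error between two equally long lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def noise_prediction_loss(images, predict_noise, alpha_bar, rng):
    """Average per-view MSE between predicted and sampled noise.

    images:        list of views, each a flat list of pixel values (hypothetical layout)
    predict_noise: hypothetical stand-in for the network; maps the list of
                   noisy views to a list of predicted noise tensors
    """
    # Sample independent Gaussian noise for every view.
    sampled = [[rng.gauss(0.0, 1.0) for _ in img] for img in images]
    # Noise each view with its own sampled noise.
    noisy = [add_noise(img, eps, alpha_bar) for img, eps in zip(images, sampled)]
    # All views are denoised jointly, one prediction per view.
    predicted = predict_noise(noisy)
    # Index-aligned comparison: prediction i is matched with sampled noise i.
    per_view = [mse(p, e) for p, e in zip(predicted, sampled)]
    return sum(per_view) / len(per_view)
```

With this pairing, a predictor that recovers each view's own noise drives the loss to zero, whereas one that returns the noise of a different view does not.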

sararoma95 commented 5 months ago

Thanks