Tangshitao / MVDiffusion

MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion, NeurIPS 2023 (spotlight)

question about dataset for panorama image generation #8

Open HL4214 opened 10 months ago

HL4214 commented 10 months ago

Thank you for sharing your work. After reading the paper, I have a question about the Matterport3D dataset.

Matterport3D only provides surface reconstructions, camera poses, and 2D and 3D semantic segmentations, but no text prompts, so how did you obtain the text prompt for each panorama image? Is the same true for the ScanNet dataset?

Tangshitao commented 10 months ago

I use BLIP-2 (https://github.com/salesforce/LAVIS/tree/main) to generate prompts.

HL4214 commented 10 months ago

Is the whole panorama fed directly to BLIP-2 as input? Or is it cropped into multiple images first?

Tangshitao commented 10 months ago

Convert each panorama into 8 perspective images and caption each perspective image independently.
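
For readers who want to reproduce the conversion step, here is a minimal sketch of splitting an equirectangular panorama into 8 perspective views with NumPy. It is not the script used for MVDiffusion. The function names, the 90° field of view, the evenly spaced 45° yaw steps, the zero pitch, and the nearest-neighbour sampling are all assumptions made for illustration; the thread only states that 8 perspective images are used.

```python
import numpy as np


def perspective_from_panorama(pano, yaw_deg, fov_deg=90.0, size=256):
    """Sample one square pinhole view from an equirectangular panorama.

    Hypothetical helper: nearest-neighbour lookup, zero pitch and roll.
    `pano` is an (H, W) or (H, W, C) array covering 360 x 180 degrees.
    """
    h, w = pano.shape[:2]
    # Focal length in pixels for the requested field of view.
    f = 0.5 * size / np.tan(np.radians(fov_deg) / 2.0)
    # Pixel-centre offsets from the principal point (x right, y down).
    coords = np.arange(size) - (size - 1) / 2.0
    x, y = np.meshgrid(coords, coords)
    z = np.full_like(x, f)
    # Rotate the camera rays about the vertical axis by the yaw angle.
    yaw = np.radians(yaw_deg)
    xr = x * np.cos(yaw) + z * np.sin(yaw)
    zr = -x * np.sin(yaw) + z * np.cos(yaw)
    # Ray direction -> longitude/latitude on the sphere.
    lon = np.arctan2(xr, zr)                # in [-pi, pi]
    lat = np.arctan2(y, np.hypot(xr, zr))   # in [-pi/2, pi/2], positive is down
    # Longitude/latitude -> panorama pixel indices.
    u = (lon / (2.0 * np.pi) + 0.5) * w
    v = (lat / np.pi + 0.5) * h
    ui = np.floor(u).astype(int) % w                 # wrap around horizontally
    vi = np.clip(np.floor(v).astype(int), 0, h - 1)  # clamp at the poles
    return pano[vi, ui]


def panorama_to_views(pano, n_views=8, fov_deg=90.0, size=256):
    """Return `n_views` perspective crops at evenly spaced yaw angles."""
    yaws = [i * 360.0 / n_views for i in range(n_views)]
    return [perspective_from_panorama(pano, yaw, fov_deg, size) for yaw in yaws]
```

Each array returned by `panorama_to_views` can then be passed to BLIP-2 on its own, giving one prompt per view. With 8 views and a 90° field of view, neighbouring views overlap by 45°; a different field of view would change that overlap.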