Tangshitao / MVDiffusion

MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion, NeurIPS 2023 (spotlight)

Question about dataset for panorama image generation #8

Open HL4214 opened 1 year ago

HL4214 commented 1 year ago

Thank you for sharing your work. After reading the paper, I have a question about the Matterport3D dataset.

Matterport3D only provides surface reconstructions, camera poses, and 2D and 3D semantic segmentations, with no text prompts, so I would like to know how you obtained the text prompt for each panorama image. Is it the same for the ScanNet dataset?

Tangshitao commented 1 year ago

I use BLIP-2 (https://github.com/salesforce/LAVIS/tree/main) to generate the prompts.

HL4214 commented 1 year ago

Do you feed the whole panorama directly into BLIP-2, or do you crop it into multiple images first?

Tangshitao commented 1 year ago

I convert each panorama into 8 perspective images and process each perspective image independently with BLIP-2.
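For readers who want to try the panorama-to-perspective step, here is a minimal sketch of one way it could be done. It is not taken from the MVDiffusion repository. It assumes an equirectangular panorama, 8 views spaced evenly in yaw, a 90 degree field of view, and nearest-neighbour sampling; the thread only states the number of views, so the field of view, the output size, and the helper names `view_yaws`, `perspective_to_pano_pixel`, and `crop_view` are all illustrative assumptions.

```python
import math


def view_yaws(num_views=8):
    """Yaw angles in degrees for evenly spaced horizontal views."""
    return [i * 360.0 / num_views for i in range(num_views)]


def perspective_to_pano_pixel(x, y, size, fov_deg, yaw_deg, pano_w, pano_h):
    """Map pixel (x, y) of a square perspective view to (u, v) in the panorama.

    Assumes an equirectangular panorama whose horizontal centre is yaw 0.
    """
    # Focal length in pixels for the assumed field of view.
    f = 0.5 * size / math.tan(math.radians(fov_deg) / 2)
    # Ray through the pixel centre; camera frame is x right, y down, z forward.
    dx = x + 0.5 - size / 2
    dy = y + 0.5 - size / 2
    dz = f
    # Rotate the ray about the vertical axis by the view's yaw.
    yaw = math.radians(yaw_deg)
    rx = dx * math.cos(yaw) + dz * math.sin(yaw)
    rz = -dx * math.sin(yaw) + dz * math.cos(yaw)
    # Convert the ray to longitude/latitude, then to panorama coordinates.
    lon = math.atan2(rx, rz)                  # in [-pi, pi]
    lat = math.atan2(dy, math.hypot(rx, rz))  # in [-pi/2, pi/2], positive is down
    u = (lon / (2 * math.pi) + 0.5) * pano_w
    v = (lat / math.pi + 0.5) * pano_h
    return u % pano_w, min(max(v, 0.0), pano_h - 1)


def crop_view(pano, yaw_deg, size=512, fov_deg=90.0):
    """Sample one perspective view from a panorama given as a list of rows."""
    pano_h, pano_w = len(pano), len(pano[0])
    view = []
    for y in range(size):
        row = []
        for x in range(size):
            u, v = perspective_to_pano_pixel(
                x, y, size, fov_deg, yaw_deg, pano_w, pano_h
            )
            # Nearest-neighbour lookup; the modulo guards the wrap-around seam.
            row.append(pano[int(v)][int(u) % pano_w])
        view.append(row)
    return view


def panorama_to_views(pano, num_views=8, size=512, fov_deg=90.0):
    """Split a panorama into num_views perspective images."""
    return [crop_view(pano, yaw, size, fov_deg) for yaw in view_yaws(num_views)]
```

Each of the resulting views could then be passed to a captioning model separately, as described above. The pure-Python loops keep the sketch dependency-free but are slow for real images; a vectorised version with NumPy or OpenCV would be the practical choice.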