yanghb22-fdu / Hi3D-Official

[MM24] Official codes and datasets for ACM MM24 paper "Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models".
MIT License
199 stars 12 forks

multiple images #3

Open unnamed333user opened 1 month ago

unnamed333user commented 1 month ago

Thanks for sharing such great work! I have a question: how can the model generate the back side of an object if the input is only a front-view image of it?

yanghb22-fdu commented 1 month ago

Our work starts from SVD (Stable Video Diffusion). We assume that SVD has seen a large volume of multi-view images during training. Building on this, we further trained our model on the multi-view dataset from Objaverse to fully leverage SVD's capability for generating multi-view images. For more details, please refer to our paper.

Vincento-Wang commented 1 month ago

> Thanks for sharing such great work! I have a question: how can the model generate the back side of an object if the input is only a front-view image of it?

I think the back views are either random but natural-looking generations from Stable Video Diffusion, or the result of overfitting to Objaverse. I have seen many works built on the Objaverse dataset, and none of their results look that realistic.