alievk / npbg

Neural Point-Based Graphics
MIT License
324 stars 51 forks

Why can't directly use .ply file provided by Scannet Dataset #4

Closed ousuixin closed 4 years ago

ousuixin commented 4 years ago

Hello, thank you for your excellent work and the open source code! Following the guide in the readme, I can successfully fit a new scene by building the reconstruction with Agisoft Metashape Pro and then fitting descriptors. However, when I directly use the reconstructions provided in the ScanNet dataset (e.g. http://kaldir.vc.in.tum.de/scannet/v2/scans/scene0001_01/scene0000_00_vh_clean.ply), I cannot fit descriptors correctly. Are there differences between the point cloud built by Agisoft Metashape Pro and the point cloud provided by ScanNet? And how can I fit a ScanNet scene such as scene0000_00 with the point cloud provided by the dataset? Thank you for your reply!

ousuixin commented 4 years ago

I finally found that the above problem was caused by differences between the camera matrices provided by Agisoft Metashape Pro and those provided by the ScanNet dataset, not by the point cloud.

I solved the problem by editing the pose files provided by ScanNet. **Just change the sign of the second and third columns in all 'pose/*.txt' files.**

For example, '0.txt' in scene0000_00 has the following content:

        -0.955421  0.119616 -0.269932  2.655830
         0.295248  0.388339 -0.872939  2.981598
         0.000408 -0.913720 -0.406343  1.368648
         0.000000  0.000000  0.000000  1.000000

After changing the sign of the second and third columns, I get:

        -0.955421 -0.119616  0.269932  2.655830
         0.295248 -0.388339  0.872939  2.981598
         0.000408  0.913720  0.406343  1.368648
         0.000000  0.000000  0.000000  1.000000

After that, I can fit descriptors correctly for scene0000_00 in the ScanNet dataset. However, I would still like to know why these steps solve the problem. What does reversing the sign of the second and third columns of a camera matrix mean (in effect, reversing the sign of the second and third columns of the rotation matrix)?
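The column flip described above can be sketched in a few lines of Python. This is only a minimal illustration, not code from the npbg repository: the helper names `flip_yz_columns` and `convert_pose_dir` are hypothetical, and it assumes each pose file holds a 4x4 matrix as four whitespace-separated rows, as in the '0.txt' example.

```python
from pathlib import Path


def negate(value):
    # Avoid printing "-0.000000" for the zeros in the bottom row.
    return 0.0 if value == 0 else -value


def flip_yz_columns(pose_text):
    """Negate the 2nd and 3rd columns of a 4x4 pose given as text."""
    rows = [[float(v) for v in line.split()]
            for line in pose_text.strip().splitlines()]
    if len(rows) != 4 or any(len(r) != 4 for r in rows):
        raise ValueError("expected a 4x4 matrix")
    flipped = [[r[0], negate(r[1]), negate(r[2]), r[3]] for r in rows]
    return "\n".join(" ".join(f"{v:.6f}" for v in r) for r in flipped)


def convert_pose_dir(src_dir, dst_dir):
    """Write flipped copies of every .txt pose in src_dir to dst_dir.

    The directory layout is an assumption; originals are left untouched.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for src_file in sorted(Path(src_dir).glob("*.txt")):
        flipped = flip_yz_columns(src_file.read_text())
        (dst / src_file.name).write_text(flipped + "\n")
```

A call such as `convert_pose_dir("scene0000_00/pose", "scene0000_00/pose_flipped")` (both paths are placeholders) would then produce the edited files without overwriting the ScanNet originals. Applying the flip twice returns the original matrix, so it is easy to undo.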

seva100 commented 4 years ago

@ousuixin, you're right that this correction might be needed; we also reverse the sign of the second and third columns automatically when view matrices are loaded from an XML file produced by Agisoft Metashape (see L206). I believe this is done to transform the coordinate system to the conventional OpenGL format, but most likely @alievk could give a more detailed explanation.

alievk commented 4 years ago

Correct. The ScanNet dataset provides extrinsics in the OpenCV coordinate frame: +X right, +Y down, +Z forward.

However, we use the OpenGL coordinate frame, which is the OpenCV frame rotated by 180 degrees around the X axis; this is effectively an inversion of the Y and Z axes.
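To make the link between the rotation and the column flip concrete, here is a small sketch in plain Python. It assumes the pose is a 4x4 camera-to-world matrix, so that changing the camera frame corresponds to multiplying on the right; the function names are illustrative and do not come from the npbg code.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def rot_x(theta):
    """Homogeneous 4x4 rotation by theta radians around the X axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0],
            [0, c, -s, 0],
            [0, s, c, 0],
            [0, 0, 0, 1]]


# Rotating by 180 degrees around X gives diag(1, -1, -1, 1),
# up to floating-point error in sin(pi).
FLIP_YZ = [[1, 0, 0, 0],
           [0, -1, 0, 0],
           [0, 0, -1, 0],
           [0, 0, 0, 1]]


def opencv_to_opengl(cam_to_world):
    """Re-express a camera-to-world pose for a camera with flipped Y and Z axes.

    Right-multiplying by FLIP_YZ negates the 2nd and 3rd columns and
    leaves the 1st column and the translation column unchanged.
    """
    return matmul(cam_to_world, FLIP_YZ)
```

Because `FLIP_YZ` is its own inverse, the same function also converts a pose back from the OpenGL convention to the OpenCV one.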

ousuixin commented 4 years ago

Thank you for your reply!