minghanqin / LangSplat

Official implementation of the paper "LangSplat: 3D Language Gaussian Splatting" [CVPR2024 Highlight]
https://langsplat.github.io/

Querying the most relevant 3D Gaussians #26

Open xxlbigbrother opened 3 months ago

xxlbigbrother commented 3 months ago

I couldn't find any code for inputting text and querying the most relevant 3D Gaussians in the code repository. Will it be provided later?

minghanqin commented 3 months ago

Thanks for your attention. The eval code has been released.

xxlbigbrother commented 3 months ago

Thank you for the quick code update. However, I found that the ground truth for lerf_ovs is in 2D, so how can we achieve 3D object localization? I still don't know how to query the original 3D Gaussian points. Thank you for your help.

Li-Wanhua commented 3 months ago

Thank you for your attention to our work. There are two ways to achieve 3D text querying. The first, as you mentioned, directly computes the similarity between the 3D Gaussian points and the text query. The second first renders the 3D language Gaussians onto a 2D image plane using Gaussian Splatting, then computes the similarity between the text query and the language features at each 2D image pixel. Previous SOTA works such as LERF adopted the second method because NeRF's implicit modeling prevented the use of the first. To ensure a fair comparison, we also employed the second method. However, our approach can indeed be tested using the first method, and we will explore it in the future to see whether it yields better performance. I hope this explanation addresses your questions.
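To make the difference between the two approaches concrete, here is a minimal NumPy sketch. It is not code from the LangSplat repository: the function names `query_gaussians` and `query_pixels` are hypothetical, and the sketch assumes that the per-Gaussian (or per-pixel) language features have already been brought into the same embedding space as the text embedding. Tiny hand-made arrays stand in for real features.

```python
import numpy as np

def _unit(x, eps=1e-8):
    """L2-normalise along the last axis; eps guards against zero vectors."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def query_gaussians(gaussian_feats, text_embed, top_k=5):
    """Approach 1 (hypothetical helper): score every 3D Gaussian directly.

    gaussian_feats: (N, D) array, one language feature per Gaussian.
    text_embed:     (D,) array, embedding of the text query.
    Returns the indices of the top_k Gaussians and their cosine similarities.
    """
    scores = _unit(gaussian_feats) @ _unit(text_embed)  # (N,)
    top = np.argsort(-scores)[:top_k]                   # highest score first
    return top, scores[top]

def query_pixels(feature_map, text_embed):
    """Approach 2 (hypothetical helper): score a rendered 2D feature map.

    feature_map: (H, W, D) array of language features rendered to an image.
    Returns an (H, W) map of cosine similarities to the text query.
    """
    return _unit(feature_map) @ _unit(text_embed)

# Toy data: 4 "Gaussians" with 3-dimensional features (assumed values).
feats = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.9, 0.1, 0.0],
                  [0.0, 0.0, 1.0]])
text = np.array([1.0, 0.0, 0.0])

idx, sims = query_gaussians(feats, text, top_k=2)

# Reuse the same features as a 2x2 "rendered" feature map.
relevance = query_pixels(feats.reshape(2, 2, 3), text)
best_pixel = np.unravel_index(np.argmax(relevance), relevance.shape)
```

In the first function the result is a set of Gaussian indices, so the answer lives in 3D; in the second the result is a per-pixel relevance map for one rendered view, which is what a 2D ground truth can be compared against.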

xxlbigbrother commented 1 week ago

Thanks for your kind help!