hustvl / Symphonies

[CVPR 2024] Symphonies (Scene-from-Insts): Symphonize 3D Semantic Scene Completion with Contextual Instance Queries
https://arxiv.org/abs/2306.15670
MIT License

Does the SemanticKITTI dataset provide depth ground truth? #25

Open UQHTy opened 1 month ago

UQHTy commented 1 month ago

Hi, brilliant work! I would like to know whether the SemanticKITTI dataset provides official depth ground truth (GT). I need accurate depth information. Many thanks.

npurson commented 1 month ago

Thanks for reaching out! The SemanticKITTI dataset doesn't come with official depth ground truth. But since it's based on the KITTI dataset, you might be able to use the depth labels from KITTI. Just keep in mind that I'm not totally sure how well that would work. Good luck with your project!

UQHTy commented 1 month ago

@npurson Thanks for your prompt reply. I have referenced the functions pix2cam and cam2vox from Symphonies/ssc_pl/models/utils/utils.py and applied them using LiDAR depth values. However, I am unable to establish the correct correspondence between image pixels and voxels. When I project the depth values using the camera intrinsics, extrinsics, voxel parameters, and scene scale, the resulting voxels do not align with the occupancy coordinate system. There appears to be a coordinate shift that causes a mismatch with the occupancy ground truth.

Could you please help me understand what might be causing this issue?
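
For reference, the pixel-to-voxel projection described above can be sketched in NumPy as follows. This is a minimal standalone sketch, not the repository's pix2cam / cam2vox: the function name pixels_to_voxels and all parameter names are hypothetical. It assumes the commonly used SemanticKITTI scene-completion grid (origin (0, -25.6, -2.0) m in the LiDAR frame, 0.2 m voxels, 256 x 256 x 32 cells), an extrinsic matrix that maps LiDAR-frame points into the camera frame, depth measured along the camera z-axis, and intrinsics that match the resolution of the image the pixels come from.

```python
import numpy as np

# Assumed SemanticKITTI scene-completion grid conventions (not stated in this thread).
VOX_ORIGIN = np.array([0.0, -25.6, -2.0])  # grid corner in the LiDAR frame, metres
VOX_SIZE = 0.2                              # edge length of one voxel, metres
GRID_SHAPE = np.array([256, 256, 32])       # number of voxels along x, y, z


def pixels_to_voxels(uv, depth, K, T_velo_to_cam,
                     vox_origin=VOX_ORIGIN, vox_size=VOX_SIZE,
                     grid_shape=GRID_SHAPE):
    """Map pixels with metric depth to integer voxel indices (hypothetical helper).

    uv:            (N, 2) pixel coordinates, u = column, v = row.
    depth:         (N,) depth along the camera z-axis, in metres.
    K:             (3, 3) camera intrinsics for the image the pixels come from.
    T_velo_to_cam: (4, 4) transform taking LiDAR-frame points to the camera frame
                   (assumption; invert your matrix first if it goes the other way).
    Returns (idx, valid): (N, 3) voxel indices and an (N,) mask of in-grid points.
    """
    uv = np.asarray(uv, dtype=np.float64)
    depth = np.asarray(depth, dtype=np.float64)

    # Back-project: p_cam = depth * K^-1 @ [u, v, 1]^T
    pix_h = np.concatenate([uv, np.ones((len(uv), 1))], axis=1)
    p_cam = (np.linalg.inv(K) @ pix_h.T).T * depth[:, None]

    # Camera frame -> LiDAR frame, where the voxel grid is defined.
    p_cam_h = np.concatenate([p_cam, np.ones((len(p_cam), 1))], axis=1)
    p_velo = (np.linalg.inv(T_velo_to_cam) @ p_cam_h.T).T[:, :3]

    # Metric coordinates -> voxel indices. floor() assigns each point to the
    # cell that contains it; a convention that subtracts a half-voxel offset
    # and rounds instead would differ from this at cell boundaries.
    idx = np.floor((p_velo - vox_origin) / vox_size).astype(np.int64)
    valid = np.all((idx >= 0) & (idx < grid_shape), axis=1) & (depth > 0)
    return idx, valid
```

If the output of a sketch like this is shifted relative to the occupancy ground truth, the items marked as assumptions are the first things to compare against the repository code: the direction of the extrinsic matrix, the voxel origin, the half-voxel offset convention, and whether the intrinsics were rescaled along with the image.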