CVMI-Lab / PLA

(CVPR 2023) PLA: Language-Driven Open-Vocabulary 3D Scene Understanding & (CVPR2024) RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding
Apache License 2.0

The 2D-3D projection on S3DIS #13

Closed zhang-zihui closed 1 year ago

zhang-zihui commented 1 year ago

Dear Runyu and Jihan,

Thanks for this inspiring work. I'm curious about the 2D-3D projection on S3DIS. The provided "s3dis_view_vit-gpt2_matching_idx.zip" data already seems to include the corresponding indices. How were they generated?

regards, zihui

Dingry commented 1 year ago

Hi, the code for generating the caption index for S3DIS hasn't been released yet. We will release it soon.

HalvesChen commented 1 year ago

@Dingry Thanks, I await your release :)

Pixie8888 commented 1 year ago

Hi,

When do you plan to release the code for generating the captions for S3DIS?

Dingry commented 1 year ago

Hi, this is a script that you can use for now. We are working on an optimized projection method and integrating it into the code. Please note that you need to install PointGroup from [this source] before running the script. The code is partly based on [this link]. map_to_pc_s3dis.py.zip

Dingry commented 1 year ago

Hello, we have also implemented an optimized matching index computing method through point-to-image projection. You can find the code here: [https://github.com/CVMI-Lab/PLA/commit/bb0aaa549c98f9019be27a890270203d374bd596]. This code is still under testing and we will evaluate it in a week. We hope this is helpful to you. If you have any questions or issues regarding the code, please feel free to contact us. Thanks for your patience.

Pixie8888 commented 1 year ago

> Hi, this is a script that you can use for now. We are working on an optimized projection method and integrating it into the code. Please note that you need to install PointGroup from [this source] before running the script. The code is partly based on [this link]. map_to_pc_s3dis.py.zip

Thank you for your code! I have some questions regarding the intrinsic and extrinsic matrices provided in S3DIS.

  1. Is data['camera_k_matrix'] the intrinsic matrix for both the RGB and depth images?
  2. Is the camera pose data['camera_rt_matrix'] provided by S3DIS the world-to-camera transformation? Is the depth image's extrinsic matrix the same as the RGB image's?

Dingry commented 1 year ago

Yes. data['camera_k_matrix'] is the intrinsic matrix and data['camera_rt_matrix'] is the extrinsic matrix. As far as I recall, they are the same for the RGB and depth images.
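For readers following along, here is a minimal sketch of how the two matrices are typically combined to project points into an image. It is not the code from map_to_pc_s3dis.py. It assumes that camera_rt_matrix is a 3x4 world-to-camera [R|t] matrix (the point asked about above) and that camera_k_matrix is a standard 3x3 pinhole intrinsic matrix. The function name project_points is hypothetical.

```python
import numpy as np


def project_points(points_world, k_matrix, rt_matrix):
    """Project N x 3 world points to pixel coordinates.

    Assumption: rt_matrix is a 3x4 world-to-camera [R|t] matrix and
    k_matrix is a 3x3 pinhole intrinsic matrix.
    Returns (uv, depth): N x 2 pixel coordinates and N camera-space z values.
    """
    points_world = np.asarray(points_world, dtype=np.float64)
    k = np.asarray(k_matrix, dtype=np.float64)
    rt = np.asarray(rt_matrix, dtype=np.float64)[:3, :4]

    # Homogeneous world coordinates, then world -> camera frame.
    ones = np.ones((points_world.shape[0], 1))
    points_cam = np.hstack([points_world, ones]) @ rt.T  # N x 3
    depth = points_cam[:, 2]

    # Camera frame -> image plane, followed by the perspective divide.
    uvw = points_cam @ k.T
    with np.errstate(divide="ignore", invalid="ignore"):
        uv = uvw[:, :2] / uvw[:, 2:3]
    return uv, depth
```

Points with depth <= 0 lie behind the camera and should be discarded before their pixel coordinates are used. If the stored pose were camera-to-world instead, it would need to be inverted first.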

xuxiaoxxxx commented 1 year ago

Hi, after using the code to project the point cloud into the image, I found that something is wrong. The left picture is the target image, and the right one is the point cloud projected onto the image plane. You can see that the projection is offset relative to the target image. The point cloud I used is stanford_indoor3d_inst.zip, which you provided. Do I need to multiply by a rotation matrix to fix this? Also, your depth-occlusion threshold seems large; should I decrease it?
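For context, the depth-occlusion check mentioned here can be sketched as follows. This is only an illustration, not the released script: the function name visible_mask and the 0.05 default threshold are hypothetical, and the depth map is assumed to be already converted to the same units as the projected point depths.

```python
import numpy as np


def visible_mask(uv, depth, depth_map, threshold=0.05):
    """Return a boolean mask of points that pass a depth-occlusion test.

    uv: N x 2 projected pixel coordinates (u = column, v = row).
    depth: N projected camera-space depths.
    depth_map: H x W depth image in the same units as depth (assumption).
    threshold: hypothetical tolerance; a point is kept when its depth
    differs from the depth image by at most this amount.
    """
    uv = np.asarray(uv, dtype=np.float64)
    depth = np.asarray(depth, dtype=np.float64)
    depth_map = np.asarray(depth_map, dtype=np.float64)
    height, width = depth_map.shape

    # Only points in front of the camera with finite pixel coordinates.
    in_front = (depth > 0) & np.isfinite(uv).all(axis=1)
    cols = np.zeros(len(depth), dtype=np.int64)
    rows = np.zeros(len(depth), dtype=np.int64)
    # Clip before casting so far-away projections cannot overflow int64.
    cols[in_front] = np.clip(np.round(uv[in_front, 0]), -1, width).astype(np.int64)
    rows[in_front] = np.clip(np.round(uv[in_front, 1]), -1, height).astype(np.int64)

    in_image = in_front & (cols >= 0) & (cols < width) & (rows >= 0) & (rows < height)

    # Compare each in-image point with the depth observed at its pixel.
    mask = np.zeros(len(depth), dtype=bool)
    observed = depth_map[rows[in_image], cols[in_image]]
    mask[in_image] = np.abs(observed - depth[in_image]) <= threshold
    return mask
```

A smaller threshold rejects more points hidden behind other surfaces, but it also rejects correctly visible points when the depth image or the point cloud is noisy.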

Dingry commented 1 year ago

Hi, can you tell me the scene name and the image name? I will look into this issue. Thanks!

intro965 commented 1 year ago

> Hi, can you tell me the scene name and the image name? I will look into this issue. Thanks!

The file name of the image may be camera_0ccf3c78ef354902b516c62ef8fb7cf1_lounge_2_frame_12_domain_rgb.png in area3.

xuxiaoxxxx commented 1 year ago

> Hi, can you tell me the scene name and the image name? I will look into this issue. Thanks!

Hi, I changed the S3DIS dataset version and that solved the problem. Thanks for your reply!

Areeba-Raza commented 1 year ago

Hello! Thank you for your work. Could you please share the scripts to generate the captions for the S3DIS dataset? I believe that part has been left unimplemented in the scripts you mentioned in the Dataset.md file. Many thanks for your time.