3d-vista / 3D-VisTA

Official implementation of ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment"
https://3d-vista.github.io
MIT License

Questions about the pc_type and scanrefer_metrics #12

Closed. Xiaolong-RRL closed this issue 1 year ago

Xiaolong-RRL commented 1 year ago

Dear author:

When I run 3D-VisTA with the command python run.py --config project/vista/scanrefer_config.yml, I run into the following two issues:

  1. After setting pc_type to pred, I cannot find the save_mask folder under data/scanfamily.
  2. In the ScanRefer eval pipeline, the values of og_Acc_Iou25 and og_Acc_Iou50 are exactly the same, but theoretically og_Acc_Iou25 should be higher.

How can I solve these two problems?

Best, Xiaolong

shengjie-lin commented 1 year ago

I second this. I would really appreciate it if you could provide us with your object detection results from Mask3D. This is crucial for us to reproduce the performance reported in your paper, and it would also enable a fair comparison with other methods by using exactly the same object proposals. Thank you so much in advance!

zhuziyu-edward commented 1 year ago

Here is the predicted mask; you can use it for evaluation: https://drive.google.com/file/d/1w9m3lCW67Tul8qbkES7paX0cFXjblQ-C/view?usp=share_link @shengjie-lin @Xiaolong-RRL

zhuziyu-edward commented 1 year ago

You should set pc_type to 'pred' to get results of iou25 and iou50. @Xiaolong-RRL
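For context on the second question, here is a minimal sketch of how accuracy at IoU 0.25 and 0.5 is commonly computed for axis-aligned 3D boxes. The function names and the (xmin, ymin, zmin, xmax, ymax, zmax) box layout are assumptions for illustration, not the repository's code. One plausible reading of the reply above: with ground-truth proposals the selected box is either the target itself (IoU = 1) or a different object (IoU near 0), so both thresholds give the same accuracy. With predicted proposals, intermediate IoUs appear and the two numbers can diverge.

```python
def box_iou_3d(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax)."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:  # no overlap along this axis
            return 0.0
        inter *= hi - lo
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


def acc_at_iou(selected, targets, thresh):
    """Fraction of queries whose selected box overlaps the target with IoU >= thresh."""
    hits = sum(box_iou_3d(s, t) >= thresh for s, t in zip(selected, targets))
    return hits / len(targets)


# Two toy queries with their target boxes (made-up numbers).
targets = [(0, 0, 0, 1, 1, 1), (2, 2, 2, 3, 3, 3)]

# Ground-truth proposals: the model picks the exact target or a different object.
gt_selected = [(0, 0, 0, 1, 1, 1), (5, 5, 5, 6, 6, 6)]

# Predicted proposals: the boxes are imperfect, so IoUs fall between 0 and 1.
pred_selected = [(0, 0, 0, 1, 1, 1.5), (2, 2, 2, 3, 3, 5)]

for name, sel in [("gt", gt_selected), ("pred", pred_selected)]:
    print(name, acc_at_iou(sel, targets, 0.25), acc_at_iou(sel, targets, 0.5))
```

In this toy data the ground-truth proposals give identical accuracy at both thresholds, while the predicted proposals give a higher value at 0.25 than at 0.5.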

shengjie-lin commented 1 year ago

Thank you so much for providing the predictions from Mask3D! I really appreciate your responsiveness! I have a follow-up question. I also tried to use Mask3D to generate the predictions, but our results seem to be worse than yours. Do you apply any post-processing to the Mask3D predictions? From what I see in your code, it seems that you

  1. store the top 100 scored object predictions per ScanNet scene (as in your provided files)
  2. only use the top 50 scored object predictions

So my question is, do you also perform any sort of point-cloud processing, or do you use the Mask3D predictions directly? Thank you!
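The two steps listed above amount to a top-k selection by confidence score. A minimal sketch, assuming each proposal is a dict with hypothetical `mask_id` and `score` keys (the actual layout of the provided prediction files is not described in this thread):

```python
def top_k_proposals(proposals, k):
    """Return the k highest-scoring proposals, best first."""
    return sorted(proposals, key=lambda p: p["score"], reverse=True)[:k]


# Toy scene with 120 proposals and made-up, distinct scores.
scene = [{"mask_id": i, "score": (i * 37 % 120) / 120} for i in range(120)]

stored = top_k_proposals(scene, 100)  # step 1: keep the top 100 per scene
used = top_k_proposals(stored, 50)    # step 2: use only the top 50 of those

print(len(stored), len(used))
```

Because the stored list is already sorted, the second step is equivalent to taking its first 50 entries.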

zhuziyu-edward commented 1 year ago

Yes, you should use DBSCAN to process the output masks. This is already implemented in Mask3D, so you can use it from their repo.
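To illustrate the idea only, here is a small pure-Python DBSCAN sketch that splits the points of one predicted mask into spatially connected clusters. It is not Mask3D's implementation, which the reply above says already exists in their repository. The parameter names `eps` and `min_points`, the values used, and the choice to split a mask into one group per cluster are all assumptions for illustration.

```python
from math import dist


def dbscan(points, eps, min_points):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = -2, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force radius query; the point itself counts as a neighbor.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] != UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_points:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_points:  # core point: keep growing the cluster
                queue.extend(nbrs)
        cluster += 1
    return labels


# Toy mask whose points fall into two well-separated groups (made-up coordinates).
mask_points = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
    (5.0, 5.0, 5.0), (5.1, 5.0, 5.0),
]
labels = dbscan(mask_points, eps=0.5, min_points=1)

# Group point indices by cluster, giving one candidate mask per cluster.
clusters = {}
for idx, lab in enumerate(labels):
    if lab >= 0:
        clusters.setdefault(lab, []).append(idx)
print(clusters)
```

The brute-force neighbor search is quadratic in the number of points, so a real pipeline would rely on an indexed implementation such as the one in the Mask3D repository.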

WeitaiKang commented 9 months ago

@shengjie-lin

Hi, did you finally generate these predictions with Mask3D? As in #24, I am now fine-tuning this project on my private dataset, and I want to get the segmentation results for all of ScanNet. Would you be kind enough to share them with me?

shengjie-lin commented 9 months ago

Hi @WeitaiKang, I used the predictions provided by the author in the Google Drive link above. I also generated the predictions myself with Mask3D, following Mask3D's instructions. I did not implement anything for this purpose.

WeitaiKang commented 9 months ago

@shengjie-lin Hi, would you mind sharing your Mask3D prediction results with me?

The file on Google Drive only includes the predictions for the scenes in ScanRefer and is missing the other ScanNet scenes, which I need for my private dataset.

shengjie-lin commented 9 months ago

@WeitaiKang, I am sorry, but (1) I did not generate the predictions for the other ScanNet scenes either; and (2) I have already deleted my predictions, as I no longer need them now that I have the predictions provided by the authors.

WeitaiKang commented 9 months ago

@shengjie-lin All right. Thank you anyway.