gujiaqivadin opened this issue 3 months ago
Due to the randomness in data pre-processing (point down-sampling), the performance on your local machine might be slightly different from the metrics we achieved.
We encourage you to train the whole model from the very beginning (i.e., pre-train on detection, then fine-tune on dense captioning) and see whether the results align.
For more details on the randomness analysis, please refer to https://github.com/ch3cook-fdu/Vote2Cap-DETR/issues/12.
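To make the point about pre-processing randomness concrete, here is a minimal sketch of random point down-sampling (the function name and shapes are illustrative, not the repo's actual code): without a fixed seed, each run draws a different subset of points, which is one source of run-to-run metric variance.

```python
import numpy as np

def downsample(points, n, seed=None):
    """Randomly sub-sample n points from an (N, 3) point cloud.

    Without a fixed seed, every run (and every machine) draws a
    different subset, so downstream metrics can drift slightly.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(points.shape[0], n, replace=n > points.shape[0])
    return points[idx]

cloud = np.random.default_rng(0).random((100000, 3))
a = downsample(cloud, 40000, seed=42)
b = downsample(cloud, 40000, seed=42)  # same seed: identical subset
c = downsample(cloud, 40000)           # unseeded: differs run to run
```

Fixing the seed makes the subset reproducible, but the released pipeline may intentionally leave it unseeded, which is consistent with the small gaps reported in this thread.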
@ch3cook-fdu Hello, I'm also wondering why the results shown in the paper differ from the officially reported numbers on the Scan2Cap benchmark, as shown below.
Could you tell me what the difference is between the paper's evaluation and the benchmark evaluation?
The results reported in the paper are m@kIoU evaluated on the ScanRefer validation set, while the official benchmark shows results on the test set. The metrics themselves are also different.
Please refer to https://kaldir.vc.in.tum.de/scanrefer_benchmark/documentation and UniT3D paper equation 1 for more details.
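For readers unfamiliar with m@kIoU: following the Scan2Cap definition, a captioning metric m (e.g. CIDEr) is summed over predictions whose boxes match a ground-truth box with IoU >= k, and the sum is divided by the total number of ground-truth objects, so unmatched objects contribute 0. A toy sketch (the `pairs` input format here is hypothetical, just for illustration):

```python
def m_at_kiou(pairs, num_gt, k=0.5):
    """m@kIoU: average a captioning metric over all ground-truth
    objects, counting a prediction only if its box has IoU >= k
    with the ground-truth box; unmatched objects score 0.

    pairs: list of (iou, metric_score) for the prediction assigned
    to each ground-truth object (hypothetical input format).
    """
    total = sum(score for iou, score in pairs if iou >= k)
    return total / num_gt

# toy example: 3 of 4 ground-truth objects matched at IoU >= 0.5
pairs = [(0.7, 0.8), (0.6, 0.4), (0.3, 0.9), (0.55, 0.6)]
print(m_at_kiou(pairs, num_gt=4))  # (0.8 + 0.4 + 0.6) / 4 = 0.45
```

This is why m@kIoU penalizes both bad captions and missed detections, unlike a caption metric averaged only over matched proposals.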
Hello, do I need to find the best result in the log file myself? Is the last evaluation displayed at the end of the run not necessarily the best result? Thank you!
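If the log does need to be sifted by hand, one way is a small script that scans every evaluation block and keeps the best score. This is a sketch assuming the log line format shown later in this thread (e.g. `[CIDEr] Mean: 0.7539, ...`); adjust the regex to the actual log.

```python
import re

def best_cider(log_text):
    """Return the highest CIDEr mean found across all evaluation
    blocks in the log, or None if no evaluation lines are present."""
    scores = [float(m) for m in re.findall(r"\[CIDEr\] Mean: ([0-9.]+)", log_text)]
    return max(scores) if scores else None

log = """INFO: iou@0.5 matched proposals: [1500 / 2068], [CIDEr] Mean: 0.7211
INFO: iou@0.5 matched proposals: [1543 / 2068], [CIDEr] Mean: 0.7539
INFO: iou@0.5 matched proposals: [1520 / 2068], [CIDEr] Mean: 0.7410"""
print(best_cider(log))  # 0.7539
```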
Hello, @ch3cook-fdu!
Thanks for sharing your work on indoor 3D dense captioning. Recently I have tried to train Vote2Cap-DETR(++) with different configs. I noticed there is a slight performance gap between the metrics of my model / the pretrained model from this repo and the table results in the paper.
Take `scst_Vote2Cap_DETRv2_RGB_NORMAL` with SCST settings as an example:
My Results:

```
----------------------Evaluation-----------------------
INFO: iou@0.5 matched proposals: [1543 / 2068],
[BLEU-1] Mean: 0.6721, Max: 1.0000, Min: 0.0000
[BLEU-2] Mean: 0.5761, Max: 1.0000, Min: 0.0000
[BLEU-3] Mean: 0.4759, Max: 1.0000, Min: 0.0000
[BLEU-4] Mean: 0.3892, Max: 1.0000, Min: 0.0000
[CIDEr] Mean: 0.7539, Max: 6.2306, Min: 0.0000
[ROUGE-L] Mean: 0.5473, Max: 0.9474, Min: 0.1015
[METEOR] Mean: 0.2638, Max: 0.5982, Min: 0.0448
```
Pretrained Model Results:

```
----------------------Evaluation-----------------------
INFO: iou@0.5 matched proposals: [1548 / 2068],
[BLEU-1] Mean: 0.6729, Max: 1.0000, Min: 0.0000
[BLEU-2] Mean: 0.5787, Max: 1.0000, Min: 0.0000
[BLEU-3] Mean: 0.4783, Max: 1.0000, Min: 0.0000
[BLEU-4] Mean: 0.3916, Max: 1.0000, Min: 0.0000
[CIDEr] Mean: 0.7636, Max: 6.3784, Min: 0.0000
[ROUGE-L] Mean: 0.5496, Max: 1.0000, Min: 0.1015
[METEOR] Mean: 0.2641, Max: 1.0000, Min: 0.0448
```
and the Paper Results:

![11AE0D09-CEAA-45AD-BF94-6D4EE0E0FDB8](https://github.com/ch3cook-fdu/Vote2Cap-DETR/assets/37773691/a3804cff-5dc9-4f95-9a8d-430c3209980e)
A performance gap of about 1%–2.5% exists across all the different configs and settings, and I am wondering how to account for it.
Thanks, Jiaqi