hustvl / Symphonies

[CVPR 2024] Symphonies (Scene-from-Insts): Symphonize 3D Semantic Scene Completion with Contextual Instance Queries
https://arxiv.org/abs/2306.15670
MIT License

SSCBench-KITTI360 test results #24

Open joonsu0109gh opened 4 months ago

joonsu0109gh commented 4 months ago

Hello, thank you for the great work.

I have a question regarding the performance on the KITTI-360 dataset. Are the numbers you reported actually from the test set?

Based on the logs you shared, it seems that the results are from the validation set. In my experiments using your model weights, I obtained 'val IoU: 0.1835' and 'test IoU: 0.1782.' Could you please clarify this?

npurson commented 4 months ago

Thank you for bringing this to our attention!

To clarify, I have modified my local code to evaluate the test set as the validation set for convenience. As a result, the validation metrics reported in the log are actually the test metrics. However, I noticed that the dataset has been updated since we conducted our experiments [1], so I'm unsure if this is causing the discrepancy.

We will investigate this issue further, but unfortunately it may take some time, since we need to re-prepare the entire dataset and have a heavy workload at the moment. If you urgently need to report results for comparison, you can report your reproduced result of our method for now.

npurson commented 4 months ago

Hi, one more thing to confirm: have you adapted the disparity value for KITTI-360 as discussed here?

joonsu0109gh commented 4 months ago

Hi, thank you for your detailed answer.

I'll check that and report back to you.

g-ch commented 4 months ago

Hi all,

@npurson @joonsu0109gh Could you share the KITTI-360 depth images generated with the pre-trained MobileStereoNet described in the README? I tried to run the script, but my hardware is not very good and it ran out of memory.

Thank you in advance!

npurson commented 4 months ago

I'm afraid the depth images are too large (~10 GB) to upload. Also, it's worth noting that MobileStereoNet inference shouldn't consume a lot of memory; it needs much less than our network does. Perhaps there is a problem with your run scripts?

g-ch commented 3 months ago

Hi @npurson,

Thank you so much for the reply! You are right. The problem was that my GPU is relatively new (3080), while the environment YAML file provided by MobileStereoNet uses old versions of CUDA and PyTorch. I rebuilt the environment with newer versions, and it works now.
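For anyone who hits the same problem, here is a minimal sketch of the kind of check that would have caught it. It is not part of Symphonies or MobileStereoNet: the table, `parse_version`, and `cuda_supports_gpu` are hypothetical helpers written for illustration, and the table only lists a few architectures. The "10.2" used in the usage note is an example value, not the version from the MobileStereoNet environment file.

```python
# Hypothetical helper, not part of Symphonies or MobileStereoNet.
# Minimum CUDA toolkit version able to target a GPU compute capability.
# Only a few architectures are listed; extend as needed.
MIN_CUDA_FOR_ARCH = {
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere (A100)
    (8, 6): (11, 1),  # Ampere (RTX 30 series, e.g. the 3080)
}


def parse_version(text):
    """Turn a version string such as '11.1' or '10.2.89' into (major, minor)."""
    parts = text.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def cuda_supports_gpu(cuda_version, capability):
    """Return True if the CUDA toolkit version can target the given capability."""
    required = MIN_CUDA_FOR_ARCH[capability]
    return parse_version(cuda_version) >= required


if __name__ == "__main__":
    # Optional runtime check; it only runs when PyTorch and a GPU are present.
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available() and torch.version.cuda:
        cap = torch.cuda.get_device_capability(0)
        if cap in MIN_CUDA_FOR_ARCH:
            ok = cuda_supports_gpu(torch.version.cuda, cap)
            print(f"CUDA {torch.version.cuda} on sm_{cap[0]}{cap[1]}: supported={ok}")
        else:
            print(f"Compute capability {cap} is not in the table")
```

With this sketch, `cuda_supports_gpu("10.2", (8, 6))` is false while `cuda_supports_gpu("11.1", (8, 6))` is true, which matches what I saw: the old environment could not drive the 3080, and the rebuilt one can.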

Samsara011 commented 1 month ago

@npurson Hello author, thank you for your work. May I ask how much video memory is needed to train the entire model? Thanks!

npurson commented 1 month ago

If you're referring to GPU memory, it's ~18GB.

Samsara011 commented 1 month ago

@npurson I see, thanks!