nv-tlabs / NKSR

[CVPR 2023 Highlight] Neural Kernel Surface Reconstruction
https://research.nvidia.com/labs/toronto-ai/NKSR

Reproducing benchmark for CARLA #66

Open lippoldt opened 7 months ago

lippoldt commented 7 months ago

I am currently testing the capabilities of NKSR and, for that purpose, have been running the evaluation metrics. I downloaded the CARLA data you provided and ran the test script together with the provided metrics. With the pretrained CARLA backbone, the metrics are as follows:

completeness (68) 0.027692637713934216
accuracy (68) 0.03867699400040561
normals completeness (68) 0.965071364518667
normals accuracy (68) 0.9516956805795869
normals (68) 0.9583835225491268
completeness2 (68) 0.025281473814217505
accuracy2 (68) 0.007717492560607829
chamfer-L2 (68) 0.016499483187412674
chamfer-L1 (68) 0.03318481585716991
f-precision (68) 0.2145197705882353
f-recall (68) 0.28277397766826823
f-score (68) 0.24255425021785418
f-score-15 (68) 0.4549117034594062
f-score-20 (68) 0.6125048929714518
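As a quick sanity check on the listing above, the aggregate rows look consistent with simple averages of the rows they summarize. The sketch below only re-does that arithmetic on the printed values; the pairing of rows (for example, chamfer-L1 as the mean of completeness and accuracy) is an assumption inferred from the numbers, not a statement about how the evaluator is implemented.

```python
# Values copied verbatim from the evaluation output above (68 test samples).
metrics = {
    "completeness": 0.027692637713934216,
    "accuracy": 0.03867699400040561,
    "normals completeness": 0.965071364518667,
    "normals accuracy": 0.9516956805795869,
    "normals": 0.9583835225491268,
    "completeness2": 0.025281473814217505,
    "accuracy2": 0.007717492560607829,
    "chamfer-L2": 0.016499483187412674,
    "chamfer-L1": 0.03318481585716991,
    "f-precision": 0.2145197705882353,
    "f-recall": 0.28277397766826823,
    "f-score": 0.24255425021785418,
}


def mean_of(a, b):
    """Arithmetic mean of two reported metrics."""
    return 0.5 * (metrics[a] + metrics[b])


# Assumed pairings (inferred from the numbers, not from the source code).
derived = {
    "chamfer-L1": mean_of("completeness", "accuracy"),
    "chamfer-L2": mean_of("completeness2", "accuracy2"),
    "normals": mean_of("normals completeness", "normals accuracy"),
}

for name, value in derived.items():
    print(f"{name}: derived={value:.15f} reported={metrics[name]:.15f}")

# Harmonic mean of the *averaged* precision and recall, for comparison
# with the reported (possibly per-sample averaged) f-score.
p, r = metrics["f-precision"], metrics["f-recall"]
f_of_means = 2 * p * r / (p + r)
print(f"f-score from mean precision/recall: {f_of_means:.4f}")
```

The three derived values match the reported ones to floating-point precision. The F-score computed from the averaged precision and recall (about 0.244) differs slightly from the reported 0.2426, which would be consistent with the F-score being computed per sample and then averaged.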

According to the paper, the F-score should be above 0.9.

I have also tested the training procedure on the CARLA data. While the validation accuracies look very promising (they are also above 90%), the test F-scores are again low.

I have also switched the precision from 32 to 64, but this did not bring any significant improvement.

How can I reproduce the numbers from the paper?

[image: Screenshot (1405)]

heiwang1997 commented 6 months ago

Hi @lippoldt, thank you for reaching out. For Table 3, we use a different threshold for computing the F-score. This is clarified in the appendix of our paper, as shown below:

image

To evaluate with this F-Score, would you please change metric_names=MeshEvaluator.ESSENTIAL_METRICS in the following lines:

https://github.com/nv-tlabs/NKSR/blob/0d4e369b1ee641204d6e6d2b53c692fed6273ca5/models/nksr_net.py#L301-L303

to metric_names=['f-score-outdoor'] and try again? We use a different score because the scales of the datasets are substantially different.
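To illustrate why the threshold matters so much, here is a minimal sketch of a threshold-based F-score between two point sets. It is not the NKSR MeshEvaluator implementation: the function names, the brute-force nearest-neighbour search, the synthetic data, and the threshold values are all assumptions chosen for illustration, and the actual threshold used for 'f-score-outdoor' is the one given in the paper's appendix.

```python
import numpy as np


def nearest_distances(src, dst):
    """Distance from each point in src to its nearest neighbour in dst (brute force)."""
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def f_score(pred, gt, threshold):
    """Harmonic mean of precision and recall at a given distance threshold."""
    # Precision: fraction of predicted points lying close to the ground truth.
    precision = float((nearest_distances(pred, gt) < threshold).mean())
    # Recall: fraction of ground-truth points covered by the prediction.
    recall = float((nearest_distances(gt, pred) < threshold).mean())
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Synthetic example (hypothetical data): a scene spanning 10 units with
# a reconstruction error of roughly 0.05 units per axis.
rng = np.random.default_rng(0)
gt = rng.uniform(0.0, 10.0, size=(500, 3))
pred = gt + rng.normal(scale=0.05, size=gt.shape)

# Illustrative thresholds only; not the values used by NKSR.
for threshold in (0.01, 0.05, 0.1, 0.5):
    print(f"threshold={threshold}: f-score={f_score(pred, gt, threshold):.3f}")
```

The same reconstruction scores near zero at a tight threshold and near one at a loose threshold, so a threshold tuned for small, object-scale data can report a low F-score on a large outdoor scene even when the reconstruction is good relative to the scene size.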

Sorry for the delayed response; I am happy to assist with any further questions.