Reproducing benchmark for CARLA

I am currently evaluating NKSR and tried to reproduce its benchmark scores. I downloaded the CARLA data you provided and ran the test script with the provided metrics. With the pretrained CARLA backbone, I get the following metrics:

completeness (68) 0.027692637713934216
accuracy (68) 0.03867699400040561
normals completeness (68) 0.965071364518667
normals accuracy (68) 0.9516956805795869
normals (68) 0.9583835225491268
completeness2 (68) 0.025281473814217505
accuracy2 (68) 0.007717492560607829
chamfer-L2 (68) 0.016499483187412674
chamfer-L1 (68) 0.03318481585716991
f-precision (68) 0.2145197705882353
f-recall (68) 0.28277397766826823
f-score (68) 0.24255425021785418
f-score-15 (68) 0.4549117034594062
f-score-20 (68) 0.6125048929714518
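
As a sanity check on how these rows relate to each other, the sketch below recomputes the aggregate rows from the individual ones. It only uses the numbers printed above; the interpretation (chamfer and normals rows as simple averages, and "(68)" as the number of evaluated samples) is inferred from the values and not confirmed against the NKSR evaluation code.

```python
import math

# Values copied from the output above; "(68)" is presumably the sample count.
m = {
    "completeness": 0.027692637713934216,
    "accuracy": 0.03867699400040561,
    "normals completeness": 0.965071364518667,
    "normals accuracy": 0.9516956805795869,
    "normals": 0.9583835225491268,
    "completeness2": 0.025281473814217505,
    "accuracy2": 0.007717492560607829,
    "chamfer-L2": 0.016499483187412674,
    "chamfer-L1": 0.03318481585716991,
    "f-precision": 0.2145197705882353,
    "f-recall": 0.28277397766826823,
    "f-score": 0.24255425021785418,
}


def average(a, b):
    """Arithmetic mean of two values."""
    return 0.5 * (a + b)


# The aggregate rows match the average of their two component rows.
chamfer_l1_ok = math.isclose(
    average(m["completeness"], m["accuracy"]), m["chamfer-L1"], rel_tol=1e-9
)
chamfer_l2_ok = math.isclose(
    average(m["completeness2"], m["accuracy2"]), m["chamfer-L2"], rel_tol=1e-9
)
normals_ok = math.isclose(
    average(m["normals completeness"], m["normals accuracy"]),
    m["normals"],
    rel_tol=1e-9,
)

# Harmonic mean of the *averaged* precision and recall.
p, r = m["f-precision"], m["f-recall"]
f_from_means = 2.0 * p * r / (p + r)
f_gap = abs(f_from_means - m["f-score"])

print(chamfer_l1_ok, chamfer_l2_ok, normals_ok)
print(f_from_means, m["f-score"], f_gap)
```

The three averaged rows are internally consistent. The reported f-score differs slightly from the harmonic mean of the averaged precision and recall, which would be expected if the F-score is computed per sample and then averaged; that reading is an assumption.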

According to the paper, the F-score should be above 0.9.
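
For reference, a threshold-based F-score between two point sets is commonly computed as in the minimal sketch below. This is not the NKSR evaluation code: the function names, the threshold `tau`, and the toy points are all illustrative assumptions, and the nearest-neighbour search is brute force for clarity.

```python
import math


def nearest_distances(src, dst):
    """Distance from each point in src to its nearest point in dst (brute force)."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def f_score(pred, gt, tau):
    """Return (precision, recall, F-score) at distance threshold tau.

    precision: fraction of predicted points within tau of the ground truth
    recall:    fraction of ground-truth points within tau of the prediction
    """
    pred_to_gt = nearest_distances(pred, gt)
    gt_to_pred = nearest_distances(gt, pred)
    precision = sum(d < tau for d in pred_to_gt) / len(pred_to_gt)
    recall = sum(d < tau for d in gt_to_pred) / len(gt_to_pred)
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2.0 * precision * recall / (precision + recall)


# Toy data (hypothetical): one predicted point is 0.3 away from the surface.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pred = [(0.0, 0.0, 0.05), (1.0, 0.0, 0.3), (2.0, 0.0, 0.05)]

tight = f_score(pred, gt, tau=0.1)
loose = f_score(pred, gt, tau=0.5)
print(tight, loose)
```

With this definition the score depends strongly on the threshold and on the units of the point coordinates, so a mismatch in either between my run and the paper's setup would lower the F-score even if the reconstruction itself is fine.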

I have also tested the training procedure on the CARLA data. While the validation accuracies look very promising (they are also above 90%), the test F-scores are again low.

I have also switched the precision from 32 to 64, but this did not bring any significant improvement.

How can I reproduce the numbers from the paper?
