lippoldt opened this issue 7 months ago
Hi @lippoldt, thank you for reaching out. For Table 3 we use a different threshold when computing the F-Score; this is clarified in the appendix of our paper.
To evaluate with this F-Score, could you please change
`metric_names=MeshEvaluator.ESSENTIAL_METRICS`
in the following lines to
`metric_names=['f-score-outdoor']`
and try again? We use a different threshold because the scales of the datasets are fundamentally different.
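Concretely, it is a one-argument change. The snippet below is a hypothetical reconstruction of the evaluation call (`evaluator`, `pred_mesh`, and `gt_mesh` are placeholder names standing in for whatever the linked lines actually use; only the `metric_names` argument is the point):

```python
# Hypothetical sketch -- placeholder names, not the actual NKSR source.

# Before: the default metric set, using the object-scale F-Score threshold.
#   metric_names=MeshEvaluator.ESSENTIAL_METRICS

# After: only the F-Score computed with the outdoor-scale threshold.
metrics = evaluator.evaluate(pred_mesh, gt_mesh,
                             metric_names=['f-score-outdoor'])
```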
Sorry for the delay in responding, and I am happy to assist with any further questions.
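To see why the threshold matters so much, here is a minimal self-contained sketch (not the NKSR implementation; the threshold values are purely illustrative) of a threshold-dependent F-Score between two point sets:

```python
import numpy as np

def f_score(pred, gt, threshold):
    """F-Score between two point sets at a given distance threshold.

    precision = fraction of predicted points within `threshold` of the GT;
    recall    = fraction of GT points within `threshold` of the prediction.
    """
    # Full pairwise distance matrix (fine for small clouds; use a KD-tree
    # such as scipy.spatial.cKDTree for large ones).
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    precision = (d.min(axis=1) < threshold).mean()
    recall = (d.min(axis=0) < threshold).mean()
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative thresholds only -- an outdoor-scale scene warrants a larger
# tolerance than an object-scale mesh, which is why the metric name matters.
THRESHOLDS = {"f-score": 0.01, "f-score-outdoor": 0.1}  # hypothetical values

pred = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
gt = np.array([[0.05, 0.0, 0.0], [1.0, 0.0, 0.0]])
print(f_score(pred, gt, THRESHOLDS["f-score"]))          # -> 0.5
print(f_score(pred, gt, THRESHOLDS["f-score-outdoor"]))  # -> 1.0
```

The same reconstruction scores 0.5 under the strict threshold and 1.0 under the loose one, so comparing numbers computed at different thresholds is not meaningful.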
I am currently testing the capabilities of NKSR, and for that purpose I ran some tests on the evaluation scores. I downloaded the CARLA data you provided and executed the test script together with the provided metrics code. The metrics for the pretrained CARLA backbone are as follows:
According to the paper, the F-score should be above 0.9.
I have also been testing the training procedure on the CARLA data, and while validation accuracies look very promising (also above 90%), the test F-Scores are again low.
I have also tried switching the precision from 32 to 64 bits, but this did not yield any significant improvement.
How can I reproduce the numbers from the paper?