yangyanli / PointCNN

PointCNN: Convolution On X-Transformed Points (NeurIPS 2018)
https://arxiv.org/abs/1801.07791

About the distinction between overall accuracy and micro-averaged accuracy #140

Open TSchattschneider opened 5 years ago

TSchattschneider commented 5 years ago

Hello,

in your paper you describe the use of several metrics for the segmentation task, two of them being overall accuracy (OA) and micro-averaged accuracy (mAcc). My question is how you define these metrics and how you distinguish between them, because as far as I can tell they should essentially be the same: overall accuracy is a case of micro-averaged accuracy.

Generally speaking, the overall accuracy is simply the fraction of all points that have been classified correctly. The micro-averaged accuracy, on the other hand, is the accuracy you get by weighting every point in your evaluation set equally: you aggregate the prediction outcomes across all label classes and then compute the accuracy from those aggregated outcomes, which is exactly the overall accuracy described above. That is how I understand these metrics in the given context of point clouds.
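To illustrate in code why I see the two as identical, here is a minimal sketch (the labels and predictions are made up for the example and are not taken from the PointCNN code or data):

```python
import numpy as np

# Made-up per-point ground-truth labels and predictions for a tiny example
# scene (hypothetical data, not from PointCNN).
labels = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 2])
preds  = np.array([0, 0, 1, 1, 1, 2, 2, 0, 2, 2])

# Overall accuracy: fraction of all points classified correctly.
overall_acc = np.mean(preds == labels)

# Micro-averaged accuracy: aggregate the correct/total counts over all
# classes first, then take a single ratio -- every point is weighted equally.
classes = np.unique(labels)
correct = sum(np.sum((labels == c) & (preds == c)) for c in classes)
total   = sum(np.sum(labels == c) for c in classes)
micro_acc = correct / total

print(overall_acc, micro_acc)  # 0.8 and 0.8 -- identical by construction
```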

So could you tell me your understanding of these metrics? Also, how did you calculate them for Table 2 and Table 3 of your appendix? Did you perhaps mix up the term micro-averaged accuracy with macro-averaged accuracy (which would be the mean per-class accuracy in this case)? Or is my understanding just wrong?

Best regards, Thomas

TSchattschneider commented 5 years ago

Okay, I may have found the source of the problem. For Tables 2 and 3 of your supplementary material, you probably took the data from "Large-scale Point Cloud Semantic Segmentation with Superpoint Graphs", in particular their Table 3, which reports exactly the metric scores that you show in your paper. They use the "mAcc" metric there and explain what it means in Section "4. Experiment":

Performance is evaluated using three metrics: per-class intersection over union (IoU), per-class accuracy (Acc), and overall accuracy (OA)...

Prepending a small m to a metric makes it the mean of that metric, as they explicitly define with regard to IoU and mIoU. This means that mAcc turns out to be the mean per-class accuracy (which is a macro-average, not a micro-average). If this is really the case, then your table captions regarding mAcc and mIoU would need to be updated.
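As a quick sketch of the difference (again with made-up labels and predictions, not taken from either paper), the mean per-class accuracy is generally not the same as the overall/micro-averaged accuracy:

```python
import numpy as np

# Same made-up labels and predictions as in my sketch above.
labels = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 2])
preds  = np.array([0, 0, 1, 1, 1, 2, 2, 0, 2, 2])

# Per-class accuracy: for each class, the fraction of its points that were
# predicted correctly.
classes = np.unique(labels)
per_class_acc = np.array([np.mean(preds[labels == c] == c) for c in classes])

# Macro-averaged ("mean per-class") accuracy: unweighted mean over classes,
# so rare classes count as much as frequent ones.
macro_acc = per_class_acc.mean()

print(per_class_acc)  # approx. [0.667, 1.0, 0.8] per class
print(macro_acc)      # approx. 0.822, which differs from the 0.8 overall accuracy
```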

I don't want to sound nitpicky; this was just important for my evaluations, and I stumbled upon it.