cocodataset / panopticapi

COCO 2018 Panoptic Segmentation Task API (Beta version)

Underestimation of SQ? #14

Open tambetm opened 5 years ago

tambetm commented 5 years ago

When SQ is averaged over classes, classes with TP=0 (i.e. classes for which no segment was matched correctly) are included in the average. For those classes SQ=0, so they pull the average SQ down considerably. At least for me, this breaks the intuition that SQ measures how well the matched segments agree with the ground truth. Consider this example:

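| class | IoU | TP | FP | FN | PQ | SQ | RQ |
|---|---|---|---|---|---|---|---|
| class 1 | 0.88 | 1 | 0 | 1 | 0.59 | 0.88 | 0.67 |
| class 2 | 0.69 | 1 | 0 | 0 | 0.69 | 0.69 | 1 |
| class 3 | 0.63 | 1 | 0 | 0 | 0.63 | 0.63 | 1 |
| class 4 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
| class 5 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| average | | | | | 0.38 | 0.44 | 0.53 |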

Even though the per-class segmentation results were decent for the first three classes, the average SQ over all classes is only 0.44, which seems unfair. Since a segment only counts as a TP when its IoU with the ground truth exceeds 0.5, the SQ of any class with at least one match is always above 0.5, so an average below 0.5 is puzzling at first sight. A simple solution would be to average SQ only over classes with non-zero SQ; in this case the average SQ would be 0.73.
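To make the two averaging rules concrete, here is a minimal sketch that recomputes the table. It is not the panopticapi code; `ClassStats` and `pq_sq_rq` are hypothetical names, and the IoU column is assumed to be the sum of IoUs over matched segments (each class has at most one match here).

```python
from dataclasses import dataclass


@dataclass
class ClassStats:
    iou_sum: float  # sum of IoU over matched (TP) segments; assumed from the IoU column
    tp: int
    fp: int
    fn: int


def pq_sq_rq(s: ClassStats) -> tuple[float, float, float]:
    """Return (PQ, SQ, RQ) for one class, using 0 when nothing was matched."""
    denom = s.tp + 0.5 * s.fp + 0.5 * s.fn
    if s.tp == 0 or denom == 0:
        return 0.0, 0.0, 0.0
    sq = s.iou_sum / s.tp   # mean IoU of matched segments
    rq = s.tp / denom       # F1-style recognition quality
    return sq * rq, sq, rq


# The five classes from the table above.
stats = [
    ClassStats(0.88, 1, 0, 1),
    ClassStats(0.69, 1, 0, 0),
    ClassStats(0.63, 1, 0, 0),
    ClassStats(0.0, 0, 1, 1),
    ClassStats(0.0, 0, 1, 0),
]
per_class = [pq_sq_rq(s) for s in stats]

# Rule 1: average SQ over all classes, including those with TP=0.
sq_all = sum(sq for _, sq, _ in per_class) / len(per_class)

# Rule 2 (the proposal): average SQ only over classes with at least one match.
matched_sq = [sq for s, (_, sq, _) in zip(stats, per_class) if s.tp > 0]
sq_matched = sum(matched_sq) / len(matched_sq)

print(f"SQ over all classes:     {sq_all:.2f}")
print(f"SQ over matched classes: {sq_matched:.2f}")
```

With the numbers above, the first rule gives roughly 0.44 and the second roughly 0.73.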

At first I thought that this would break the nice PQ=SQ*RQ formula. But then I realized that averaging over classes breaks it anyway: in the example above, average SQ times average RQ is 0.44 × 0.53 ≈ 0.23, not the average PQ of 0.38. I understand that this raises the question of how to average PQ as well, and it would be nice if the rules were consistent. That's why I'm posting this as an issue for discussion rather than as a pull request.
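The point about averaging can be checked directly. This is a small sketch using the per-class SQ and RQ values from the table, with the variable names chosen only for illustration:

```python
# (SQ, RQ) per class, taken from the table; RQ of class 1 is 1 / (1 + 0.5 * 1).
per_class = [
    (0.88, 1 / 1.5),
    (0.69, 1.0),
    (0.63, 1.0),
    (0.0, 0.0),
    (0.0, 0.0),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# PQ = SQ * RQ holds exactly for each individual class...
mean_pq = mean(sq * rq for sq, rq in per_class)

# ...but the product of the class averages is a different number.
product_of_means = mean(sq for sq, _ in per_class) * mean(rq for _, rq in per_class)

print(f"mean PQ:           {mean_pq:.2f}")
print(f"mean SQ * mean RQ: {product_of_means:.2f}")
```

The mean of products and the product of means differ here (about 0.38 versus about 0.23), so the identity only holds per class, whichever averaging rule is chosen.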