xieenze closed this issue 4 years ago
Hi,
This is a great question.
Indeed, COCO only has 80 classes, but their class indices are not contiguous. As you can see from the colab notebook (which contains the class names), there are many N/A classes:
CLASSES = [
'N/A', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'N/A',
'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse',
'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'N/A', 'backpack',
'umbrella', 'N/A', 'N/A', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis',
'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove',
'skateboard', 'surfboard', 'tennis racket', 'bottle', 'N/A', 'wine glass',
'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich',
'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake',
'chair', 'couch', 'potted plant', 'bed', 'N/A', 'dining table', 'N/A',
'N/A', 'toilet', 'N/A', 'tv', 'laptop', 'mouse', 'remote', 'keyboard',
'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'N/A',
'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier',
'toothbrush'
]
The simplest way to handle this is just to consider the max class id (in this case it's 90), which would make for 91 classes.
This is the approach torchvision takes as well, and it simplifies the rest of the code, as we don't need to carry a class mapping like we did for maskrcnn-benchmark or detectron2.
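To make the two options concrete, here is a minimal sketch, not the code from the repository. `NA_IDS` is simply the set of positions holding 'N/A' in the list above, and the names `id_to_contiguous` and `contiguous_to_id` are hypothetical, chosen only to illustrate what carrying a class mapping would involve.

```python
# Positions of 'N/A' in the CLASSES list above: index 0 plus ten gaps.
NA_IDS = {0, 12, 26, 29, 30, 45, 66, 68, 69, 71, 83}
MAX_CLASS_ID = 90  # 'toothbrush'

# Option A (described above): size the classifier by the max class id,
# so dataset labels can be used unchanged.
num_classes = MAX_CLASS_ID + 1

# Option B (illustration only): remap the sparse ids to contiguous labels,
# which requires keeping a mapping in both directions.
valid_ids = [i for i in range(MAX_CLASS_ID + 1) if i not in NA_IDS]
id_to_contiguous = {coco_id: k for k, coco_id in enumerate(valid_ids)}
contiguous_to_id = {k: coco_id for coco_id, k in id_to_contiguous.items()}

print(num_classes)           # 91
print(len(valid_ids))        # 80
print(id_to_contiguous[13])  # 11 ('stop sign' moves from id 13 to label 11)
print(contiguous_to_id[79])  # 90 ('toothbrush')
```

With option A, neither dictionary is needed: a predicted index can be looked up in `CLASSES` directly.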
I hope this clarifies the point, and as such I'm closing the issue, but let me know if you have further questions.
Hi fmassa, thank you for your quick reply. (1) I want to confirm that the number of valid categories is 80, but you set it to 90 + 1 (bg). Does this indicate that there are 10 categories that are invalid (e.g. "N/A", with no data belonging to these categories)?
(2) If one does class mapping (e.g. mmDetection, where num_class=80 and there are no invalid categories), the network output is also 80 (or 81). Will your approach (having 10 invalid categories) perform differently from mmDetection (invalid categories removed, only 80 valid categories)?
@xieenze
Does this indicate that there are 10 categories that are invalid (e.g. "N/A", with no data belonging to these categories)?
Yes, this means that there are 10 categories that do not have any corresponding training examples, and thus are never predicted by the model.
Will your approach (having 10 invalid categories) perform differently from mmDetection (invalid categories removed, only 80 valid categories)?
From experiments with torchvision (which also uses 91 classes), the results match Faster R-CNN, so there is no difference in performance due to that. We just have a few more parameters in the model's last classifier, which are unused.
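For a rough sense of how small "a few more parameters" is, here is a back-of-the-envelope sketch. It assumes the last classifier is a single fully connected layer with an input width of 256 and one extra output slot for background / "no object". Both numbers are assumptions for illustration, not values stated in this thread.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features

HIDDEN_DIM = 256  # assumption: feature width feeding the classifier

# Assumption: one additional output for background / "no object".
with_gaps = linear_params(HIDDEN_DIM, 91 + 1)   # ids 0..90 kept as-is
contiguous = linear_params(HIDDEN_DIM, 80 + 1)  # after a class mapping
extra = with_gaps - contiguous

print(with_gaps, contiguous, extra)  # 23644 20817 2827
```

Under these assumptions the unused outputs cost about 2.8k parameters: 11 extra rows of 256 weights plus a bias each.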
@fmassa Great, thank you. It really solved my question.
Hi, great work. I read your code and found that you set 'num_classes=91' for COCO detection, but COCO detection has 80 categories. Could you explain why you set it to 91? Thanks very much~