aoyanl closed this issue 3 years ago
@aoyanl Thanks for your question. In a typical dataset, only a few categories are labeled, and these labeled categories cannot cover every category that actually appears in the images. For example, PASCAL VOC 2012 labels only 20 foreground classes; objects of other, unlabeled categories may appear in the background or may simply be left unannotated. In zero-shot segmentation, unseen categories can be treated as a special case of this phenomenon. Therefore, it is fair to include unseen classes among the 'ignored' pixels. In addition, we do not remove images that contain unseen objects, which preserves the integrity of the original segmentation dataset as far as possible and avoids wasting training data.
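To make the 'ignored' treatment concrete, here is a minimal sketch in plain Python, not the repository's actual code. It assumes an ignore index of 255 (the value PASCAL VOC uses for its void label) and uses hypothetical helper names `mask_unseen` and `masked_cross_entropy`. Pixels of unseen classes are relabeled as ignored and then skipped when the loss is averaged, so they contribute no supervision even though the image stays in the training set.

```python
import math

IGNORE_INDEX = 255  # assumption: same value as the PASCAL VOC void label


def mask_unseen(label_map, unseen_ids, ignore_index=IGNORE_INDEX):
    """Return a copy of label_map with unseen-class pixels set to ignore_index."""
    unseen = set(unseen_ids)
    return [[ignore_index if c in unseen else c for c in row] for row in label_map]


def masked_cross_entropy(probs, label_map, ignore_index=IGNORE_INDEX):
    """Mean negative log-likelihood over the pixels that are not ignored.

    probs[i][j] maps a class id to its predicted probability at pixel (i, j).
    """
    total, count = 0.0, 0
    for prob_row, label_row in zip(probs, label_map):
        for pixel_probs, label in zip(prob_row, label_row):
            if label == ignore_index:
                continue  # ignored pixels add nothing to the loss
            total -= math.log(pixel_probs[label])
            count += 1
    return total / count if count else 0.0
```

With a framework such as PyTorch, the same effect is usually obtained by passing the ignore index to the loss function rather than looping over pixels.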
Why can images containing seen classes be used as training samples? Although the pixels that belong to unseen classes are marked as 'ignored', the network can still utilize the information in these pixels as context. So why not follow zero-shot detection, where images that contain any unseen objects are removed from the training set? Thanks very much! I look forward to your reply.
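For comparison, the stricter alternative raised in this question (dropping every image that contains an unseen class) could be sketched as follows. This is an illustration only: `split_by_unseen` is a hypothetical helper, and the dataset is assumed to be a list of `(name, label_map)` pairs.

```python
def split_by_unseen(dataset, unseen_ids):
    """Split image names into those kept and those dropped for containing unseen classes.

    dataset is a list of (name, label_map) pairs; label_map is a 2D list of class ids.
    """
    unseen = set(unseen_ids)
    kept, dropped = [], []
    for name, label_map in dataset:
        classes_present = {c for row in label_map for c in row}
        if classes_present & unseen:
            dropped.append(name)  # at least one unseen-class pixel in this image
        else:
            kept.append(name)
    return kept, dropped
```

Every image in `dropped` is lost to training under this scheme, including its seen-class pixels.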