Question about semantic segmentation.

kaitolucifer commented 1 year ago

I'm using a custom semantic segmentation dataset. The annotation PNG files use 5 categories: 0 is background and 1~4 are stuff categories. I'm using a config file with settings like the ones below.

```yaml
MODEL:
  META_ARCHITECTURE: "MaskFormer"
  SEM_SEG_HEAD:
    NAME: "MaskFormerHead"
    IGNORE_VALUE: 0 # background
    NUM_CLASSES: 5
...
```

Training was fine. But when I use VisualizationDemo in demo/predictor.py to generate predictions and visualizations, the visualizations look weird: category 1 covers both the background and the original category 1 areas.

The model outputs 5xHxW maps, and I use predictions["sem_seg"].argmax(dim=0) to get an HxW map in which every pixel is supposed to be 0~4. But I found that no pixel is ever assigned to the background category, 0. Does anyone know what went wrong?
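One way to confirm the symptom is to count how many pixels the argmax assigns to each class id. The following is only a minimal sketch: it uses a synthetic NumPy array as a stand-in for predictions["sem_seg"], with channel 0 forced low to mimic a background channel that never wins, and `class_histogram` is a hypothetical helper, not part of the repository.

```python
import numpy as np

def class_histogram(sem_seg, num_classes):
    """Count how many pixels the per-pixel argmax assigns to each class id."""
    pred = sem_seg.argmax(axis=0)  # (C, H, W) -> (H, W) map of class ids
    return np.bincount(pred.ravel(), minlength=num_classes)

# Synthetic stand-in for the model output (assumption: 5 classes, 4x6 image).
rng = np.random.default_rng(0)
scores = rng.random((5, 4, 6))  # random scores in [0, 1)
scores[0] = -1.0                # channel 0 is always the lowest score

hist = class_histogram(scores, num_classes=5)
print(hist)  # the first entry is the number of pixels predicted as class 0
```

With the real model output, the equivalent check on the torch tensor would be predictions["sem_seg"].argmax(dim=0).unique(); if 0 never appears in the result, the background channel never has the highest score anywhere.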

adrian5m commented 1 year ago

The network is not trained to segment the background, because IGNORE_VALUE is set to the background class. So the prediction scores for class 1 in these areas may simply be higher than those for the background or the other classes, so that they get grouped together in the argmax. You can set IGNORE_VALUE to a value outside the range of your labels, or remove the line from the config (I think 255 is the default), and see whether that solves the problem after retraining.
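To illustrate why the ignored id never gets learned, here is a simplified sketch of how per-class training targets are typically built from a label map when an ignore value is in play. It is an assumption about the general mechanism, not a copy of the repository's dataset mapper, and `build_targets` is a hypothetical name.

```python
import numpy as np

def build_targets(label_map, ignore_value):
    """Return the class ids that get a ground-truth mask, plus the masks."""
    classes = np.unique(label_map)
    classes = classes[classes != ignore_value]  # ignored id gets no target
    masks = [label_map == c for c in classes]   # one binary mask per class
    return classes, masks

# Toy annotation: 0 is background, 1~4 are stuff categories.
label_map = np.array([
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [3, 3, 4, 4],
])

classes_bg_ignored, _ = build_targets(label_map, ignore_value=0)
classes_bg_kept, _ = build_targets(label_map, ignore_value=255)

print(classes_bg_ignored.tolist())  # [1, 2, 3, 4]
print(classes_bg_kept.tolist())     # [0, 1, 2, 3, 4]
```

With the ignore value set to 0, the background never shows up as a target, so nothing pushes the model to predict class 0. With an ignore value outside the label range, such as 255, the background is trained like any other class.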

XihuaQiao commented 10 months ago

Sorry to bother you, but did you solve this problem? I find that if you add background as a class to Mask2Former, as Pascal VOC2012 does (it has 21 classes for semantic segmentation), the visualization results are really weird.

facebookresearch / Mask2Former

Question about semantic segmentation. #179