hellochick / ICNet-tensorflow

TensorFlow-based implementation of "ICNet for Real-Time Semantic Segmentation on High-Resolution Images".

class number issues #43

Closed ifangcheng closed 6 years ago

ifangcheng commented 6 years ago

In train.py, NUM_CLASSES is set to 19 for CITYSCAPES_DATASET (line 29). On line 57 we can see parser.add_argument("--num-classes", type=int, default=NUM_CLASSES, help="Number of classes to predict (including background)."), so it seems that --num-classes should include the background.

However, in tools.py, label_colours (line 6) contains 19 classes, with no background class. That means that if we include the background class, NUM_CLASSES should be 20.

So, is there an explanation for this, or am I missing something?

hellochick commented 6 years ago

Hi @ifangcheng, sorry for misleading you. In the Cityscapes dataset I ignore the background directly (pixel value 255), so num_classes is 19 (labels 0 to 18). The class count therefore shouldn't include the background: we set the ignore label so that every pixel with value 255 is ignored. You can take a look at the get_mask function (lines 96-102 in train.py).
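A minimal sketch of the ignore-label idea described in this reply, in plain Python on a flattened list of pixels. It is not the repository's get_mask; the helper names valid_indices and masked_pairs are hypothetical, and the two conditions (label is a real class id, label is not the ignore value) are an assumption about what such a mask needs to check.

```python
IGNORE_LABEL = 255  # Cityscapes background/void value, per the reply above
NUM_CLASSES = 19    # valid class ids are 0..18


def valid_indices(labels, num_classes=NUM_CLASSES, ignore_label=IGNORE_LABEL):
    """Return positions whose label is a real class id (hypothetical helper)."""
    return [
        i for i, lab in enumerate(labels)
        if lab <= num_classes - 1 and lab != ignore_label
    ]


def masked_pairs(preds, labels):
    """Keep only the (prediction, label) pairs at valid positions."""
    keep = valid_indices(labels)
    return [preds[i] for i in keep], [labels[i] for i in keep]


# Toy flattened label map: two pixels carry the ignore value 255.
labels = [0, 18, 255, 7, 255]
preds = [0, 18, 3, 7, 12]

kept_preds, kept_labels = masked_pairs(preds, labels)
print(kept_preds, kept_labels)
```

Because the ignored pixels never reach the loss or the metric, the network only needs 19 output channels, which is why NUM_CLASSES stays at 19.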

ifangcheng commented 6 years ago

@hellochick thanks! got it!

Tamme commented 6 years ago

Is it really the same for ade20k and similar datasets where 0 is the void label? It seems to me that if the labels are e.g. [0: void, 1: road, 2: grass, 3: forest, 4: sky], I need to set NUM_CLASSES=5 and just ignore it when the model predicts 0. At least, this is the only way I get the same evaluation mIoU when switching void from 255 to 0.

hellochick commented 6 years ago

@Tamme, when I evaluate on ade20k, I subtract 1 from the labels and ignore label -1. You can try this, since the classes we predict range from 0 to 149 while the labels range from 0 to 150, where 0 means background.
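The label shift suggested here can be sketched as follows, again in plain Python on flattened pixels. The function names shift_labels and drop_ignored are hypothetical and the toy values are made up for illustration; this is not code from the repository.

```python
def shift_labels(labels):
    """Map raw ADE20K-style labels 0..150 to -1..149 (0, background, becomes -1)."""
    return [lab - 1 for lab in labels]


def drop_ignored(preds, labels, ignore_label=-1):
    """Remove every pixel whose (shifted) label equals the ignore value."""
    kept = [(p, l) for p, l in zip(preds, labels) if l != ignore_label]
    return [p for p, _ in kept], [l for _, l in kept]


raw_labels = [0, 1, 150, 0, 42]  # 0 = background, 1..150 = real classes
preds = [5, 0, 149, 7, 41]       # network outputs, always in 0..149

shifted = shift_labels(raw_labels)
kept_preds, kept_labels = drop_ignored(preds, shifted)
print(shifted)
print(kept_preds, kept_labels)
```

After the shift, predictions and labels share the same 0..149 range, so the metric can be computed with num_classes=150.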

Tamme commented 6 years ago

@hellochick that's what I thought was sensible, but this code around line 140 in evaluate.py seems to contradict it:

    if args.dataset == 'ade20k':
        pred = tf.add(pred, tf.constant(1, dtype=tf.int64))
        mIoU, update_op = tf.contrib.metrics.streaming_mean_iou(pred, gt, num_classes=param['num_classes']+1)
    elif args.dataset == 'cityscapes':
        mIoU, update_op = tf.contrib.metrics.streaming_mean_iou(pred, gt, num_classes=param['num_classes'])

Or am I looking in the wrong place? :)

hellochick commented 6 years ago

@Tamme Oh, it leads to the same result. In this code, I add 1 to our predictions to shift them from [0-149] to [1-150], and then ignore class 0 in the ground truth, which means background. Does this make sense to you?
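A small check of why the two approaches agree, using a toy 4-class problem in plain Python. Two assumptions are baked in: pixels whose ground truth is 0 are removed before the metric is computed (the quoted snippet does not show that step), and the mean IoU skips classes that appear in neither predictions nor labels, which is how TensorFlow's mean IoU metric is understood to behave. The mean_iou function below is a hypothetical stand-in, not the TensorFlow implementation.

```python
def mean_iou(preds, labels, num_classes):
    """Mean IoU over the classes that occur in preds or labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, l in zip(preds, labels) if p == c and l == c)
        union = sum(1 for p, l in zip(preds, labels) if p == c or l == c)
        if union > 0:  # classes that never occur are left out of the mean
            ious.append(inter / union)
    return sum(ious) / len(ious)


NUM_CLASSES = 4                        # toy stand-in for ADE20K's 150
raw_labels = [0, 1, 2, 3, 4, 1, 0, 2]  # 0 = void/background
preds = [2, 0, 1, 3, 3, 0, 1, 1]       # network outputs in 0..3

# Option A: shift labels down by 1 and drop the resulting -1 pixels.
a = [(p, l - 1) for p, l in zip(preds, raw_labels) if l - 1 != -1]
miou_a = mean_iou([p for p, _ in a], [l for _, l in a], NUM_CLASSES)

# Option B: shift predictions up by 1 and drop ground-truth 0 pixels.
b = [(p + 1, l) for p, l in zip(preds, raw_labels) if l != 0]
miou_b = mean_iou([p for p, _ in b], [l for _, l in b], NUM_CLASSES + 1)

print(miou_a, miou_b)
```

In option B class 0 has an empty union once the void pixels are dropped, so it contributes nothing to the mean even though num_classes is one larger.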