mohitzsh / Adversarial-Semisupervised-Semantic-Segmentation

PyTorch implementation of "Adversarial Learning For Semi-Supervised Semantic Segmentation" for the ICLR 2018 Reproducibility Challenge

What does the confidence map mean? #4

Closed John1231983 closed 6 years ago

John1231983 commented 6 years ago

This is not an issue, just a discussion about some key ideas of the paper you are implementing. In a traditional GAN, the discriminator tries to classify its input as real or fake. It works as a classification problem with two labels, fake and real, so the last layer is a fully connected layer. The proposed method instead replaces the last layer with a 1x1 convolution that produces spatial maps (like FCN); they call this a fully convolutional layer.

The key difference, as I see it, is the input of the proposed discriminator. Given the ground truth and the score map (produced by the generator) as inputs, the proposed discriminator outputs a confidence map, which indicates how different the score map is from the ground truth. But I do not understand how they obtain the confidence map. Can I obtain a confidence map by feeding the two inputs, ground truth and score map, to an FCN network (in the normal FCN case, the input is the raw image, the supervision is the ground truth, and the output is a score map)? Thanks so much.

The paper mentions: "After obtaining the initial segmentation prediction of the unlabeled image from the segmentation network, we obtain a confidence map by passing the segmentation prediction through the discriminator network." This suggests the confidence map is only produced when we use unlabeled images. Could we also obtain a confidence map using images that have labels?

mohitzsh commented 6 years ago

I think the confidence map simply means "how likely is it that the discriminator thinks its input is a ground truth segmentation mask". The way I have implemented the discriminator, it produces a two-channel output (of the same spatial dimensions as the input), which you can interpret (after applying a softmax operation) as the probability of the input being a ground truth segmentation mask (channel 1) or a segmentation mask produced by the generator (channel 0).
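To make this concrete, here is a minimal sketch of such a fully-convolutional discriminator head; the layer sizes are hypothetical and much smaller than the repo's actual architecture, but the input/output shapes follow the description above:

```python
import torch
import torch.nn as nn

class FCDiscriminator(nn.Module):
    """Toy fully-convolutional discriminator: every layer is a convolution,
    so the output keeps the input's spatial dimensions (no FC layer)."""

    def __init__(self, num_classes=21):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes, 64, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),
            # 1x1 conv to 2 channels: generated (channel 0) vs ground truth (channel 1)
            nn.Conv2d(64, 2, kernel_size=1),
        )

    def forward(self, x):
        return self.net(x)

disc = FCDiscriminator(num_classes=21)
# A fake class-probability map, as the segmentation network would produce
prob_map = torch.softmax(torch.randn(1, 21, 32, 32), dim=1)
logits = disc(prob_map)                          # shape (1, 2, 32, 32)
confidence = torch.softmax(logits, dim=1)[:, 1]  # per-pixel P(input is ground truth)
print(confidence.shape)                          # torch.Size([1, 32, 32])
```

The point is simply that replacing the final FC layer with convolutions yields one confidence value per pixel instead of one scalar per image.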

The input to the discriminator is a spatial probability map over C classes (C = 21 for PASCAL VOC), i.e. a tensor P of size CxHxW. Defined this way, the discriminator network takes either the probability map from the segmentation network or the ground truth mask converted to a spatial probability map using one-hot encoding.
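The one-hot conversion mentioned above can be sketched as follows (a toy 4x4 mask for illustration):

```python
import torch
import torch.nn.functional as F

# Convert an integer label mask (H, W) into a one-hot "probability" map (C, H, W),
# the same format as the segmentation network's softmax output.
num_classes = 21                                # PASCAL VOC
mask = torch.randint(0, num_classes, (4, 4))    # ground-truth labels, shape (H, W)
one_hot = F.one_hot(mask, num_classes)          # shape (H, W, C)
prob_map = one_hot.permute(2, 0, 1).float()     # shape (C, H, W)
print(prob_map.shape)                           # torch.Size([21, 4, 4])
print(prob_map.sum(dim=0))                      # all ones: a valid per-pixel distribution
```

Each pixel of the ground truth thus becomes a degenerate distribution that puts probability 1 on the true class, so both discriminator inputs share the same CxHxW format.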

John1231983 commented 6 years ago

Thanks for your explanation! The paper also uses unlabelled data (semi-supervised) that has no ground truth. So how can they obtain the confidence map in the unlabelled case? As I understand it, they first forward the raw image (from the unlabelled dataset) through the segmentation network to produce the probability map. Then the probability map is fed to the discriminator network (which has been trained on the labelled dataset) to generate the confidence map. Am I right? And if so, how do they use the confidence map of the unlabelled dataset for training?

mohitzsh commented 6 years ago

The discriminator can generate a confidence map for any probability map (either produced by the segmentation network or derived directly from the ground truth segmentation mask). For unlabelled data, this probability map is taken from the segmentation network (which doesn't require a ground truth segmentation mask). In my understanding, the confidence map for unlabelled data is used as follows:

1. Generate a probability map and pixel-wise predictions using the segmentation network.
2. Use the probability map to generate the confidence map using the discriminator.
3. Based on the threshold (T_semi in the paper), define L_semi using the prediction from the segmentation network and the confidence map from the discriminator network (see equation (5)).
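The three steps above can be sketched roughly like this; the tensors are random stand-ins for the network outputs, and the masked cross-entropy is my reading of equation (5), not the repo's exact implementation:

```python
import torch
import torch.nn.functional as F

T_semi = 0.2                                     # confidence threshold from the paper
seg_logits = torch.randn(1, 21, 8, 8)            # stand-in for segmentation network output
confidence = torch.rand(1, 8, 8)                 # stand-in for discriminator confidence map

# Step 1: pixel-wise predictions become self-training (pseudo) labels
pseudo_label = seg_logits.argmax(dim=1)          # shape (1, 8, 8)

# Steps 2-3: trust only pixels where the discriminator is confident enough
trusted = (confidence > T_semi).float()
ce = F.cross_entropy(seg_logits, pseudo_label, reduction='none')  # per-pixel loss
l_semi = (trusted * ce).sum() / trusted.sum().clamp(min=1)        # masked mean

print(l_semi.item())
```

So L_semi is just a cross-entropy against the network's own predictions, restricted to pixels the discriminator judges "real enough".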

John1231983 commented 6 years ago

Thanks for your answer; it was very helpful while reading the paper. I have downloaded your code; could you please give me (and other people) the script to run your training file? It has many parameters that need to be set.

mohitzsh commented 6 years ago

I will soon write a more detailed README section on running the training/inference code. In the meanwhile, you can run:

```
python train_base train101 /path/to/dataset --mode base   # baseline training
python train_base train101 /path/to/dataset --mode adv    # adversarial training
python train_base train101 /path/to/dataset --mode semi   # semi-supervised training
```

I would like to mention that this repo is still under development and things may break. I will post an update when things are more stable.

hereForStudy commented 2 years ago

Thanks for the discussion, it helped me a lot