Britefury / cutmix-semisup-seg

Semi-supervised semantic segmentation needs strong, varied perturbations
MIT License

What is the "mask_arr" in the code? #3

Closed ZHKKKe closed 4 years ago

ZHKKKe commented 4 years ago

Hi. Thanks for your interesting work!

I have a question about the following code: https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/datapipe/seg_data.py#L90-L96 What is "mask_arr" here? All elements of this array are set to 255. Why is it defined here?

In the main experiment script, "mask_arr" is converted to "mask" and used in the following lines: https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L295

https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L297

https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L306

https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/train_seg_semisup_mask_mt.py#L324

In the above code, a "loss_mask" is generated for the unlabeled loss, i.e., the consistency loss. Are all elements of "loss_mask" equal to 1? Can you explain what it does?

Thanks in advance.

Britefury commented 4 years ago

Sure. The mask identifies the valid pixels for which consistency loss should be computed. All the pixels within the source image are valid, hence they have a value of 255 (or 1 when converted to float format). When we apply augmentation transformations such as rotation, part of the resulting image crop used for training may come from points that lie outside the bounds of the source image. Note that in datapipe/seg_transforms_cv.py we use OpenCV warpAffine to warp the mask, using a value of 0 for pixels outside the source image: https://github.com/Britefury/cutmix-semisup-seg/blob/44e81b3ae862d2da7d1c4df77fb274f8f1f0a861/datapipe/seg_transforms_cv.py#L372
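To illustrate how an all-255 mask picks up zeros after augmentation, here is a minimal NumPy-only sketch. It is not the repository's code: the hypothetical helper `shift_with_fill` stands in for the OpenCV affine warp and only handles integer translations, but it shows the same idea of filling pixels that fall outside the source image with 0.

```python
import numpy as np

def shift_with_fill(arr, dy, dx, fill=0):
    """Translate a 2D array by (dy, dx); uncovered pixels get `fill`.

    Hypothetical stand-in for an affine warp with a constant border value.
    """
    h, w = arr.shape
    out = np.full_like(arr, fill)
    # Overlapping source and destination windows after the shift.
    src_y0, src_y1 = max(0, -dy), min(h, h - dy)
    src_x0, src_x1 = max(0, -dx), min(w, w - dx)
    dst_y0, dst_y1 = max(0, dy), min(h, h + dy)
    dst_x0, dst_x1 = max(0, dx), min(w, w + dx)
    if src_y0 < src_y1 and src_x0 < src_x1:
        out[dst_y0:dst_y1, dst_x0:dst_x1] = arr[src_y0:src_y1, src_x0:src_x1]
    return out

# Before augmentation every pixel is valid, so the mask is all 255.
mask_arr = np.full((4, 4), 255, dtype=np.uint8)

# After shifting down by 1 and right by 2, the top row and the two
# left columns come from outside the source image and are set to 0.
warped = shift_with_fill(mask_arr, dy=1, dx=2)
```

Applying the same transformation to the image and to the mask keeps the two aligned, so a 0 in the warped mask marks exactly the pixels whose image content is padding.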

We then use this mask to prevent consistency loss from applying to pixels in the crop used for training that came from outside the source image, as shown in the lines that you quoted above.
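As a rough sketch of that masking step, assuming a squared-difference consistency loss between student and teacher class probabilities: the function name `masked_consistency_loss`, the array shapes, and the choice to average over all pixels are illustrative assumptions, and the repository's training script may normalise differently.

```python
import numpy as np

def masked_consistency_loss(student_prob, teacher_prob, mask_arr):
    """Consistency loss that ignores pixels from outside the source image.

    student_prob, teacher_prob: arrays of shape (C, H, W).
    mask_arr: uint8 array of shape (H, W), 255 for valid pixels, 0 otherwise.
    """
    # Convert the 0/255 mask to a 0/1 float mask.
    loss_mask = mask_arr.astype(np.float32) / 255.0
    # Per-pixel squared difference, summed over the class axis.
    per_pixel = ((student_prob - teacher_prob) ** 2).sum(axis=0)
    # Invalid pixels contribute zero to the loss.
    return float((per_pixel * loss_mask).mean())

student = np.zeros((2, 2, 2))
teacher = np.ones((2, 2, 2))

# Bottom row of the crop came from outside the source image.
partial_mask = np.array([[255, 255], [0, 0]], dtype=np.uint8)
full_mask = np.full((2, 2), 255, dtype=np.uint8)

loss_partial = masked_consistency_loss(student, teacher, partial_mask)
loss_full = masked_consistency_loss(student, teacher, full_mask)
```

With no geometric augmentation the mask stays all ones and has no effect; it only changes the loss when the crop includes padded regions.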

ZHKKKe commented 4 years ago

Got it. Many thanks.