speedinghzl / DSRG

Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing (CVPR 2018).
MIT License

Understanding the data in `localization-cues.pickle` file #20

Open sreesindhu-sabbineni opened 4 years ago

sreesindhu-sabbineni commented 4 years ago

I am trying to understand the data stored in the pickle file. I see that the values under the img_file_cues keys have different shapes and contain numbers from roughly 0 to 40. Can you please help me understand these values?

halbielee commented 4 years ago

The dictionary in the pickle file has two kinds of keys: one for image-level labels and the other for pixel-level labels.

The number at the start of each key is the image index. The indices follow the same order as the input_list.txt file.

For the first type, whose key has the form "NUM_labels", the value holds the image-level label(s) for the image. If the image has one object, there is only one label, between 1 and 20. If it has more than one object, there is more than one label.

For the second type, whose key has the form "NUM_cues" (the case in your question), the value holds pixel-level labels for the image. It is a tuple with shape [3, number of labeled pixels], and its three rows are [class, position along H, position along W].

The first row holds the class, 0 to 20 including background. The other two rows hold values between 0 and 40, which are the positions.
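As a rough sketch of the key layout described above, the snippet below groups a flat dictionary into one entry per image index. It assumes the keys look like "<index>_labels" and "<index>_cues"; the helper name `group_by_image` and the toy values are made up for illustration and do not come from the repository.

```python
import re

# Assumed key format, following the description above:
# "<index>_labels" and "<index>_cues".
KEY_PATTERN = re.compile(r"^(\d+)_(labels|cues)$")


def group_by_image(data):
    """Group a flat cue dictionary into {index: {"labels": ..., "cues": ...}}."""
    grouped = {}
    for key, value in data.items():
        match = KEY_PATTERN.match(key)
        if match is None:
            continue  # skip keys that do not follow the assumed naming
        index, kind = int(match.group(1)), match.group(2)
        grouped.setdefault(index, {})[kind] = value
    return grouped


# Toy stand-in for the unpickled dictionary (values invented for illustration).
toy = {
    "0_labels": [15],
    "0_cues": [[0, 15], [3, 20], [7, 22]],
    "1_labels": [8, 12],
    "1_cues": [[8], [10], [10]],
}

grouped = group_by_image(toy)
print(sorted(grouped))       # image indices found in the dictionary
print(grouped[1]["labels"])  # image-level labels for image 1
```

The index recovered from each key would then be matched against the corresponding line of input_list.txt.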

For example, if the cue value has shape (3, 4), there are 4 labeled pixels. If the value of that cue is [[0, 1, 2, 2], [0, 1, 1, 2], [2, 1, 3, 2]], the pixel values are image[0, 2] = 0, image[1, 1] = 1, image[1, 3] = 2, and image[2, 2] = 2.

I use the cue file for another weakly supervised segmentation network. You can check how it is used at https://github.com/halbielee/SEC_pytorch/blob/master/utils/dataset/voc.py, lines 73 to 79.
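The same example can be checked with a few lines of plain Python. This is only a sketch: `cue_to_pixels` is a hypothetical helper name, and the cue is written as nested lists rather than whatever container the pickle actually holds.

```python
def cue_to_pixels(cue):
    """Turn a [class, H, W] cue into a {(h, w): class} mapping."""
    classes, rows, cols = cue
    return {(h, w): c for c, h, w in zip(classes, rows, cols)}


# The cue from the example: row 0 = class, row 1 = H position, row 2 = W position.
cue = [[0, 1, 2, 2], [0, 1, 1, 2], [2, 1, 3, 2]]

pixels = cue_to_pixels(cue)
for (h, w), c in sorted(pixels.items()):
    print(f"image[{h}, {w}] = {c}")
```

Reading the cue column by column gives one labeled pixel per column, which is why a shape of (3, 4) means four labeled pixels.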

sreesindhu-sabbineni commented 4 years ago

Thank you so much for the clear explanation. I have one more question. Can you please point me to the JPEG images of the PASCAL VOC augmented dataset? Everywhere I look I find the link below, but it only contains the ground-truth segmentation labels, not the original JPEG images.

https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0

Also, I am assuming that the seed file only contains cues for images of size 41 x 41. Is that correct?

halbielee commented 4 years ago

The JPEG images for the Pascal VOC augmented dataset you mentioned are in the official classification dataset (specifically 2012). You can download the JPEG images (training/val set) from the link below: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html

The seed file contains maps whose shape is 41 x 41, and each element is the corresponding class, including background. Note that some parts of the map are missing where the class is not certain. For details, please see my previous answer.
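A minimal sketch of expanding one cue into a dense 41 x 41 map, with the uncertain positions left at a fill value. The fill value 255 and the names `cue_to_seed_map` and `UNLABELED` are assumptions chosen for illustration; the actual code in either repository may use a different marker and a NumPy array instead of nested lists.

```python
MAP_SIZE = 41    # seed maps are 41 x 41, as stated in the thread
UNLABELED = 255  # hypothetical marker for positions without a cue


def cue_to_seed_map(cue, size=MAP_SIZE, fill=UNLABELED):
    """Expand a [class, H, W] cue into a dense size x size map of class ids."""
    seed_map = [[fill] * size for _ in range(size)]
    classes, rows, cols = cue
    for c, h, w in zip(classes, rows, cols):
        seed_map[h][w] = c
    return seed_map


cue = [[0, 1, 2, 2], [0, 1, 1, 2], [2, 1, 3, 2]]
seed_map = cue_to_seed_map(cue)

# Count the positions that received no label from the cue.
unlabeled = sum(row.count(UNLABELED) for row in seed_map)
print(len(seed_map), len(seed_map[0]), unlabeled)
```

Using a fill value outside the class range 0 to 20 keeps unlabeled positions distinguishable from background, which is class 0.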