NVlabs / SPADE

Semantic Image Synthesis with SPADE
https://nvlabs.github.io/SPADE/

COCO-Stuff dataset and inst maps #48

Closed RBTlove11 closed 5 years ago

RBTlove11 commented 5 years ago

Hi, I have read the code and the paper carefully, but three questions still confuse me. Would anybody be so kind as to guide me?

  1. I think COCO-Stuff covers only the 92 stuff categories, not 182 categories, for the stuff task, as described in my blog: https://blog.csdn.net/Scarlett_Guan/article/details/89916692. Maybe the author does not understand the COCO-Stuff dataset accurately? Or, if my understanding is wrong, please tell me. When I read the code, the only annotation file used is instances_train2017.json, which is an annotation file for instance segmentation and contains only the 80 thing categories. Printing the categories with the COCO API confirms this. So which task is it, instance segmentation or the stuff task? And is it 92, 80, or 182 categories?

  2. I think the inst map is unnecessary: the label map alone is enough. Judging from the script coco_generate_instance_map.py, the inst map is almost the same as the label map.

  3. Why is one added to the label in SPADE/util/coco.py? It seems quite unnecessary.

Thank you so much!

taesungp commented 5 years ago

  1. The COCO dataset was originally created with 92 labels, which covered only foreground objects. At that point, there was no label for the background. Later, Caesar et al. augmented the dataset by adding more classes and also labeling the background. This new dataset is called COCO-Stuff. We used this new dataset, which is different from the original COCO dataset.
  2. The instance map is not necessary, but it may help generate better images. There is an instance map only for foreground objects, and it is only meaningful when two objects of the same class overlap. So it is true that in most cases the boundary between different labels provides enough signal.
  3. It's just to account for the difference between 0-indexing and 1-indexing.

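
To illustrate point 2, here is a minimal NumPy sketch. It is not taken from the SPADE code: the helper `instance_edges` and the toy arrays are made up for illustration. It shows that when two objects of the same class touch, the label map has no boundary between them, while the instance map does.

```python
import numpy as np

def instance_edges(id_map: np.ndarray) -> np.ndarray:
    """Mark pixels whose left/right/up/down neighbour has a different id."""
    edge = np.zeros(id_map.shape, dtype=bool)
    # Horizontal neighbours: mark both pixels on each side of a change.
    diff_h = id_map[:, 1:] != id_map[:, :-1]
    edge[:, 1:] |= diff_h
    edge[:, :-1] |= diff_h
    # Vertical neighbours: same idea along the other axis.
    diff_v = id_map[1:, :] != id_map[:-1, :]
    edge[1:, :] |= diff_v
    edge[:-1, :] |= diff_v
    return edge

# Toy 1x6 scene (hypothetical): two touching "person" objects on background.
label_map = np.array([[0, 1, 1, 1, 1, 0]])  # class ids: both persons are class 1
inst_map = np.array([[0, 1, 1, 2, 2, 0]])   # instance ids: person 1 and person 2

label_edges = instance_edges(label_map)
inst_edges = instance_edges(inst_map)

# Boundary pixels that only the instance map can provide.
extra_edges = inst_edges & ~label_edges
print(extra_edges.astype(int))
```

In this toy case the only extra boundary pixels are the two in the middle, where the two persons meet. When objects of the same class never touch, `extra_edges` is empty and the label map already carries the same boundaries.
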
RBTlove11 commented 5 years ago

Thank you so much!

  1. I had misunderstood the COCO-Stuff dataset. I thought the annotations consisted only of instances_train2017.json (an instance segmentation annotation file containing only the 80 thing categories), because that is the only file the script coco_generate_instance_map.py reads. Now I understand that the script is used only to generate the instance map, which covers only foreground objects, so it involves only the 80 thing categories.

  2. Your explanation is very clear. The script coco_generate_instance_map.py is used only to draw the boundary between overlapping objects of the same class, and only for foreground objects.

  3. I'm still confused: I have only seen 0-indexing on the official COCO website and other websites, and I have never seen 1-indexing before.

RBTlove11 commented 5 years ago

Besides, you use the 182 categories, which contain both foreground and background objects, and you also use the script coco_generate_instance_map.py to generate an instance map that separates overlapping foreground objects of the same class. So why don't you use the panoptic task of the COCO dataset directly? I think it covers all of the above.

taesungp commented 5 years ago

I wouldn't worry too much about 0-indexing versus 1-indexing. It's just for easier coding across many datasets. I think some datasets were 1-indexed, but to eventually convert the labels to one-hot vectors, they should be changed to 0-indexed.
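
As a rough illustration of why the shift matters (a sketch only, not the SPADE implementation; `to_one_hot` and the toy labels are hypothetical), a 1-indexed label map has to be moved to 0-indexed before it can index into a one-hot table:

```python
import numpy as np

def to_one_hot(labels_1_indexed: np.ndarray, num_classes: int) -> np.ndarray:
    """Convert a 1-indexed (H, W) label map to a (H, W, C) one-hot array."""
    labels = labels_1_indexed.astype(np.int64) - 1  # now in 0..num_classes-1
    if labels.min() < 0 or labels.max() >= num_classes:
        raise ValueError("label outside the expected 1..num_classes range")
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype=np.float32)[labels]

# Toy 2x2 label map with classes numbered 1..3 (assumed 1-indexed).
labels = np.array([[1, 3],
                   [2, 1]])
one_hot = to_one_hot(labels, num_classes=3)
print(one_hot.shape)
print(one_hot[0, 1])
```

Without the subtraction, class 3 would index past the end of a 3-channel table, which is why code that handles several datasets tends to normalize everything to one convention.
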

What do you mean by the panoptic task of coco dataset directly? I didn't understand what you mean.

RBTlove11 commented 5 years ago

@taesungp If you read about the COCO dataset carefully on the official website, you will find three tasks:

  1. instance segmentation [image: instance segmentation]

  2. semantic segmentation [image: semantic segmentation]

  3. panoptic segmentation [image: panoptic segmentation]

If you compare the second and third pictures carefully, you will find that the third picture can distinguish objects of the same class. Maybe the third picture is just what you want?

RBTlove11 commented 5 years ago

@taesungp Why can the instance map help generate better images? Compared with the label map in the script, the "inst map" doesn't offer any additional information. The "label map" looks like this: [image: 微信图片_20190517113523]

The output "inst map" just draws the boundaries of the persons and fills them in, while the boundaries of the persons already exist in the label map.

So I'm confused, and my guess is that your understanding of the label map may not be accurate. Maybe in your view the "label map" looks like this: [image: 微信图片_20190517114028], and the "inst map" adds the boundaries between the persons.

But we can get a semantic segmentation from an instance segmentation, while we can't get an instance segmentation from a semantic segmentation, because we don't have enough information.
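
A small sketch of this asymmetry, using made-up toy arrays and a hypothetical helper `semantic_from_instances`: two different instance maps collapse to the same semantic map, so the semantic map alone cannot tell us which one we started from.

```python
import numpy as np

def semantic_from_instances(inst_map: np.ndarray, inst_to_class: dict) -> np.ndarray:
    """Replace every instance id with the class id of that instance."""
    sem = np.zeros_like(inst_map)
    for inst_id, class_id in inst_to_class.items():
        sem[inst_map == inst_id] = class_id
    return sem

PERSON = 1  # illustrative class id

# Two different toy scenes: three persons side by side vs. one wide person.
three_people = np.array([[1, 1, 2, 2, 3, 3]])
one_wide_person = np.array([[1, 1, 1, 1, 1, 1]])

sem_a = semantic_from_instances(three_people, {1: PERSON, 2: PERSON, 3: PERSON})
sem_b = semantic_from_instances(one_wide_person, {1: PERSON})

# Same semantic map, different instance maps: the mapping is many-to-one.
print((sem_a == sem_b).all())
```
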

If my understanding is not accurate, please don't hesitate to tell me!

RBTlove11 commented 5 years ago

> I wouldn't worry too much about 0-indexing and 1-indexing stuff. It's just for easier coding across many datasets. I think some datasets were 1-indexed, but eventually to convert it to 1-hot vector, it should be changed to 0-indexed.
>
> What do you mean by the panoptic task of coco dataset directly? I didn't understand what you mean.

In my opinion, there are three main tasks in image segmentation, as I described in my blog: https://blog.csdn.net/Scarlett_Guan/article/details/89918328

  1. semantic segmentation: the simplest. It classifies every pixel in the picture (foreground and background objects), but if there are three persons in the picture, it can't distinguish them.
  2. instance segmentation: foreground objects only. If there are three persons in the picture, it can distinguish them.
  3. panoptic segmentation: a combination and upgrade of semantic and instance segmentation. It classifies every pixel in the picture (foreground and background objects), and if there are three persons in the picture, it can distinguish them.

I think that in effect you end up using panoptic labels. And the COCO-Stuff dataset has panoptic annotations. So maybe you could just use the panoptic labels directly?
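
To make the suggestion concrete, here is a rough sketch of how a panoptic-style annotation could be split into the two maps discussed in this thread. Everything in it is an assumption for illustration: the toy segment-id array, the simplified `segments` records (loosely modelled on a list of per-segment entries, not the exact COCO panoptic file format), and the helper `split_panoptic`.

```python
import numpy as np

def split_panoptic(seg_ids: np.ndarray, segments: list, thing_classes: set):
    """Derive a semantic label map and an instance map from per-segment records."""
    semantic = np.zeros_like(seg_ids)
    instance = np.zeros_like(seg_ids)  # 0 means "no instance" (stuff or empty)
    next_instance = 1
    for seg in segments:
        mask = seg_ids == seg["id"]
        semantic[mask] = seg["category_id"]
        # Only foreground ("thing") segments get their own instance id.
        if seg["category_id"] in thing_classes:
            instance[mask] = next_instance
            next_instance += 1
    return semantic, instance

PERSON, SKY = 1, 2  # illustrative class ids

# Toy 1x6 image: two persons followed by sky, each with its own segment id.
seg_ids = np.array([[10, 10, 20, 20, 30, 30]])
segments = [
    {"id": 10, "category_id": PERSON},
    {"id": 20, "category_id": PERSON},
    {"id": 30, "category_id": SKY},
]

semantic, instance = split_panoptic(seg_ids, segments, thing_classes={PERSON})
print(semantic)
print(instance)
```

Under these assumptions, one panoptic annotation yields both the label map and the inst map, which is the point of the question above.
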