KuangJuiHsu / DeepCO3

[CVPR19] DeepCO3: Deep Instance Co-segmentation by Co-peak Search and Co-saliency (Oral paper)

Number of images required to obtain results? #2

Closed Harry-Zhi closed 4 years ago

Harry-Zhi commented 5 years ago

Hi,

Great work and very impressive results. I have three questions about the paper and hope you can help me with them:

  1. In your experience, how many images of a given class are needed for DeepCO3 to converge well? In the dataset you provide, there are more images of cows, sheep and horses than of trains, airplanes and buses, and the results on the categories with more images seem better than on the others. (I assume a separate DeepCO3 model is trained for each class rather than one model on a mix of classes.)

  2. If I understand correctly, the whole DeepCO3 pipeline is fully unsupervised, isn't it?

  3. In Sec. 3.3 you mention that you use MCG for object proposals and that it is unsupervised. However, MCG seems to use the BSD dataset to tune its parameters. Do you think this small amount of supervised learning affects whether MCG, or DeepCO3, can be called "unsupervised"?

Thanks.

KuangJuiHsu commented 5 years ago

Thanks for your interest.

  1. Your assumption is right: I train a separate model for each category. In my experience, 50 images are enough, but more images are better.

  2. Yes, you are right, but some people, including reviewers, consider it weakly supervised because the input images must all contain the same category. Therefore, we do not claim in the paper that our method is unsupervised.

  3. Yes, you are right; that should be a typo. I use MCG only so that the proposed method can be compared fairly with PRM. Besides, MCG doesn't use instance-aware masks for its parameter search, so we think using MCG is acceptable for this task.
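
The per-category setup in point 1 can be sketched as follows. This is only an illustration, not code from the DeepCO3 repository: the directory layout (one folder per category), the function names `group_by_category` and `small_categories`, and the use of 50 as a threshold are all assumptions made for the example, with 50 taken from the rule of thumb above.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Rule of thumb from the reply above; an assumed threshold, not a hard limit.
MIN_IMAGES = 50


def group_by_category(image_paths):
    """Group image paths by parent folder name (assumed to be the category)."""
    groups = defaultdict(list)
    for path in image_paths:
        groups[PurePosixPath(path).parent.name].append(path)
    return dict(groups)


def small_categories(groups, min_images=MIN_IMAGES):
    """Return the sorted names of categories with fewer than min_images images."""
    return sorted(name for name, imgs in groups.items() if len(imgs) < min_images)


if __name__ == "__main__":
    # Hypothetical paths, used only to show the call pattern.
    paths = [f"data/cow/{i}.jpg" for i in range(60)]
    paths += [f"data/bus/{i}.jpg" for i in range(10)]
    groups = group_by_category(paths)
    # One model would then be trained per key of `groups`.
    print(small_categories(groups))
```

Categories flagged by `small_categories` are the ones where, following the reply above, weaker results might be expected.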

Harry-Zhi commented 5 years ago

Thank you very much for the detailed information.

Is it correct to summarise that the whole "training" process in DeepCO3 is designed to find the peaks without the need for image class labels (which PRM requires), which makes DeepCO3 more generalisable? After the peaks are found, the subsequent steps, peak back-propagation (BP) and MCG object proposals, are similar to those of PRM.

May I ask one more question? The peaks can be seen as localising potential instances, but the final instance segmentation quality also relies heavily on the quality of the MCG object proposals (i.e., the shape of the masks). If the localisation is good but the proposal is bad, do you have any ideas for avoiding this limitation caused by MCG?

KuangJuiHsu commented 5 years ago

  1. Yes. The goal of this paper is to find the peaks across the given image set, and I follow the same instance mask generation procedure as PRM.

  2. You are right, and the reviewers asked about this problem too. The selected MCG proposals can be used as pseudo training data to train Mask R-CNN or another instance segmentation algorithm. The results can be generated automatically, but they may also depend on the initialization. Still, I think this approach can reduce the effect of poor proposals.
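
The pseudo training data idea in point 2 could start from something like the sketch below, which turns each selected proposal mask into a box-plus-mask annotation. It is a rough illustration under stated assumptions rather than part of DeepCO3: the names `mask_to_box` and `build_pseudo_annotations`, the dictionary layout, and the inclusive pixel-coordinate box convention are all hypothetical, and the result would still need converting to whatever format the chosen instance segmentation trainer expects.

```python
import numpy as np


def mask_to_box(mask):
    """Return (x_min, y_min, x_max, y_max), inclusive, of the nonzero pixels.

    Returns None for an empty mask.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def build_pseudo_annotations(selected_masks):
    """Convert selected proposal masks into pseudo ground-truth records.

    selected_masks: iterable of 2-D arrays, one per selected proposal
    (assumed to come from the proposal selection step).
    Empty masks are skipped.
    """
    annotations = []
    for mask in selected_masks:
        box = mask_to_box(mask)
        if box is None:
            continue
        annotations.append({"box": box, "mask": np.asarray(mask).astype(bool)})
    return annotations


if __name__ == "__main__":
    # Toy mask standing in for one selected proposal.
    toy = np.zeros((5, 6), dtype=np.uint8)
    toy[1:3, 2:5] = 1
    print(build_pseudo_annotations([toy])[0]["box"])
```

Because the pseudo labels inherit any errors in the selected proposals, the quality of the retrained model would still depend on the initial selection, as noted in the reply.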

Harry-Zhi commented 5 years ago

Thank you for the detailed reply!