xyupeng / ContrastiveCrop

[CVPR 2022 Oral] Crafting Better Contrastive Views for Siamese Representation Learning

Some questions about the paper #4

Closed: Khoa-NT closed this issue 2 years ago

Khoa-NT commented 2 years ago

Thank you for the interesting and easy-to-understand paper. May I ask some questions?

1/ I still don't understand what the class score mentioned in Section 3.4 is. Could you explain it further? I checked the code but couldn't find it. Please correct me if I missed it.

2/ It's interesting that the learning rate for training the linear classifier is 10. Do you have any findings on this, or is it a heuristic configuration?

3/ What is the red plot mentioned in Section 4.4, Ablation Studies / Semantic-aware Localization?

We also make comparison with RandomCrop that does not use localization (i.e., k = 0), and ground truth bounding boxes (the red plot).

Is it another experiment that was removed from Fig. 6a?

Thank you

xyupeng commented 2 years ago

Hi Khoa-NT, thank you for your interest and your questions.

  1. Sorry for the confusion. By class score we mean the class probability after softmax (a real number within (0, 1)). We get the class score by feeding a crop to a standard ResNet-50 trained with ImageNet labels. We didn't put it in the code since it is not part of the main experiments.
  2. The linear classifier learning rate is adapted from MoCo, where the linear cls lr is 30.0. We did a little parameter tuning to make it suitable for all models on small datasets (see the sketch below).
  3. It's a mistake that we did not remove the latter half of that sentence. Please ignore it.
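For reference, a minimal linear-evaluation sketch in the spirit of the MoCo protocol mentioned in point 2: the pretrained backbone is frozen and only a linear head is trained with plain SGD at lr = 10.0. The names `encoder`, `feat_dim`, and `num_classes`, as well as the momentum/weight-decay values, are illustrative assumptions, not the repo's exact configuration.

```python
import torch
import torch.nn as nn

def build_linear_eval(encoder, feat_dim=2048, num_classes=100, lr=10.0):
    # Freeze the pretrained backbone; only the linear head is trained.
    for p in encoder.parameters():
        p.requires_grad = False
    encoder.eval()

    classifier = nn.Linear(feat_dim, num_classes)
    # MoCo-style linear evaluation uses plain SGD with a large learning rate
    # and no weight decay; lr = 10.0 follows the small-dataset setting above.
    optimizer = torch.optim.SGD(classifier.parameters(), lr=lr,
                                momentum=0.9, weight_decay=0.0)
    return classifier, optimizer
```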
Khoa-NT commented 2 years ago

Hi @xyupeng, thank you for the details and congratulations on the Oral paper.

In 1)

Sorry for the confusion. By class score we mean the class probability after softmax (a real number within (0, 1)). We get the class score by feeding a crop to a standard ResNet-50 trained with ImageNet labels. We didn't put it in the code since it is not part of the main experiments.

If I understand correctly, the class score is the argmax class probability of the prediction (after softmax). Did you check whether the predicted class corresponding to that probability is the same as the GT? I just wonder: if the predicted class were wrong, then maybe the semantic information would not be useful.

xyupeng commented 2 years ago

The class score is the probability at the index of the GT class of that crop/image, not the argmax index. We use this score as an indicator of how much categorized semantic information the input crop contains.
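To make the definition above concrete, here is a minimal sketch of how such a class score could be computed with a torchvision ImageNet-pretrained ResNet-50. The helper name `class_score` and the preprocessing choices are assumptions for illustration, not the authors' exact code.

```python
import torch
from torchvision import models, transforms

# Standard ImageNet evaluation preprocessing (assumed, not from the repo).
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# ImageNet-pretrained ResNet-50 used purely as a scorer.
model = models.resnet50(pretrained=True).eval()

@torch.no_grad()
def class_score(crop_pil, gt_class: int) -> float:
    # Softmax probability at the ground-truth class index, not the argmax class.
    x = preprocess(crop_pil).unsqueeze(0)   # [1, 3, 224, 224]
    probs = torch.softmax(model(x), dim=1)  # [1, 1000]
    return probs[0, gt_class].item()
```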

Khoa-NT commented 2 years ago

Thank you for clarifying. I got it.