clovaai / wsolevaluation

Evaluating Weakly Supervised Object Localization Methods Right (CVPR 2020)

ImagenetV2 has a lot of incorrect Bbox annotation information? #39


fengxun2017 commented 3 years ago

I used the bbox annotation information provided by you for testing, and found many wrong bboxes. For example:

100/5.jpeg,73,214,148,299
100/5.jpeg,180,85,200,111
100/5.jpeg,169,137,231,195
100/5.jpeg,394,163,462,207
100/5.jpeg,316,134,374,160
100/5.jpeg,136,206,171,235
100/5.jpeg,242,146,267,164
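For reference, each annotation line has the format image_path,x0,y0,x1,y1 in absolute pixel coordinates. A minimal sketch for loading and grouping them per image (the filename val_annotations.txt here is a placeholder for whichever annotation file you use):

from collections import defaultdict

# Each line is "image_path,x0,y0,x1,y1" in absolute pixel coordinates.
boxes = defaultdict(list)
with open('val_annotations.txt') as f:  # hypothetical filename
    for line in f:
        path, *coords = line.strip().split(',')
        boxes[path].append(tuple(int(c) for c in coords))

print(boxes['100/5.jpeg'])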

junsukchoe commented 3 years ago

Hello,

Thanks for your interest in our work! I just checked 100/5.jpeg, but I cannot find any wrong box: [attached: 100/5.jpeg with the annotated boxes drawn]

Please check again and let me know if you find anything wrong from the box annotations.

huang422 commented 2 years ago

I cropped all the bboxes and also found a lot of incorrect bbox annotations:

val2/2/8.jpeg,173,109,462,224
val2/58/5.jpeg,112,220,349,310
val2/35/6.jpeg,36,38,331,232
val2/36/1.jpeg,72,73,451,277
val2/48/9.jpeg,0,28,420,257
...

Thanks

ShaileshSridhar2403 commented 2 years ago

Hi @junsukchoe, I also encountered the same problem. Here are the bounding boxes on 100/5.jpeg, visualized: [attached image]

Bounding boxes as provided here are:

38,41,66,68
73,214,148,299
180,85,200,111
169,137,231,195
394,163,462,207
316,134,374,160
136,206,171,235
242,146,267,164

How did you visualize the bboxes in your comment above? Is there a step we are missing?

Here is another example: 101/8.jpeg [attached image]

junsukchoe commented 2 years ago

Hi @ShaileshSridhar2403, @fengxun2017!

I just reproduced the above problems on the ImageNetV2 dataset. The cause is that the actual image size differs from the size listed in our 'image_sizes.txt'. For example, the size of 100/5.jpeg (actually it is now 100/1f3075074e2c005496d52faa07089f3cea130dee.jpeg) is (384, 256), while the size listed in 'image_sizes.txt' is (500, 333). Hence, you first have to resize the image to (500, 333) and then draw the bboxes to get the correct qualitative samples.

Here is an example code snippet:

import cv2
import matplotlib.pyplot as plt

# Boxes are annotated in the coordinate frame of the size listed in
# image_sizes.txt, i.e. (500, 333) for this image.
boxes = [(38, 41, 66, 68), (73, 214, 148, 299), (180, 85, 200, 111),
         (169, 137, 231, 195), (394, 163, 462, 207), (316, 134, 374, 160),
         (136, 206, 171, 235), (242, 146, 267, 164)]

img = cv2.cvtColor(cv2.imread('100/5.jpeg'), cv2.COLOR_BGR2RGB)
img = cv2.resize(img, dsize=(500, 333))  # resize to the listed size first
for (x0, y0, x1, y1) in boxes:
    img = cv2.rectangle(img, (x0, y0), (x1, y1), (255, 0, 0), 3)

plt.imshow(img)
plt.show()

Then, you will get the following image: [result image attached]
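Equivalently, you can keep the image at its actual size and scale the box coordinates instead; here is a minimal sketch of that approach (an illustration, not part of the repo), assuming the listed size (500, 333) from image_sizes.txt:

import cv2
import matplotlib.pyplot as plt

# Boxes in listed-size (500, 333) coordinates, from the annotation file.
boxes = [(38, 41, 66, 68), (73, 214, 148, 299), (180, 85, 200, 111),
         (169, 137, 231, 195), (394, 163, 462, 207), (316, 134, 374, 160),
         (136, 206, 171, 235), (242, 146, 267, 164)]

img = cv2.cvtColor(cv2.imread('100/5.jpeg'), cv2.COLOR_BGR2RGB)
h, w = img.shape[:2]           # actual size, here (256, 384)
sx, sy = w / 500.0, h / 333.0  # scale from listed size to actual size

for x0, y0, x1, y1 in boxes:
    p0 = (int(x0 * sx), int(y0 * sy))
    p1 = (int(x1 * sx), int(y1 * sy))
    cv2.rectangle(img, p0, p1, (255, 0, 0), 2)

plt.imshow(img)
plt.show()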

Unfortunately, it is currently unknown why this discrepancy happened. However, I would like to note that the evaluation results from our code are still correct, since we convert the bboxes to relative coordinates for evaluation using the following function: https://github.com/clovaai/wsolevaluation/blob/e00842f8e9d86588d45f8e3e30c237abb364bba4/evaluation.py#L95
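For intuition, here is a minimal sketch of the relative-coordinate idea (just an illustration, not the exact function linked above): dividing the box coordinates by the listed image size yields coordinates in [0, 1] that remain valid regardless of the stored resolution.

def normalize_bbox(x0, y0, x1, y1, listed_w, listed_h):
    """Convert absolute pixel coordinates to relative [0, 1] coordinates."""
    return (x0 / listed_w, y0 / listed_h, x1 / listed_w, y1 / listed_h)

# Example: a box of 100/5.jpeg with the listed size (500, 333).
print(normalize_bbox(38, 41, 66, 68, 500, 333))
# -> approximately (0.076, 0.123, 0.132, 0.204)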

I am sincerely sorry for the late reply. Please let me know if you have any further questions.