rbgirshick / py-faster-rcnn

Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version

minival misses annotations for some images? #522

Open vadimkantorov opened 7 years ago

vadimkantorov commented 7 years ago

It seems that instances_minival2014.json is missing annotations for some images, for example the image with id 226111:

grep -o '.\{100\}id": 226111' ~/coco/instances_minival2014.json
# CO_val2014_000000226111.jpg", "height": 640, "width": 480, "date_captured": "2013-11-17 01:19:58", "id": 226111
# .02775, "iscrowd": 0, "image_id": 147740, "bbox": [66.84, 167.6, 61.41, 186.69], "category_id": 1, "id": 226111

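The first line is the image definition. The second is an annotation that belongs to image 147740 and merely has 226111 as its own annotation id, so no annotation refers to image 226111.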

A quick script found ~60 images without annotations. Should I use the regular COCO annotations for these images? For image 226111 I could only find captions, in captions_val2014.json.

Or should I just skip these images? I am probably missing something very obvious here.

I downloaded the minival annotations via the link in the README.
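For anyone who wants to reproduce the count, here is a minimal sketch of such a check. It assumes the file follows the usual COCO instances layout (top-level `images` and `annotations` lists, each annotation carrying an `image_id`); the function name is made up for illustration and this is not the script used above.

```python
import json


def images_without_annotations(dataset):
    """Return the sorted ids of images that no annotation points to via image_id."""
    # Collect every image id that is referenced by at least one annotation.
    annotated = {ann["image_id"] for ann in dataset.get("annotations", [])}
    # An annotation's own "id" is unrelated to image ids, so only image_id counts.
    return sorted(img["id"] for img in dataset["images"] if img["id"] not in annotated)


def load_dataset(path):
    """Load a COCO-style JSON file (the path is whatever you downloaded)."""
    with open(path) as f:
        return json.load(f)
```

Calling `images_without_annotations(load_dataset(path))` on the minival file should list the affected image ids. Matching on `image_id` rather than `id` avoids the false hit that the grep above runs into.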

Yc174 commented 7 years ago

I want to download instances_minival2014.json and instances_valminusminival.json from the links given by rbgirshick, but the downloads all fail. Could you upload them to Google Drive? Thank you. @vadimkantorov

insikk commented 6 years ago

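@Yc174 You can download them from here: https://github.com/insikk/coco_dataset_trainval35k

I also suspect something is off in the minival annotations: they contain duplicate annotation ids, so when the file is indexed by annotation id, the later entry overwrites the earlier one. The duplicates are: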

anno id duplicates
{'iscrowd': 1, 'id': 908400416500.0, 'image_id': 416451, 'size': [480, 640]}, 'bbox': [479, 83, 160, 391], 'category_id': 84, 'area': 33103}
overwrite
{'iscrowd': 1, 'id': 908400416500.0, 'image_id': 416534, 'size': [355, 640]}, 'bbox': [271, 169, 112, 87], 'category_id': 84, 'area': 4555}

anno id duplicates
{'iscrowd': 1, 'id': 900100520700.0, 'image_id': 520707,  'size': [480, 640]}, 'bbox': [235, 142, 277, 41], 'category_id': 1, 'area': 843}
overwrite
{'iscrowd': 1, 'id': 900100520700.0, 'image_id': 520659, 'size': [480, 640]}, 'bbox': [113, 49, 526, 340], 'category_id': 1, 'area': 12581}

anno id duplicates
{'iscrowd': 1, 'id': 900100463500.0, 'image_id': 463542, 'size': [426, 640]}, 'bbox': [11, 0, 628, 158], 'category_id': 1, 'area': 16282}
overwrite
{'iscrowd': 1, 'id': 900100463500.0, 'image_id': 463522,  'size': [480, 640]}, 'bbox': [375, 205, 97, 58], 'category_id': 1, 'area': 1836}

anno id duplicates
{'iscrowd': 1, 'id': 900100193200.0, 'image_id': 193245, 'size': [375, 500]}, 'bbox': [358, 187, 55, 27], 'category_id': 1, 'area': 743}
overwrite
{'iscrowd': 1, 'id': 900100193200.0, 'image_id': 193181, 'size': [640, 426]}, 'bbox': [1, 0, 424, 163], 'category_id': 1, 'area': 39259}

anno id duplicates
{'iscrowd': 1, 'id': 900100259600.0, 'image_id': 259571,  'size': [281, 500]}, 'bbox': [62, 55, 307, 88], 'category_id': 1, 'area': 1113}
overwrite
{'iscrowd': 1, 'id': 900100259600.0, 'image_id': 259597,  'size': [312, 640]}, 'bbox': [0, 32, 579, 205], 'category_id': 1, 'area': 28188}
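
A listing like the one above can be produced with a few lines of Python. This is only a sketch, assuming the annotations are the plain list of dicts found under `annotations` in the JSON file; the function name is hypothetical.

```python
from collections import defaultdict


def duplicate_annotation_ids(annotations):
    """Map each annotation id that occurs more than once to the image_ids using it."""
    images_by_ann_id = defaultdict(list)
    for ann in annotations:
        images_by_ann_id[ann["id"]].append(ann["image_id"])
    # Keep only the ids shared by two or more annotations.
    return {
        ann_id: image_ids
        for ann_id, image_ids in images_by_ann_id.items()
        if len(image_ids) > 1
    }
```

Working on the raw list matters here: an index keyed by annotation id keeps only one entry per id, so the collision has to be detected before the index is built.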

csnemes2 commented 6 years ago

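This can easily be fixed from val2014. Take the first case, for example: in val2014 the two annotations have distinct ids, so the duplicated 908400416500 should be 908400416451 for image 416451 and 908400416534 for image 416534. In the session below, cocox is val2014 and coco is the minival file: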

ipdb> cocox = COCO('/home/csn/COCO/annotations/instances_val2014.json')
loading annotations into memory...
Done (t=3.43s)
creating index...
index created!
ipdb> cocox.getAnnIds(imgIds=416451)
[93604, 94468, 114096, 445644, 502194, 1111885, 1113479, 1496112, 1498357, 1504164, 1649397, 1656630, 1657898, 1658919, 1661224, 1661329, 1661380, 1662061, 1662157, 1662458, 1662692, 1663080, 1972038, 1974450, 1974795, 1974876, 2143961, 908400416451]
ipdb> cocox.getAnnIds(imgIds=416534)
[19234, 25117, 29326, 102024, 113562, 1650029, 1650625, 1654683, 1658309, 1658698, 1658758, 1660680, 1661532, 1661844, 1662281, 1662814, 1985230, 2142949, 908400416534]
ipdb> coco.getAnnIds(imgIds=416534)
[19234, 25117, 29326, 102024, 113562, 1650029, 1650625, 1654683, 1658309, 1658698, 1658758, 1660680, 1661532, 1661844, 1662281, 1662814, 1985230, 2142949, 908400416500.0]
ipdb> coco.getAnnIds(imgIds=416451)
[93604, 94468, 114096, 445644, 502194, 1111885, 1113479, 1496112, 1498357, 1504164, 1649397, 1656630, 1657898, 1658919, 1661224, 1661329, 1661380, 1662061, 1662157, 1662458, 1662692, 1663080, 1972038, 1974450, 1974795, 1974876, 2143961, 908400416500.0]

csnemes2 commented 6 years ago

I corrected the duplicates (based on val2014); you can now download my corrected minival2014. See the readme for the correction process.
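
For readers who would rather patch their own copy, one possible way to do such a correction is sketched below. It is not necessarily the process described in the readme. It assumes that a minival annotation and its val2014 counterpart share the same `image_id`, `category_id` and `bbox`, and it only touches an id when exactly one val2014 annotation matches. The function name is made up for illustration.

```python
from collections import Counter, defaultdict


def repair_duplicate_ids(minival_anns, val_anns):
    """Give duplicated minival annotations the id of their val2014 counterpart.

    Matching key (an assumption): same image_id, category_id and bbox.
    Modifies minival_anns in place and returns how many ids were changed.
    """
    id_counts = Counter(ann["id"] for ann in minival_anns)

    # Index the val2014 annotation ids by the matching key.
    val_ids_by_key = defaultdict(list)
    for ann in val_anns:
        key = (ann["image_id"], ann["category_id"], tuple(ann["bbox"]))
        val_ids_by_key[key].append(ann["id"])

    changed = 0
    for ann in minival_anns:
        if id_counts[ann["id"]] < 2:
            continue  # id is unique, nothing to repair
        key = (ann["image_id"], ann["category_id"], tuple(ann["bbox"]))
        candidates = val_ids_by_key.get(key, [])
        # Only repair when the val2014 match is unambiguous.
        if len(candidates) == 1 and candidates[0] != ann["id"]:
            ann["id"] = candidates[0]
            changed += 1
    return changed
```

Annotations without an unambiguous match are left as they are, so it is worth re-running a duplicate check afterwards and fixing any remaining cases by hand.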