Open vadimkantorov opened 7 years ago
I want to download instances_minival2014.json and instances_valminusminival.json from the links given by rbgirshick, but all of the downloads fail. Could you upload them to Google Drive? Thank you. @vadimkantorov
@Yc174 You can download them from here: https://github.com/insikk/coco_dataset_trainval35k I also suspect something fishy in the minival annotations: the file contains duplicate annotation ids. They are:
anno id duplicates
{'iscrowd': 1, 'id': 908400416500.0, 'image_id': 416451, 'size': [480, 640]}, 'bbox': [479, 83, 160, 391], 'category_id': 84, 'area': 33103}
overwrite
{'iscrowd': 1, 'id': 908400416500.0, 'image_id': 416534, 'size': [355, 640]}, 'bbox': [271, 169, 112, 87], 'category_id': 84, 'area': 4555}
anno id duplicates
{'iscrowd': 1, 'id': 900100520700.0, 'image_id': 520707, 'size': [480, 640]}, 'bbox': [235, 142, 277, 41], 'category_id': 1, 'area': 843}
overwrite
{'iscrowd': 1, 'id': 900100520700.0, 'image_id': 520659, 'size': [480, 640]}, 'bbox': [113, 49, 526, 340], 'category_id': 1, 'area': 12581}
anno id duplicates
{'iscrowd': 1, 'id': 900100463500.0, 'image_id': 463542, 'size': [426, 640]}, 'bbox': [11, 0, 628, 158], 'category_id': 1, 'area': 16282}
overwrite
{'iscrowd': 1, 'id': 900100463500.0, 'image_id': 463522, 'size': [480, 640]}, 'bbox': [375, 205, 97, 58], 'category_id': 1, 'area': 1836}
anno id duplicates
{'iscrowd': 1, 'id': 900100193200.0, 'image_id': 193245, 'size': [375, 500]}, 'bbox': [358, 187, 55, 27], 'category_id': 1, 'area': 743}
overwrite
{'iscrowd': 1, 'id': 900100193200.0, 'image_id': 193181, 'size': [640, 426]}, 'bbox': [1, 0, 424, 163], 'category_id': 1, 'area': 39259}
anno id duplicates
{'iscrowd': 1, 'id': 900100259600.0, 'image_id': 259571, 'size': [281, 500]}, 'bbox': [62, 55, 307, 88], 'category_id': 1, 'area': 1113}
overwrite
{'iscrowd': 1, 'id': 900100259600.0, 'image_id': 259597, 'size': [312, 640]}, 'bbox': [0, 32, 579, 205], 'category_id': 1, 'area': 28188}
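A listing like the one above can be produced with a few lines of plain Python. The following is only a minimal sketch, not the script that was actually used: the function name `find_duplicate_annotation_ids` is hypothetical, and the toy dictionary stands in for the loaded JSON file, keeping only the fields needed for the check.

```python
from collections import defaultdict


def find_duplicate_annotation_ids(dataset):
    """Return {annotation id: [annotations sharing it]} for ids used more than once."""
    by_id = defaultdict(list)
    for ann in dataset["annotations"]:
        by_id[ann["id"]].append(ann)
    return {ann_id: anns for ann_id, anns in by_id.items() if len(anns) > 1}


# Toy stand-in for json.load(open("instances_minival2014.json")),
# reduced to the first duplicated id from the listing above.
toy = {
    "annotations": [
        {"id": 908400416500.0, "image_id": 416451, "category_id": 84},
        {"id": 908400416500.0, "image_id": 416534, "category_id": 84},
        {"id": 19234, "image_id": 416534, "category_id": 1},
    ]
}

dups = find_duplicate_annotation_ids(toy)
for ann_id, anns in dups.items():
    print(ann_id, [a["image_id"] for a in anns])
```

Running the same function on the real file only requires loading it with `json.load` first. Checking the raw JSON rather than a loaded index matters here because an index keyed by annotation id keeps only the last annotation for each duplicated id.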
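This can easily be fixed using val2014. Take the first case, for example: the two annotations that share id 908400416500.0 in minival have distinct ids in val2014, namely 908400416451 (image 416451) and 908400416534 (image 416534). In the session, cocox is loaded from val2014, and coco is presumably the minival file: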
ipdb> cocox = COCO('/home/csn/COCO/annotations/instances_val2014.json')
loading annotations into memory...
Done (t=3.43s)
creating index...
index created!
ipdb> cocox.getAnnIds(imgIds=416451)
[93604, 94468, 114096, 445644, 502194, 1111885, 1113479, 1496112, 1498357, 1504164, 1649397, 1656630, 1657898, 1658919, 1661224, 1661329, 1661380, 1662061, 1662157, 1662458, 1662692, 1663080, 1972038, 1974450, 1974795, 1974876, 2143961, 908400416451]
ipdb> cocox.getAnnIds(imgIds=416534)
[19234, 25117, 29326, 102024, 113562, 1650029, 1650625, 1654683, 1658309, 1658698, 1658758, 1660680, 1661532, 1661844, 1662281, 1662814, 1985230, 2142949, 908400416534]
ipdb> coco.getAnnIds(imgIds=416534)
[19234, 25117, 29326, 102024, 113562, 1650029, 1650625, 1654683, 1658309, 1658698, 1658758, 1660680, 1661532, 1661844, 1662281, 1662814, 1985230, 2142949, 908400416500.0]
ipdb> coco.getAnnIds(imgIds=416451)
[93604, 94468, 114096, 445644, 502194, 1111885, 1113479, 1496112, 1498357, 1504164, 1649397, 1656630, 1657898, 1658919, 1661224, 1661329, 1661380, 1662061, 1662157, 1662458, 1662692, 1663080, 1972038, 1974450, 1974795, 1974876, 2143961, 908400416500.0]
I corrected the duplicates (based on val2014); you can now download my corrected minival2014. See the readme for the correction process.
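For readers who want to reproduce a correction of this kind, here is one possible sketch. It is not the process described in the readme; it relies on an assumption suggested by the ipdb session above, namely that for each affected image the minival ids differ from the val2014 ids in exactly one entry. The function name `restore_duplicate_ids` is hypothetical, and the toy data uses a subset of the ids from the session.

```python
from collections import Counter, defaultdict


def restore_duplicate_ids(minival, val):
    """Replace duplicated ids in minival with the one val id missing for that image.

    Modifies minival in place and returns the number of annotations changed.
    """
    counts = Counter(ann["id"] for ann in minival["annotations"])

    # Annotation ids per image in each file.
    val_ids = defaultdict(set)
    for ann in val["annotations"]:
        val_ids[ann["image_id"]].add(ann["id"])
    mini_ids = defaultdict(set)
    for ann in minival["annotations"]:
        mini_ids[ann["image_id"]].add(ann["id"])

    fixed = 0
    for ann in minival["annotations"]:
        if counts[ann["id"]] < 2:
            continue  # id is unique, nothing to repair
        missing = val_ids[ann["image_id"]] - mini_ids[ann["image_id"]]
        if len(missing) == 1:  # only repair unambiguous cases
            ann["id"] = missing.pop()
            fixed += 1
    return fixed


def anns(image_id, ids):
    """Build minimal annotation dicts for one image (toy helper)."""
    return [{"id": i, "image_id": image_id} for i in ids]


val = {"annotations": anns(416451, [93604, 94468, 908400416451])
       + anns(416534, [19234, 25117, 908400416534])}
minival = {"annotations": anns(416451, [93604, 94468, 908400416500.0])
           + anns(416534, [19234, 25117, 908400416500.0])}

fixed = restore_duplicate_ids(minival, val)
print(fixed, [a["id"] for a in minival["annotations"]])
```

Images where more than one id is missing are left untouched, so any such case would still need to be resolved by hand, for example by comparing bounding boxes.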
It seems that instances_minival2014.json is missing annotations for some images, such as the one with id 226111: the first line is the image definition, and the second is an annotation for image 147740.
A quick script found ~60 images without annotations. Should I use the normal COCO annotations for these images? For the image with id 226111, I could only find captions in captions_val2014.json. Should I just skip these files? I am probably missing something very obvious here.
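The check described above could look roughly like the following. This is a sketch rather than the quick script that was actually run: the function name `images_without_annotations` is hypothetical, and the toy dictionary only mimics the two image ids mentioned in this comment.

```python
def images_without_annotations(dataset):
    """Return the sorted ids of images that no annotation refers to."""
    annotated = {ann["image_id"] for ann in dataset["annotations"]}
    return sorted(img["id"] for img in dataset["images"] if img["id"] not in annotated)


# Toy stand-in for json.load(open("instances_minival2014.json")).
toy_minival = {
    "images": [{"id": 226111}, {"id": 147740}],
    "annotations": [{"id": 1, "image_id": 147740}],
}

empty = images_without_annotations(toy_minival)
print(empty)
```

Looking up the resulting image ids in instances_val2014.json as well would show whether the annotations are missing only from the minival file or are absent from the full validation annotations too.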
I downloaded the minival annotations from the link found in the README.