Add mapper for OID boxable data

ozendelait / rvc_devkit

Robust Vision Challenge Devkits

http://www.robustvision.net/

MIT License


Add mapper for OID boxable data #8

Closed: michaelisc closed this issue 4 years ago

michaelisc commented 4 years ago

I wrote a converter for Open Images a while ago; however, it was horribly slow because it opened every image to check the image size. I have now updated it to avoid opening every file, which makes the conversion much faster. The code can be found here: https://github.com/bethgelab/openimages2coco

This should provide most of what we need. The only thing still missing is the conversion of the category annotations to the joint labeling space.
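For context, the image size matters because Open Images stores box corners as fractions of the image width and height, while COCO expects `[x, y, width, height]` in pixels. Below is a minimal sketch of that step. The function name is hypothetical and not taken from the linked tool.

```python
def oid_box_to_coco(xmin, xmax, ymin, ymax, width, height):
    """Convert a normalized Open Images box to a COCO-style pixel box.

    xmin, xmax, ymin, ymax are fractions in [0, 1] (the Open Images
    convention); width and height are the image size in pixels.
    Returns [x, y, w, h] with (x, y) the top-left corner.
    """
    x = xmin * width
    y = ymin * height
    w = (xmax - xmin) * width
    h = (ymax - ymin) * height
    return [x, y, w, h]


# Usage: a box covering the lower middle of a 640x480 image.
example_box = oid_box_to_coco(0.25, 0.75, 0.5, 1.0, 640, 480)
```

Without the width and height, the conversion cannot be done, which is why the original version opened every image.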

akuznetso commented 4 years ago

For Open Images, it is important to include the image-level labels in the annotations, because the box annotations are only exhaustive for classes that have an image-level label. That is, if an image has neither a positive nor a negative label 'Cat', it may still contain an instance of 'Cat' that has no box.

michaelisc commented 4 years ago

So every image needs a list of positive and negative category ids?

akuznetso commented 4 years ago

Yes, exactly. Those are in the image-level label files (see here: https://storage.googleapis.com/openimages/web/challenge2019_downloads.html). If you would rather not do the work of merging them, you can just create separate COCO-style files with the image-level labels and let the participants do the merging. Those converters will then also apply to the instance segmentation track, where boxes and image-level labels are needed as well.

michaelisc commented 4 years ago

It makes sense to add them directly to the conversion pipeline. I'll add this to my tool.

michaelisc commented 4 years ago

I updated my tool to include the image-level labels: https://github.com/bethgelab/openimages2coco/commit/ec2af5e04be6ef5c3eaff86b8db3d2658c5632d0

Only human-verified labels (confidence = 0 or 1) are used. The naming convention follows LVIS.
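As a rough illustration of the filtering described above, here is a minimal sketch that groups image-level labels into positive and negative category ids per image. It assumes a label CSV with the columns `ImageID,Source,LabelName,Confidence`. The function names, the sample rows, the label-to-id mapping, and the `pos_category_ids` key are assumptions for illustration (`neg_category_ids` mirrors the LVIS field of that name); the linked commit may do this differently.

```python
import csv
import io
from collections import defaultdict


def collect_image_level_labels(csv_file, label_to_id):
    """Group human-verified image-level labels by image.

    Returns (pos, neg): dicts mapping ImageID -> set of category ids.
    Rows with a confidence other than 0 or 1 are skipped, as are labels
    that are missing from label_to_id.
    """
    pos = defaultdict(set)
    neg = defaultdict(set)
    for row in csv.DictReader(csv_file):
        cat_id = label_to_id.get(row["LabelName"])
        if cat_id is None:
            continue
        confidence = float(row["Confidence"])
        if confidence == 1.0:
            pos[row["ImageID"]].add(cat_id)
        elif confidence == 0.0:
            neg[row["ImageID"]].add(cat_id)
    return pos, neg


def attach_labels(image_entry, oid_image_id, pos, neg):
    """Add the per-image label lists to a COCO-style image entry (key names assumed)."""
    image_entry["pos_category_ids"] = sorted(pos.get(oid_image_id, ()))
    image_entry["neg_category_ids"] = sorted(neg.get(oid_image_id, ()))
    return image_entry


# Made-up sample data; the third row has a fractional confidence and is dropped.
SAMPLE_LABELS = (
    "ImageID,Source,LabelName,Confidence\n"
    "img1,verification,/m/01yrx,1\n"
    "img1,verification,/m/0bt9lr,0\n"
    "img1,machine,/m/0bt9lr,0.7\n"
    "img2,verification,/m/01yrx,0\n"
)
SAMPLE_LABEL_TO_ID = {"/m/01yrx": 1, "/m/0bt9lr": 2}

pos, neg = collect_image_level_labels(io.StringIO(SAMPLE_LABELS), SAMPLE_LABEL_TO_ID)
entry = attach_labels({"id": 1}, "img1", pos, neg)
```

Keeping the negatives alongside the positives lets an evaluator tell "verified absent" apart from "not annotated".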

akuznetso commented 4 years ago

Great, thanks, Claudio.

ozendelait commented 4 years ago

Great work, thanks! I think the image size check is the only thing that slows it down dramatically. @akuznetso / @rodrigob, could you add this meta information somewhere so the images don't have to be read for the conversion? Would you mind if we host a meta file with image_id, image_w, image_h in the rvc repo? @michaelisc: I think v6 of boxable has a different file name: https://storage.googleapis.com/openimages/v6/oidv6-train-annotations-bbox.csv
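A minimal sketch of how such a meta file could be consumed, assuming a CSV with a header row `image_id,image_w,image_h`. The column names come from the comment above; the header row, the `.jpg` file naming, the sequential integer ids, and both function names are assumptions for illustration.

```python
import csv
import io


def load_image_sizes(csv_file):
    """Read image_id,image_w,image_h rows into {image_id: (width, height)}."""
    reader = csv.DictReader(csv_file)
    return {
        row["image_id"]: (int(row["image_w"]), int(row["image_h"]))
        for row in reader
    }


def coco_image_entries(sizes):
    """Build COCO-style image entries without opening any image file."""
    entries = []
    # Sort by image id so the assigned integer ids are reproducible.
    for index, (image_id, (width, height)) in enumerate(sorted(sizes.items()), start=1):
        entries.append(
            {
                "id": index,
                "file_name": f"{image_id}.jpg",  # assumed naming scheme
                "width": width,
                "height": height,
            }
        )
    return entries


# Made-up sample rows standing in for the hosted meta file.
SAMPLE_SIZES = "image_id,image_w,image_h\nabc,1024,768\nxyz,640,480\n"
sizes = load_image_sizes(io.StringIO(SAMPLE_SIZES))
entries = coco_image_entries(sizes)
```

With the sizes in a lookup table, the converter only needs to read CSV files, not the images themselves.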

akuznetso commented 4 years ago

@ozendelait @michaelisc We are able to provide the image sizes (and a script to obtain them, if needed). Sent via email.

> I think v6 of boxable has a different file name:

You should use the challenge files. I updated the download script for OID detection with the correct file names for the ground-truth files and removed the download of the test set files.

ozendelait commented 4 years ago

Perfect, thanks @all; the new conversion is working for boxable!