matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

How to convert your dataset into an annotated format #155

Open dexter1608 opened 6 years ago

dexter1608 commented 6 years ago

Hi, I want to convert my dataset into an annotated format like MS COCO.

aolansili commented 6 years ago

All you need is labelImg.

dexter1608 commented 6 years ago

@aolansili How is it done? I mean, what is the process without using any external tool?

Liron-2 commented 6 years ago

But it converts it to Pascal VOC format, not MS COCO... I'm searching for a way to convert it to the MS COCO format.

topcomma commented 6 years ago

@dexter1608, I have converted my annotated dataset by extending the COCO categories (e.g. lane). First you can label polygons with the LabelMe tool from MIT, then generate the COCO JSON format by referring to COCOStuff. There are some tricks when converting to the COCO JSON format.

Hope the above info is useful to you.
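[Editor's note: a minimal sketch of the polygon-to-COCO step described above. It assumes the JSON layout of the Python labelme tool ("shapes" is a list of {"label", "points"}); the category id and file names are hypothetical, and the exact keys may differ for other LabelMe variants.]

```python
import json

# Hypothetical mapping from label names to COCO category ids;
# here "lane" is a new category appended after COCO's own ids.
CATEGORY_IDS = {"lane": 92}

def labelme_shapes_to_coco(labelme_file, image_id, start_ann_id):
    """Turn one LabelMe-style JSON file into COCO annotation entries."""
    with open(labelme_file) as f:
        data = json.load(f)
    annotations = []
    ann_id = start_ann_id
    for shape in data["shapes"]:
        # COCO stores a polygon as a flat list [x1, y1, x2, y2, ...]
        polygon = [coord for point in shape["points"] for coord in point]
        xs, ys = polygon[0::2], polygon[1::2]
        x, y = min(xs), min(ys)
        w, h = max(xs) - x, max(ys) - y
        annotations.append({
            "id": ann_id,
            "image_id": image_id,
            "category_id": CATEGORY_IDS[shape["label"]],
            "segmentation": [polygon],
            "bbox": [x, y, w, h],
            "area": w * h,  # rough; use pycocotools for the exact mask area
            "iscrowd": 0,
        })
        ann_id += 1
    return annotations
```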

dexter1608 commented 6 years ago

@topcomma Thanks for sharing. I wanted to ask: as you know, if we label images there will be a JSON file for each image, whereas in MS COCO there is a single JSON file for all images. Any suggestions on this?

topcomma commented 6 years ago

You can write Python code to convert each file to the COCO JSON format and append it to your COCO dataset. You can refer to the COCOStuff Python conversion code.
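[Editor's note: a minimal sketch of merging per-image annotation files into the single JSON file that MS COCO uses. The directory layout, file names, the labelme_shapes_to_coco() helper from the previous sketch, and the imagePath/imageHeight/imageWidth keys (from the Python labelme tool) are assumptions, not part of this repo.]

```python
import glob
import json
import os

# Skeleton of a single COCO-style dataset that per-image files get appended to
coco = {
    "info": {"description": "custom dataset"},
    "licenses": [],
    "categories": [{"id": 92, "name": "lane", "supercategory": "road"}],
    "images": [],
    "annotations": [],
}

ann_id = 1
for image_id, labelme_file in enumerate(sorted(glob.glob("labels/*.json")), 1):
    with open(labelme_file) as f:
        data = json.load(f)
    coco["images"].append({
        "id": image_id,
        "file_name": os.path.basename(data["imagePath"]),
        "height": data["imageHeight"],
        "width": data["imageWidth"],
    })
    # Reuse the per-file converter from the previous sketch,
    # keeping annotation ids unique across the whole dataset.
    anns = labelme_shapes_to_coco(labelme_file, image_id, ann_id)
    coco["annotations"].extend(anns)
    ann_id += len(anns)

with open("instances_custom.json", "w") as f:
    json.dump(coco, f)
```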

dexter1608 commented 6 years ago

Thanks a lot, pal. @topcomma

dexter1608 commented 6 years ago

@Liron-2 How did you convert your Pascal VOC format into the MS COCO format?

dexter1608 commented 6 years ago

@topcomma How do you generate the JSON format by referring to COCOStuff?

priyanka-chaudhary commented 6 years ago

@topcomma I am using my own dataset and I have annotated the images. My ground truth is an image of the same size, where every pixel holds a number, the class ID.

For example, for the Person class my ground truth image has pixel colour (1,1,1), the same as the COCO dataset. My question is: if there are two persons in an image, should both be annotated with colour (1,1,1), or is there a different rule? To show two instances of an object we need some kind of distinction. Do you know how they are annotated in the MS COCO dataset?

From this pixel-annotated image I want to convert to the JSON format. Thank you!

fastlater commented 6 years ago

@priyanka-chaudhary I was asking myself the same, since I used masks like you mentioned for semantic segmentation. However, in instance segmentation the masks are different. In issue #56 they mentioned that Mask R-CNN generates 28x28 float masks. As I understand it, each class has a different mask. According to the paper (figure 4), the mask shape is [28x28x80], since the 2014 COCO release contains segmentation masks for 80 categories. I checked the m variable in visualize.py:

```python
m = mask[:, :, np.where(class_ids == class_id)[0]]
m = np.sum(m * np.arange(1, m.shape[-1] + 1), -1)
```

and saved this variable's data into a CSV file. I got something like the image below. It shows 4 instances of the triangle class: every pixel of instance 1 (triangle 1) is labeled 1, pixels of the second triangle as 2, and so on. Background pixels are labeled 0.

[Image: mask_triangle — instance-labeled mask of four triangles]

I have recently been researching how to write the JSON file, and I still don't know whether these masks need to be saved in a folder or whether the polygon info stored inside the annotation section of the JSON file is enough to define the mask. Let us know if you discover something else; I will be happy to get any extra info about how to prepare the masks. We are trying to understand how to prepare the data in issue https://github.com/matterport/Mask_RCNN/issues/297.
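[Editor's note: a small sketch making the instance-labeling trick quoted above concrete. Mask R-CNN represents detections as a boolean [H, W, N] array, one channel per instance; the two lines from visualize.py collapse that into a single labeled map. The helper names here are made up for illustration.]

```python
import numpy as np

def instances_to_label_map(masks, class_ids, class_id):
    """masks: bool array [H, W, N]; keep only instances of class_id.

    Weighting channel i by (i + 1) and summing yields an [H, W] map where
    background is 0 and instance i is labeled i + 1, exactly the per-pixel
    numbering described for the four triangles. Assumes the instance
    masks do not overlap (overlapping pixels would sum to other values).
    """
    m = masks[:, :, np.where(class_ids == class_id)[0]]
    return np.sum(m * np.arange(1, m.shape[-1] + 1), -1)

def label_map_to_instances(label_map):
    """Inverse: split a labeled map back into per-instance binary masks."""
    instance_ids = np.unique(label_map)
    instance_ids = instance_ids[instance_ids != 0]  # drop background
    return np.stack([label_map == i for i in instance_ids], axis=-1)
```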

priyanka-chaudhary commented 6 years ago

@fastlater: Please refer to my and nightrome's comments in the following issue.

https://github.com/cocodataset/cocoapi/issues/111#issuecomment-369212170

I have converted the ground truth image to masks and stored them in a similar format to the COCO dataset.
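[Editor's note: a minimal sketch of storing one binary instance mask the way COCO does, using the RLE helpers from pycocotools (the official COCO API). The mask array here is a made-up example; in practice it would come from the ground truth image, e.g. label_map == instance_id.]

```python
import numpy as np
from pycocotools import mask as mask_utils

# Hypothetical 480x640 ground truth with one rectangular instance
binary_mask = np.zeros((480, 640), dtype=np.uint8)
binary_mask[100:200, 150:300] = 1

# encode() expects a Fortran-contiguous uint8 array
rle = mask_utils.encode(np.asfortranarray(binary_mask))
annotation = {
    # decode the bytes so the RLE is JSON-serializable
    "segmentation": {"size": rle["size"], "counts": rle["counts"].decode("ascii")},
    "area": float(mask_utils.area(rle)),
    "bbox": mask_utils.toBbox(rle).tolist(),  # [x, y, width, height]
    "iscrowd": 0,
}
```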

waspinator commented 6 years ago

I wrote a library and article to help with creating COCO style datasets.

https://patrickwasp.com/create-your-own-coco-style-dataset/

hanskrupakar commented 6 years ago

I've been working on a GUI-based widget to create annotations similar to the format used by MS COCO here: https://github.com/Deep-Magic/COCO-Style-Dataset-Generator-GUI.

In addition to creating masks for new datasets, you can use a pre-trained Mask R-CNN model from this repo to generate editable predicted masks, as a step towards annotation methods for instance segmentation that can hopefully scale to larger datasets and be much faster.

slothkong commented 6 years ago

@topcomma, by coincidence I used the same labeling tool as you, and now I'm trying to write a script to convert to the COCO JSON format. Would you mind sharing any source code you used to convert your annotations to COCO?

inders commented 6 years ago

@hanskrupakar How can I create my own person keypoint dataset with this tool? Any pointers/directions would be highly appreciated.

frankdeepl commented 5 years ago

> Thanks a lot, pal. @topcomma

Hi @dexter1608, could you share your solution with us? I tried to find a conversion from COCOStuff; however, its file format is .mat and it is not a single JSON file.

jsbroks commented 5 years ago

I've written tools to help create COCO datasets and convert between different formats.

wfeng66 commented 2 years ago

> You can write Python code to convert each file to the COCO JSON format and append it to your COCO dataset. You can refer to the COCOStuff Python conversion code.
>
> @dexter1608, I have converted my annotated dataset by extending the COCO categories (e.g. lane). First you can label polygons with the LabelMe tool from MIT, then generate the COCO JSON format by referring to COCOStuff. There are some tricks when converting to the COCO JSON format.
>
> Hope the above info is useful to you.

I am confused by the annotation formats. I see most of you talking about converting a dataset to the COCO format. We know the COCO format should look like:

```json
{
  "info": {
    "year": "2021",
    "version": "1.0",
    "description": "Exported from FiftyOne",
    "contributor": "Voxel51",
    "url": "https://fiftyone.ai",
    "date_created": "2021-01-19T09:48:27"
  },
  "licenses": [
    {
      "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
      "id": 1,
      "name": "Attribution-NonCommercial-ShareAlike License"
    },
    ...
  ],
  "categories": [
    ...
    {"id": 2, "name": "cat", "supercategory": "animal"},
    ...
  ],
  "images": [
    {
      "id": 0,
      "license": 1,
      "file_name": ".",
      "height": 480,
      "width": 640,
      "date_captured": null
    },
    ...
  ],
  "annotations": [
    {
      "id": 0,
      "image_id": 0,
      "category_id": 2,
      "bbox": [260, 177, 231, 199],
      "segmentation": [...],
      "area": 45969,
      "iscrowd": 0
    },
    ...
  ]
}
```

However, when I feed a sample image to the model, I get a result like this (the boolean mask arrays are truncated here for brevity):

```python
{'class_ids': array([ 1,  1,  1,  1,  1, 11,  1,  1,  3,  1, 27], dtype=int32),
 'masks': array([[[False, False, False, ..., False, False, False],
                  ...,
                  [False, False, False, ..., False, False, False]]]),  # bool [H, W, N]
 'rois': array([[ 28,  18, 324, 123],
                [ 60, 113, 321, 188],
                [ 78, 171, 309, 222],
                [ 48, 372, 323, 439],
                [ 72, 241, 298, 320],
                [213, 253, 300, 308],
                [ 62, 106, 108, 136],
                [133, 441, 157, 465],
                [176, 311, 220, 341],
                [ 62,  97, 282, 135],
                [178,  76, 222, 111]], dtype=int32),
 'scores': array([0.9996469 , 0.9994461 , 0.99928266, 0.9992362 , 0.9990871 ,
                  0.9971348 , 0.9135687 , 0.90791875, 0.85449725, 0.76860076,
                  0.7153612 ], dtype=float32)}
```

Totally different!

Moreover, if you download the balloon sample dataset and open via_region_data.json, you will find yet another format:

```json
{
  "34020010494_e5cb88e1c4_k.jpg1115004": {
    "fileref": "",
    "size": 1115004,
    "filename": "34020010494_e5cb88e1c4_k.jpg",
    "base64_img_data": "",
    "file_attributes": {},
    "regions": {
      "0": {
        "shape_attributes": {
          "name": "polygon",
          "all_points_x": [1020, 1000, 994, 1003, 1023, 1050, 1089, 1134, 1190, 1265, 1321, 1361, 1403, 1428, 1442, 1445, 1441, 1427, 1400, 1361, 1316, 1269, 1228, 1198, 1207, 1210, 1190, 1177, 1172, 1174, 1170, 1153, 1127, 1104, 1061, 1032, 1020],
          "all_points_y": [963, 899, 841, 787, 738, 700, 663, 638, 621, 619, 643, 672, 720, 765, 800, 860, 896, 942, 990, 1035, 1079, 1112, 1129, 1134, 1144, 1153, 1166, 1166, 1150, 1136, 1129, 1122, 1112, 1084, 1037, 989, 963]
        },
        "region_attributes": {}
      }
    }
  },
  ...
}
```

Would you mind telling me which one the model needs, and why they are so different?

Thanks!
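[Editor's note: the three formats differ because they serve different roles. The dict with 'rois', 'masks', 'class_ids', and 'scores' is the output of model.detect() at inference time, not an annotation format. For training, the model never reads a JSON file directly: you subclass utils.Dataset and implement load_mask() to return a bool [H, W, N] array, one channel per instance. Below is a sketch in the spirit of how samples/balloon/balloon.py rasterizes the VIA polygons; the helper name and the height/width values are hypothetical.]

```python
import json
import numpy as np
import skimage.draw

def via_regions_to_mask(via_annotation, height, width):
    """Rasterize the VIA polygons of one image into a bool [H, W, N] mask."""
    regions = via_annotation["regions"].values()
    mask = np.zeros((height, width, len(regions)), dtype=bool)
    for i, region in enumerate(regions):
        shape = region["shape_attributes"]
        # skimage.draw.polygon takes row (y) coordinates first
        rr, cc = skimage.draw.polygon(shape["all_points_y"], shape["all_points_x"])
        mask[rr, cc, i] = True
    return mask

# Hypothetical usage: height/width must come from the image itself,
# since the VIA file does not store them.
with open("via_region_data.json") as f:
    annotations = json.load(f)
first = next(iter(annotations.values()))
mask = via_regions_to_mask(first, height=1365, width=2048)
```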