cocodataset / cocoapi

COCO API - Dataset @ http://cocodataset.org/

How are object segmentations written in the dataset json files? #111

Open dexter1608 opened 6 years ago

dexter1608 commented 6 years ago

Hi, I have polygons made using the labelme annotation tool, which are written like this:

"points": [[258.69565217391306, 346.0869565217391], [252.60869565217394, 345.21739130434776], [245.6521739130435, 340.0], [243.04347826086956, 337.3913043478261], [239.56521739130437, 336.52173913043475], [232.60869565217394, 332.17391304347825], [231.73913043478262, 332.17391304347825], [222.17391304347825, 328.695652173913], [216.08695652173913, 320.86956521739125], [209.1304347826087, 319.13043478260863], [205.6521739130435, 318.2608695652174], [197.82608695652175, 315.6521739130435], [185.6521739130435, 315.6521739130435], [162.17391304347825, 308.695652173913]]

and in the COCO 2014 dataset the segmentations are written like this:

{"segmentation": [[239.97,260.24,222.04,270.49,199.84,253.41,213.5,227.79,259.62,200.46,274.13,202.17,277.55,210.71,249.37,253.41,237.41,264.51,242.54,261.95,228.87,271.34]]

I don't know the difference between the two. How can I convert my polygons into COCO-style segmentations?

Thank you in advance.

nightrome commented 6 years ago

Just convert your own polygon representation to a binary mask (one per polygon) and then convert the mask to the COCO polygon format. E.g. the following script converts .png files (with non-overlapping polygons) https://github.com/nightrome/cocoapi/blob/master/PythonAPI/cocostuff/pngToCocoResultDemo.py
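For the first step, a polygon can be rasterized into a binary mask without any extra dependencies. Below is a minimal even-odd (ray-casting) sketch in pure NumPy; `polygon_to_mask` is a hypothetical helper for illustration, not part of the cocoapi:

```python
import numpy as np

def polygon_to_mask(points, height, width):
    """Rasterize one labelme-style polygon ([[x, y], ...]) into a binary
    mask by toggling an even-odd crossing test at every pixel center."""
    pts = np.asarray(points, dtype=np.float64)
    xs, ys = pts[:, 0], pts[:, 1]
    yy, xx = np.mgrid[0:height, 0:width]          # pixel-center grid
    inside = np.zeros((height, width), dtype=bool)
    n = len(pts)
    for i in range(n):
        x1, y1 = xs[i], ys[i]
        x2, y2 = xs[(i + 1) % n], ys[(i + 1) % n]
        # edges that straddle the pixel row; toggle where the edge
        # crosses to the right of the pixel
        cond = (y1 <= yy) != (y2 <= yy)
        xcross = x1 + (yy - y1) * (x2 - x1) / (y2 - y1 + 1e-12)
        inside ^= cond & (xcross > xx)
    return inside.astype(np.uint8)
```

In practice `PIL.ImageDraw.polygon` or `skimage.draw.polygon` does the same job faster; the loop above is just to make the geometry explicit.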

priyanka-chaudhary commented 6 years ago

@nightrome : I am using my own dataset and I have annotated the images. My ground truth is an image of the same size, and for every pixel I have a number which is the class ID.

For example, for the Person class my ground truth image has pixel colour (1,1,1), same as the COCO dataset. My question is: if there are two people in an image, should both be annotated with colour (1,1,1), or is there a different rule? To show two instances of an object we need some kind of distinction. Do you know how they are annotated in the MS COCO dataset?

Any help would be really appreciated as I am not able to find this information anywhere. Thank you.

nightrome commented 6 years ago

That doesn't work. You cannot write instances (e.g. two persons) into an image and preserve that information. Take a look at the json format and then write your own script to create it: http://cocodataset.org/#download

priyanka-chaudhary commented 6 years ago

@nightrome I have checked the instances_val2017.json file. For example, image_id 1000 contains multiple people, and accordingly the file has more than one person annotation for it. Here is the relevant part:

{"segmentation": [[413.03,131.72,...,132.36]],"area": 1067.9535499999993,"iscrowd": 0,"image_id": 1000,"bbox": [405.93,120.42,37.13,45.52],"category_id": 1,"id": 1245349}

{"segmentation": [[277.59,392.37,...,380.01]],"area": 14372.617749999994,"iscrowd": 0,"image_id": 1000,"bbox": [265.33,95.86,88.92,315.88],"category_id": 1,"id": 1259139}

{"segmentation": [[281.11,418.74,...,423.72]],"area": 11511.253450000006,"iscrowd": 0,"image_id": 1000,"bbox": [209.23,174.64,99.63,249.08],"category_id": 1,"id": 1269164}

Then my question is: how did they get this from a pixel-wise annotated image? Because if you assign a different colour like (1,1,1), (2,2,2), (3,3,3) to each person in the image, then how do all of them end up with the same category_id in the json format?

And I need this because I want to use some categories of the MS COCO dataset and add a few of my own for my own dataset, so I want to have the same annotation format.

nightrome commented 6 years ago

The trick is to convert one object instance at a time from your format into a binary map and then into COCO polygon format. Also note that you don't need to make up an encoding into 3 rgb channels. You can just write images with a single channel.
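Concretely, if each object instance gets its own integer id in a single-channel label map, splitting it into one binary mask per instance is a one-liner per id. A sketch (the function name is made up for illustration):

```python
import numpy as np

def instance_masks(label_map):
    """Split a single-channel instance map (0 = background, 1..N = one id
    per object instance) into one binary mask per instance id."""
    ids = np.unique(label_map)
    return {int(i): (label_map == i).astype(np.uint8) for i in ids if i != 0}
```

Each returned mask can then be passed individually to the polygon/RLE conversion step.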

dexter1608 commented 6 years ago

@nightrome how do I convert polygons into a binary mask? Thank you

nightrome commented 6 years ago

https://github.com/nightrome/cocoapi/blob/master/PythonAPI/pycocotools/cocostuffhelper.py#L19

dexter1608 commented 6 years ago

@priyanka-chaudhary hi, how did you make this type of "segmentation": [[.....]]?

{"segmentation": [[413.03,131.72,...,132.36]],"area": 1067.9535499999993,"iscrowd": 0,"image_id": 1000,"bbox": [405.93,120.42,37.13,45.52],"category_id": 1,"id": 1245349}
priyanka-chaudhary commented 6 years ago

@dexter1608 : I am trying to do that from masks. Not figured out yet.

priyanka-chaudhary commented 6 years ago

@nightrome : You mentioned converting one object instance at a time from my format into a binary map and then into COCO polygon format. I have extracted the binary map, but how do I get the COCO polygon format from it?

When I passed the binary mask to segmentationToCocoMask() I got this output:

[{'image_id': 9, 'category_id': 91, 'segmentation': {'counts': b'ga0:f8=dGC\\7\\1VMj20000M30000000000000000001O0000000000001O0000000O100001O00000000000000000000000000000000000000000000000001O000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000O11O00000NaKWK_4i4200000000001O1O0000000000000O1000000000000nKcKU3]4gLhKX3X4bLQL[3W5I5K3[MeIS2]6eMiI[2d60000000000000000000000000000000000O10000000000000000000000001O1O00000O100000000000000000000000000000000000000lM`M\\M`2d2dMXM\\2h2gMUMY2k2jMRMV2n2nMmLS2S3QNbLAjN^2d4bNcKS2]4nM^KW2a4Z100000000000000000000001O000000000000000000O1000000000000000000000000000000000000000000O0100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000O100000000000000000000000001N10000000000000000000000000000000000000000000000O1000O10000000000000000000000000000000000000000000000000000000000000000N2000000H;Gd0]L^JX2g6TN^>', 'size': [300, 416]}}]

nightrome commented 6 years ago

Looks good to me! From the COCO page:

COCO provides segmentation masks for every object instance. This creates two challenges: storing masks compactly and performing mask computations efficiently. We solve both challenges using a custom Run Length Encoding (RLE) scheme. The size of the RLE representation is proportional to the number of boundary pixels of a mask, and operations such as area, union, or intersection can be computed efficiently directly on the RLE. Specifically, assuming fairly simple shapes, the RLE representation is O(√n) where n is the number of pixels in the object, and common computations are likewise O(√n). Naively computing the same operations on the decoded masks (stored as an array) would be O(n).

The MASK API provides an interface for manipulating masks stored in RLE format. The API is defined below, for additional details see: MaskApi.m, mask.py, or MaskApi.lua. Finally, we note that a majority of ground truth masks are stored as polygons (which are quite compact), these polygons are converted to RLE when needed.

The RLE encoded polygons can be used (more or less) interchangeably with the more readable type of polygon annotation above.

priyanka-chaudhary commented 6 years ago

@nightrome : So can I use this above result to append to the instances_train2017.json and/or instances_val2017.json file(s) and use for training/validation? Thanks a lot for all your help.

nightrome commented 6 years ago

Yes, we do that for the COCO-Stuff annotations, but don't forget the other fields (area, iscrowd, bbox, id).
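The pycocotools mask API exposes `mask.area` and `mask.toBbox` for RLE-encoded segmentations, but the same two fields can also be derived straight from the binary mask before encoding. A minimal NumPy sketch (the helper name is illustrative; note COCO's bbox convention is [x, y, width, height]):

```python
import numpy as np

def mask_to_bbox_area(mask):
    """Derive the COCO 'bbox' ([x, y, w, h]) and 'area' annotation
    fields from a binary instance mask."""
    ys, xs = np.where(mask)
    x0, y0 = xs.min(), ys.min()
    w = xs.max() - x0 + 1
    h = ys.max() - y0 + 1
    # for a mask annotation, area is simply the foreground pixel count
    return [float(x0), float(y0), float(w), float(h)], int(mask.sum())
```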

priyanka-chaudhary commented 6 years ago

@nightrome : Yes, I got the area and bbox from mask.py using the encoded segmentation, but I don't understand how the id is generated. Is there a process for that too?

nightrome commented 6 years ago

AFAIK crowd-sourced annotations have a big id and regular annotations a small id. But it should not matter as long as there are no duplicates.

priyanka-chaudhary commented 6 years ago

@nightrome : Is there any way to find out which ids are already used, or any documentation of it? The files are quite huge, so I am not sure how to find that information. I am not using crowd-sourced annotations.

nightrome commented 6 years ago

Generally speaking the search function is your friend. The codebase is not really big. I think it is only used as a unique index.

dexter1608 commented 6 years ago

@nightrome in the link you've given, I am failing to understand which function to use for converting a polygon to a binary mask.

priyanka-chaudhary commented 6 years ago

@nightrome : Thank you for the support. I found a way to do it.

nightrome commented 6 years ago

@dexter1608 Oh, I assumed that you already had binary masks. I don't think such a function exists here. You could look for an external function like https://de.mathworks.com/help/images/ref/poly2mask.html Otherwise, try to figure out how COCO polygons work. From the comments in mask.py:

#  poly    - Polygon stored as [[x1 y1 x2 y2...],[x1 y1 ...],...] (2D list)
# Both poly and bbs are 0-indexed (bbox=[0 0 1 1] encloses first pixel).
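Given that layout, labelme's nested [x, y] pairs only need flattening into one list per polygon to match COCO's segmentation field. A tiny sketch (the function name is invented for illustration):

```python
def labelme_points_to_coco_poly(points):
    """Flatten labelme's [[x, y], ...] point pairs into the flat
    [x1, y1, x2, y2, ...] list COCO uses for a single polygon."""
    return [float(c) for xy in points for c in xy]
```

The resulting list would then be wrapped once more, as "segmentation": [poly], since an object may consist of several polygons.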
raninbowlalala commented 6 years ago

@priyanka-chaudhary How did you get the area, bbox from mask.py?

priyanka-chaudhary commented 6 years ago

@raninbowlalala : There are functions defined already in mask.py to get bbox and area from segmentation.

From here

raninbowlalala commented 6 years ago

@priyanka-chaudhary Thank you very much. A further question, how did you generate the id? Which number should be selected to be the first number?

priyanka-chaudhary commented 6 years ago

@raninbowlalala : I extracted all the ids from train and validation files and sorted them. The highest value I found for id is: 2232119. You can take any number after that.
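That scan can be scripted directly against the annotation files; a sketch that returns the next unused annotation id (the helper name is made up):

```python
import json

def next_free_ann_id(*json_paths):
    """Find the largest annotation id across the given COCO json files,
    so newly added annotations can start just above it."""
    max_id = 0
    for path in json_paths:
        with open(path) as f:
            data = json.load(f)
        max_id = max(max_id, max(a["id"] for a in data["annotations"]))
    return max_id + 1
```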

raninbowlalala commented 6 years ago

@priyanka-chaudhary Thanks for your help!

raninbowlalala commented 6 years ago

@nightrome I used the COCO-Stuff annotation code to convert my own dataset to COCO format, and added "area", "bbox", and "id" to the json file. When fine-tuning End-to-End_X-101-64x4d-FPN.pkl on my own dataset, I got the error below: E0227 16:22:45.168256 15153 pybind_state.h:422] Exception encountered running PythonOp function: ValueError: could not convert string to float: c

At:
/home/liufang/detectron_venv/detectron/lib/utils/segms.py(134): polys_to_boxes
/home/liufang/detectron_venv/detectron/lib/roi_data/mask_rcnn.py(46): add_mask_rcnn_blobs
/home/liufang/detectron_venv/detectron/lib/roi_data/fast_rcnn.py(207): _sample_rois
/home/liufang/detectron_venv/detectron/lib/roi_data/fast_rcnn.py(112): add_fast_rcnn_blobs
/home/liufang/detectron_venv/detectron/lib/ops/collect_and_distribute_fpn_rpn_proposals.py(60): forward

terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at pybind_state.h:423] . Exception encountered running PythonOp function: ValueError: could not convert string to float: c

It seems like the expected segmentation format is polygons, but my dataset uses the RLE format. Have you had this problem?

nightrome commented 6 years ago

Sorry, but I am not familiar with that code. I suggest you contact the authors or look at how they parse the .json file. Or you save it as raw polygons instead of RLE format polygons.

raninbowlalala commented 6 years ago

@nightrome OK, thanks for your help.

raninbowlalala commented 6 years ago

@nightrome Do you know how to convert mask binary image to raw polygons? Or convert RLE format to polygons format? Thanks a lot.

nightrome commented 6 years ago

Unfortunately not, I have never worked with polygons :-(

raninbowlalala commented 6 years ago

I found an issue that helped me convert a binary mask image to polygon format. Hope this answer will help other people. https://github.com/facebookresearch/Detectron/issues/100

lscelory commented 6 years ago

@priyanka-chaudhary I got the same output as you after running segmentationToCocoMask(). I don't know what it means or how I can convert that string to polygon format. Do you have a solution? Looking forward to your reply! Thanks a lot!

nightrome commented 6 years ago

@lscelory: The segmentationToCocoMask function is not part of this repository, but actually of the COCO-Stuff repository. Could you repost your question there with more information on what you are trying to achieve? I am losing track of the conversation here.

priyanka-chaudhary commented 6 years ago

@lscelory : You can use it directly; you don't need to convert it to polygon format. At least that is how I used it, so I am not aware of any method to convert it to polygon format.

waspinator commented 6 years ago

@nightrome do you know where I can find details on COCO's "custom Run Length Encoding (RLE) scheme", or how to convert a mask to an uncompressed RLE representation that COCO would work with? I need it to create .json files with iscrowd=1 annotations.

nightrome commented 6 years ago

@waspinator: Not sure why you would create an uncompressed RLE. For compressed RLE have a look at the COCO-Stuff repository. If you have any more questions please open a ticket there.

waspinator commented 6 years ago

@nightrome for example when you download annotations_trainval2017.zip from http://cocodataset.org/#download, all the "iscrowd": 1 annotations are recorded as uncompressed RLE. I want to create my own COCO style dataset, and need to maintain the original style.

{
    "segmentation": {
        "counts": [
            272,2,4,4,4,4,2,9,1,2,16,43,143,24,5,8,16,44,141,25,8,5,17,44,140,26,10,2,17,45,129,4,5,27,24,5,1,45,127,38,23,52,125,40,22,53,123,43,20,54,122,46,18,54,121,54,12,53,119,57,11,53,117,59,13,51,117,59,13,51,117,60,11,52,117,60,10,52,118,60,9,53,118,61,8,52,119,62,7,52,119,64,1,2,2,51,120,120,120,101,139,98,142,96,144,93,147,90,150,87,153,85,155,82,158,76,164,66,174,61,179,57,183,54,186,52,188,49,191,47,193,21,8,16,195,20,13,8,199,18,222,17,223,16,224,16,224,15,225,15,225,15,225,15,225,15,225,15,225,15,225,15,225,15,225,14,226,14,226,14,39,1,186,14,39,3,184,14,39,4,183,13,40,6,181,14,39,7,180,14,39,9,178,14,39,10,177,14,39,11,176,14,38,14,174,14,36,19,171,15,33,32,160,16,30,35,159,18,26,38,158,19,23,41,157,20,19,45,156,21,15,48,156,22,10,53,155,23,9,54,154,23,8,55,154,24,7,56,153,24,6,57,153,25,5,57,153,25,5,58,152,25,4,59,152,26,3,59,152,26,3,59,152,27,1,60,152,27,1,60,152,86,154,80,160,79,161,42,8,29,161,41,11,22,2,3,161,40,13,18,5,3,161,40,15,2,5,8,7,2,161,40,24,6,170,35,30,4,171,34,206,34,41,1,164,34,39,3,164,34,37,5,164,34,35,10,161,36,1,3,28,17,155,41,27,16,156,41,26,17,156,41,26,16,157,27,4,10,25,16,158,27,6,8,11,2,12,6,2,7,159,27,7,14,3,4,19,6,160,26,8,22,18,5,161,26,8,22,18,4,162,26,8,23,15,4,164,23,11,23,11,7,165,19,17,22,9,6,167,19,22,18,8,3,170,18,25,16,7,1,173,17,28,15,180,17,30,12,181,16,34,6,184,15,225,14,226,13,227,12,228,11,229,10,230,9,231,9,231,9,231,9,231,8,232,8,232,8,232,8,232,8,232,8,232,7,233,7,233,7,233,7,233,8,232,8,232,8,232,9,231,9,231,9,231,10,230,10,230,11,229,13,227,14,226,16,224,17,223,19,221,23,217,31,3,5,201,39,201,39,201,39,201,39,201,39,201,40,200,40,200,41,199,41,199,41,199,22,8,12,198,22,12,8,198,22,14,6,198,22,15,6,197,22,16,5,197,22,17,5,196,22,18,4,196,22,19,4,195,22,19,5,194,22,20,4,194,25,21,1,193,27,213,29,211,30,210,35,6,6,193,49,191,50,190,50,190,51,189,51,189,52,188,53,187,53,187,54,186,54,186,54,186,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,28,1,26
,185,23,11,21,185,20,17,17,186,18,21,15,186,16,23,14,187,14,25,14,187,14,26,12,188,14,28,10,188,14,226,14,226,16,224,17,223,19,221,20,220,22,218,24,18,3,12,3,180,25,10,1,4,6,10,6,178,28,7,12,8,8,177,49,3,12,176,65,175,67,173,69,171,53,3,14,170,37,20,9,4,1,169,36,21,8,175,35,22,7,176,34,23,7,176,34,23,6,177,35,22,6,177,35,22,8,175,35,23,9,173,35,205,36,204,39,201,43,197,48,36,1,155,48,35,3,154,49,33,5,154,48,32,6,155,49,27,10,155,51,24,11,154,54,21,11,155,56,19,11,155,56,18,11,156,56,17,11,157,56,16,12,157,56,14,13,159,56,12,13,160,61,5,14,162,78,165,75,167,73,168,72,170,70,171,69,173,67,176,64,179,61,182,58,183,57,185,54,187,53,188,51,191,49,192,47,195,45,196,43,198,42,199,40,201,38,203,36,205,34,207,32,210,28,213,26,216,22,221,16,228,8,10250
        ],
        "size": [
            240,
            320
        ]
    }
}
nightrome commented 6 years ago

@waspinator: I am not sure how to write them, but they can definitely be read using the standard mask API. So imho there is no reason to store your annotations uncompressed. Just take whatever you have and compress it. Try to play around with these functions, that's all we have from Python.
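For reference, the uncompressed counts shown above follow a simple scheme: run lengths taken over the mask in column-major (Fortran) order, alternating between 0-runs and 1-runs, always starting with a 0-run (which is 0 if the mask begins with a foreground pixel). A pure-Python/NumPy sketch of both directions; these helpers are illustrative, not part of the official API:

```python
import numpy as np

def mask_to_uncompressed_rle(mask):
    """Encode a binary mask in COCO's uncompressed RLE: column-major
    run lengths, first count = number of leading zeros."""
    h, w = mask.shape
    flat = mask.flatten(order="F")
    counts, prev, run = [], 0, 0
    for v in flat:
        if v == prev:
            run += 1
        else:
            counts.append(run)   # emits a leading 0 if flat starts with 1
            prev, run = v, 1
    counts.append(run)
    return {"counts": counts, "size": [h, w]}

def uncompressed_rle_to_mask(rle):
    """Inverse of the above, useful for checking round trips."""
    h, w = rle["size"]
    flat = np.zeros(h * w, dtype=np.uint8)
    pos, val = 0, 0
    for c in rle["counts"]:
        flat[pos:pos + c] = val
        pos += c
        val = 1 - val
    return flat.reshape((h, w), order="F")
```

The compressed variant used for iscrowd=0 RLEs additionally packs these counts into the byte string produced by pycocotools' `mask.encode`.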

ambigus9 commented 6 years ago

@nightrome and @dexter1608 how do I convert my own polygon representation to a binary mask or to the segmentation format? Thanks!

waspinator commented 6 years ago

I wrote a library and article to help with creating COCO style datasets.

https://patrickwasp.com/create-your-own-coco-style-dataset/

vsd550 commented 6 years ago

Hello everyone, I am trying to perform object segmentation on images with very dense objects. The dataset consists of images and the bounding boxes around the objects present, but the bounding boxes are not in the (x,y,w,h) format; instead, all four corner coordinates are provided, i.e. (x1,y1,x2,y2,x3,y3,x4,y4). Can anyone suggest how I can generate the segmentation mask from this, as well as how I can convert the dataset to the COCO json file format for training Mask R-CNN? @dexter1608 @nightrome @priyanka-chaudhary @raninbowlalala

Thanks in advance
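Since all four corners are given, each box is already a valid COCO polygon: the eight values can be used directly as the segmentation, the axis-aligned min/max gives the bbox, and the shoelace formula gives the area. A sketch of one such annotation record (the helper name and the iscrowd default are assumptions):

```python
import numpy as np

def quad_to_coco(quad, image_id, category_id, ann_id):
    """Turn an 8-value corner box (x1,y1,...,x4,y4) into a COCO-style
    polygon annotation dict."""
    xs = np.array(quad[0::2], dtype=float)
    ys = np.array(quad[1::2], dtype=float)
    x0, y0 = xs.min(), ys.min()
    w, h = xs.max() - x0, ys.max() - y0
    # shoelace formula for the polygon's area
    area = 0.5 * abs(np.dot(xs, np.roll(ys, 1)) - np.dot(ys, np.roll(xs, 1)))
    return {"segmentation": [list(map(float, quad))],
            "bbox": [float(x0), float(y0), float(w), float(h)],
            "area": float(area), "iscrowd": 0,
            "image_id": image_id, "category_id": category_id, "id": ann_id}
```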

waspinator commented 6 years ago

@vsd550 did you try reading the link above? It doesn't cover your exact use case, but you should be able to write a converter for your format using it as a guide. For example, you could use http://scikit-image.org/docs/dev/api/skimage.draw.html#skimage.draw.polygon to convert your polygons to masks, and then use pycococreator as-is to generate the COCO-style dataset.

vsd550 commented 6 years ago

https://github.com/nightrome/cocoapi/blob/master/PythonAPI/pycocotools/cocostuffhelper.py#L19 In the above link as provided by @nightrome , what are the parameters to pass in the function segmentationToCocoMask(labelMap, labelId). I have created a binary mask of one object instance and now want the segmentation mask in the coco format. @priyanka-chaudhary

vsd550 commented 6 years ago

@waspinator thanks for the answer. After I generate the binary mask, I am using this function:

import numpy as np
from PIL import Image, ImageDraw

def polygons_to_mask(img_shape, polygons):
    mask = np.zeros(img_shape[:2], dtype=np.uint8)
    mask = Image.fromarray(mask)
    xy = list(map(tuple, polygons))
    ImageDraw.Draw(mask).polygon(xy=xy, outline=1, fill=1)
    mask = np.array(mask, dtype=bool)
    return mask

This function creates a binary mask from polygon coordinates, one polygon at a time. Now how can I proceed to create the png and use your pycococreator tool? Thanks

nirandiw commented 5 years ago

What is the type of your input polygons?

abhigoku10 commented 4 years ago

@nightrome @waspinator hey guys, I am having a strange issue: I get a mask mAP of 0 but correct bounding box mAP values. Here are the steps I performed:

  1. I annotated a single image file using labelme.
  2. Converted the annotation to COCO format.
  3. Used this image as the GT.
  4. Ran a Mask R-CNN on this same image to obtain the output and calculated mAP by passing the annotations as GT.
  5. Ideally, since I am using the same image, I should be getting both box and mask mAP of 100.
  6. But I am getting box as 100 and mask as 0.

shubham-dpai commented 3 years ago

I wrote a library and article to help with creating COCO style datasets.

https://patrickwasp.com/create-your-own-coco-style-dataset/

Thanks a lot @waspinator !!! It helped me in annotation separation for crowded and non-crowded regions as well as clean decoding of masked data.

LvJC commented 3 years ago

A mask from a png image's last (alpha) channel is a soft mask, which has good edge quality, and I think this is good for training. How can I convert a soft mask into a COCO format label? @nightrome @raninbowlalala @priyanka-chaudhary

Dicko87 commented 3 years ago

Hi guys, I want to do instance segmentation, and the model I am using expects the data in COCO 2017 format. Looking at the COCO 2017 json files for segmentation, I have noticed that some segmentations in the json file are written as bounding box coordinates rather than polygon coordinates. I was expecting polygon coordinates that would match up to the points that create the bitmap images. I guess the mask images, coupled with the bounding box that encompasses each mask, are enough information for the model to understand what is going on?

sanghamitrajohri commented 2 years ago

How do I convert the masks to polygon format? @nightrome @priyanka-chaudhary I have annotated the files with polygons using labelme and got an individual JSON file for each image. Now I want to convert all the JSON files into the COCO format, like the one given here: https://github.com/XinzeLee/PolygonObjectDetection/tree/main/coco20/annotations. So, is that repo valid for this?