dexter1608 opened 6 years ago
Just convert your own polygon representation to a binary mask (one per polygon) and then convert the mask to the COCO polygon format. For example, the following script converts .png files (with non-overlapping polygons): https://github.com/nightrome/cocoapi/blob/master/PythonAPI/cocostuff/pngToCocoResultDemo.py
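If it helps, here is a minimal sketch of the mask-to-COCO step in Python, assuming pycocotools is installed and using compressed RLE for the segmentation (the polygon variant is discussed further down). The file name, image_id and category_id are placeholders:

```python
import numpy as np
from PIL import Image
from pycocotools import mask as maskUtils

# Load one binary mask (one object instance per file); assumes a
# single-channel PNG where non-zero pixels belong to the object.
binary_mask = (np.array(Image.open('instance_0001.png')) > 0).astype(np.uint8)

# Encode as compressed RLE; pycocotools expects a Fortran-ordered uint8 array.
rle = maskUtils.encode(np.asfortranarray(binary_mask))
rle['counts'] = rle['counts'].decode('ascii')  # make it json-serializable

result = {'image_id': 1, 'category_id': 1, 'segmentation': rle}
```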
@nightrome : I am using my own dataset and I have annotated the images. My ground truth is an image of the same size, and for every pixel I have a number which is the class ID.
For example, for the Person class my ground-truth image has the pixel colour (1,1,1), the same as the COCO dataset. My question is: if there are two persons in an image, should both be annotated with the colour (1,1,1), or is there a different rule? To show two instances of an object we need some kind of distinction. Do you know how they are annotated in the MS COCO dataset?
Any help would be really appreciated, as I am not able to find this information anywhere. Thank you.
That doesn't work. You cannot write instances (e.g. two persons) into an image and preserve that information. Take a look at the json format and then write your own script to create it: http://cocodataset.org/#download
@nightrome I have checked the instances_val2017.json file. For example, for image_id 1000 there are multiple persons in the image, and in the file we have more than one person annotation. Here is the part:
{"segmentation": [[413.03,131.72,...,132.36]],"area": 1067.9535499999993,"iscrowd": 0,"image_id": 1000,"bbox": [405.93,120.42,37.13,45.52],"category_id": 1,"id": 1245349}
{"segmentation": [[277.59,392.37,...,380.01]],"area": 14372.617749999994,"iscrowd": 0,"image_id": 1000,"bbox": [265.33,95.86,88.92,315.88],"category_id": 1,"id": 1259139}
{"segmentation": [[281.11,418.74,...,423.72]],"area": 11511.253450000006,"iscrowd": 0,"image_id": 1000,"bbox": [209.23,174.64,99.63,249.08],"category_id": 1,"id": 1269164}
Then my question is: how did they get this from a pixel-wise annotated image? Because if you assign a different colour like (1,1,1), (2,2,2), (3,3,3) to each person in the image, then how do all of them end up with the same category_id in the json format?
And I need this because I want to use some categories of the MS COCO dataset and add a few of my own for my own dataset. So I want to have the same annotation format.
The trick is to convert one object instance at a time from your format into a binary map and then into the COCO polygon format. Also note that you don't need to make up an encoding into 3 RGB channels. You can just write images with a single channel.
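For illustration, a minimal sketch of pulling out one instance at a time from such a single-channel label image, assuming 0 marks the background:

```python
import numpy as np

# label_map: a single-channel array in which every instance has its own
# integer id and 0 is background (both are assumptions about your encoding).
def instance_masks(label_map):
    for instance_id in np.unique(label_map):
        if instance_id == 0:
            continue
        # One binary map per object instance.
        yield instance_id, (label_map == instance_id)
```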
@nightrome how do I convert polygons into a binary mask? Thank you
@priyanka-chaudhary hi, how did you make this type of "segmentation": [[ .....]]?
{"segmentation": [[413.03,131.72,...,132.36]],"area": 1067.9535499999993,"iscrowd": 0,"image_id": 1000,"bbox": [405.93,120.42,37.13,45.52],"category_id": 1,"id": 1245349}
@dexter1608 : I am trying to do that from masks. I haven't figured it out yet.
@nightrome : You mentioned converting one object instance at a time from your format into a binary map and then into the COCO polygon format. I have extracted the binary map, but how do I get the COCO polygon format from that? When I passed the binary mask to segmentationToCocoMask() I got this output:
[{'image_id': 9, 'category_id': 91, 'segmentation': {'counts': b'ga0:f8=dGC\\7\\1VMj20000M30000000000000000001O0000000000001O0000000O100001O00000000000000000000000000000000000000000000000001O000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000O11O00000NaKWK_4i4200000000001O1O0000000000000O1000000000000nKcKU3]4gLhKX3X4bLQL[3W5I5K3[MeIS2]6eMiI[2d60000000000000000000000000000000000O10000000000000000000000001O1O00000O100000000000000000000000000000000000000lM`M\\M`2d2dMXM\\2h2gMUMY2k2jMRMV2n2nMmLS2S3QNbLAjN^2d4bNcKS2]4nM^KW2a4Z100000000000000000000001O000000000000000000O1000000000000000000000000000000000000000000O0100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000O100000000000000000000000001N10000000000000000000000000000000000000000000000O1000O10000000000000000000000000000000000000000000000000000000000000000N2000000H;Gd0]L^JX2g6TN^>', 'size': [300, 416]}}]
Looks good to me! From the COCO page:
COCO provides segmentation masks for every object instance. This creates two challenges: storing masks compactly and performing mask computations efficiently. We solve both challenges using a custom Run Length Encoding (RLE) scheme. The size of the RLE representation is proportional to the number of boundary pixels of a mask, and operations such as area, union, or intersection can be computed efficiently directly on the RLE. Specifically, assuming fairly simple shapes, the RLE representation is O(√n) where n is the number of pixels in the object, and common computations are likewise O(√n). Naively computing the same operations on the decoded masks (stored as an array) would be O(n).
The MASK API provides an interface for manipulating masks stored in RLE format. The API is defined below, for additional details see: MaskApi.m, mask.py, or MaskApi.lua. Finally, we note that a majority of ground truth masks are stored as polygons (which are quite compact), these polygons are converted to RLE when needed.
The RLE encoded polygons can be used (more or less) interchangeably with the more readable type of polygon annotation above.
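For illustration, a small self-contained example of those RLE operations with the Python Mask API (pycocotools.mask):

```python
import numpy as np
from pycocotools import mask as maskUtils

# Two toy binary masks, encoded as compressed RLE.
a = np.zeros((100, 100), dtype=np.uint8, order='F')
a[10:50, 10:50] = 1
b = np.zeros((100, 100), dtype=np.uint8, order='F')
b[30:70, 30:70] = 1
rle_a, rle_b = maskUtils.encode(a), maskUtils.encode(b)

print(maskUtils.area(rle_a))                     # area, computed on the RLE
print(maskUtils.toBbox(rle_a))                   # [x, y, width, height]
union = maskUtils.merge([rle_a, rle_b])          # union, still in RLE form
inter = maskUtils.merge([rle_a, rle_b], intersect=True)
print(maskUtils.iou([rle_a], [rle_b], [0]))      # IoU directly on the RLEs
```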
@nightrome : So can I append this result to the instances_train2017.json and/or instances_val2017.json file(s) and use them for training/validation? Thanks a lot for all your help.
Yes, we do that for the COCO-Stuff annotations, but don't forget the other fields (area, iscrowd, bbox, id).
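A sketch of putting those fields together for one instance, using pycocotools for area and bbox; the helper name is made up here, and iscrowd=0 is assumed for ordinary (non-crowd) instances:

```python
import numpy as np
from pycocotools import mask as maskUtils

def to_coco_annotation(binary_mask, image_id, category_id, ann_id):
    # Compute area and bbox on the RLE before converting counts to a string.
    rle = maskUtils.encode(np.asfortranarray(binary_mask.astype(np.uint8)))
    area = float(maskUtils.area(rle))
    bbox = maskUtils.toBbox(rle).tolist()
    rle['counts'] = rle['counts'].decode('ascii')  # so json.dump accepts it
    return {
        'id': ann_id,              # must be unique across the whole dataset
        'image_id': image_id,
        'category_id': category_id,
        'segmentation': rle,
        'area': area,
        'bbox': bbox,              # [x, y, width, height]
        'iscrowd': 0,              # assuming ordinary instances
    }
```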
@nightrome : Yes, I got the area and bbox from mask.py using the encoded segmentation, but I don't understand how id is generated. Is there a process for that too?
AFAIK crowd-sourced annotations have a big id and regular annotations a small id. But it should not matter as long as there are no duplicates.
@nightrome : Is there any way to find out up to which value the ids are used, or any documentation of it? These are quite huge files, so I am not sure how to find that information. I am not using crowd-sourced annotations.
Generally speaking the search function is your friend. The codebase is not really big. I think it is only used as a unique index.
@nightrome In the link you've given, I am failing to understand which function to use to convert a polygon to a binary mask.
@nightrome : Thank you for the support. I found a way to do it.
@dexter1608 Oh I assumed that you would already have binary masks. I don't think such a function exists here. You could look for an external function like https://de.mathworks.com/help/images/ref/poly2mask.html Otherwise try to figure out how COCO polygons work. From the comments in mask.py:
# poly - Polygon stored as [[x1 y1 x2 y2...],[x1 y1 ...],...] (2D list)
# Both poly and bbs are 0-indexed (bbox=[0 0 1 1] encloses first pixel).
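If you would rather stay inside the COCO API than use MATLAB's poly2mask, the Python Mask API can rasterize COCO-format polygons. A minimal sketch (the helper name is my own):

```python
from pycocotools import mask as maskUtils

# poly: one object's polygons in the COCO format quoted above,
# e.g. [[x1, y1, x2, y2, ...], ...]; h, w: image height and width.
def poly_to_mask(poly, h, w):
    rles = maskUtils.frPyObjects(poly, h, w)  # polygon(s) -> RLE(s)
    rle = maskUtils.merge(rles)               # merge the parts of one object
    return maskUtils.decode(rle)              # h x w uint8 binary mask
```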
@priyanka-chaudhary How did you get the area, bbox from mask.py?
@raninbowlalala : There are functions already defined in mask.py to get the bbox and area from a segmentation. From here.
@priyanka-chaudhary Thank you very much. A further question: how did you generate the id? Which number should be selected as the first one?
@raninbowlalala : I extracted all the ids from the train and validation files and sorted them. The highest value I found for id is 2232119. You can take any number after that.
@priyanka-chaudhary Thanks for your help!
@nightrome I used the COCO-Stuff annotations to convert my own dataset to the COCO format, and added "area", "bbox" and "id" to the json file. When finetuning End-to-End_X-101-64x4d-FPN.pkl on my own dataset, I got an error like the one below:
E0227 16:22:45.168256 15153 pybind_state.h:422] Exception encountered running PythonOp function: ValueError: could not convert string to float: c
At:
/home/liufang/detectron_venv/detectron/lib/utils/segms.py(134): polys_to_boxes
/home/liufang/detectron_venv/detectron/lib/roi_data/mask_rcnn.py(46): add_mask_rcnn_blobs
/home/liufang/detectron_venv/detectron/lib/roi_data/fast_rcnn.py(207): _sample_rois
/home/liufang/detectron_venv/detectron/lib/roi_data/fast_rcnn.py(112): add_fast_rcnn_blobs
/home/liufang/detectron_venv/detectron/lib/ops/collect_and_distribute_fpn_rpn_proposals.py(60): forward
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at pybind_state.h:423] . Exception encountered running PythonOp function: ValueError: could not convert string to float: c
It seems the code expects the segmentation format to be polygons, but my dataset uses the RLE format. Did you have this problem?
Sorry, but I am not familiar with that code. I suggest you contact the authors or look at how they parse the .json file. Or save your annotations as raw polygons instead of RLE-format polygons.
@nightrome OK, thanks for your help.
@nightrome Do you know how to convert a binary mask image to raw polygons? Or how to convert the RLE format to the polygon format? Thanks a lot.
Unfortunately not, I have never worked with polygons :-(
I found an issue that helped me convert a binary mask image to the polygon format. Hope this answer helps other people. https://github.com/facebookresearch/Detectron/issues/100
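For later readers: one common way to convert a binary mask to COCO-style polygons is to trace its contours, e.g. with OpenCV. A sketch, assuming the OpenCV 4 return signature of findContours:

```python
import cv2
import numpy as np

# binary_mask: h x w array with 1 inside the object, 0 elsewhere.
def mask_to_polygons(binary_mask):
    contours, _ = cv2.findContours(binary_mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    polygons = []
    for contour in contours:
        flat = contour.flatten().tolist()  # [x1, y1, x2, y2, ...]
        if len(flat) >= 6:                 # a valid polygon needs >= 3 points
            polygons.append(flat)
    return polygons
```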
@priyanka-chaudhary I got the same output as you after running segmentationToCocoMask(). I don't know what it means or how I can convert that string to the polygon format. Do you have a solution? Looking forward to your reply! Thanks a lot!
@lscelory: The segmentationToCocoMask function is not part of this repository, but actually of the COCO-Stuff repository. Could you repost your question there with more information on what you are trying to achieve? I am losing track of the conversation here.
@lscelory : You can use it directly; you don't need to convert it to the polygon format. At least that is how I used it, so I am not aware of any method to convert it to the polygon format.
@nightrome do you know where I can find details on COCO's custom Run Length Encoding (RLE) scheme, or how to convert a mask to an uncompressed RLE representation that COCO would work with? I need it to create .json files with iscrowd=1 annotations.
@waspinator: Not sure why you would create an uncompressed RLE. For compressed RLE have a look at the COCO-Stuff repository. If you have any more questions please open a ticket there.
@nightrome for example, when you download annotations_trainval2017.zip from http://cocodataset.org/#download, all the "iscrowd": 1 annotations are recorded as uncompressed RLE. I want to create my own COCO-style dataset and need to maintain the original style.
{
"segmentation": {
"counts": [
272,2,4,4,4,4,2,9,1,2,16,43,143,24,5,8,16,44,141,25,8,5,17,44,140,26,10,2,17,45,129,4,5,27,24,5,1,45,127,38,23,52,125,40,22,53,123,43,20,54,122,46,18,54,121,54,12,53,119,57,11,53,117,59,13,51,117,59,13,51,117,60,11,52,117,60,10,52,118,60,9,53,118,61,8,52,119,62,7,52,119,64,1,2,2,51,120,120,120,101,139,98,142,96,144,93,147,90,150,87,153,85,155,82,158,76,164,66,174,61,179,57,183,54,186,52,188,49,191,47,193,21,8,16,195,20,13,8,199,18,222,17,223,16,224,16,224,15,225,15,225,15,225,15,225,15,225,15,225,15,225,15,225,15,225,14,226,14,226,14,39,1,186,14,39,3,184,14,39,4,183,13,40,6,181,14,39,7,180,14,39,9,178,14,39,10,177,14,39,11,176,14,38,14,174,14,36,19,171,15,33,32,160,16,30,35,159,18,26,38,158,19,23,41,157,20,19,45,156,21,15,48,156,22,10,53,155,23,9,54,154,23,8,55,154,24,7,56,153,24,6,57,153,25,5,57,153,25,5,58,152,25,4,59,152,26,3,59,152,26,3,59,152,27,1,60,152,27,1,60,152,86,154,80,160,79,161,42,8,29,161,41,11,22,2,3,161,40,13,18,5,3,161,40,15,2,5,8,7,2,161,40,24,6,170,35,30,4,171,34,206,34,41,1,164,34,39,3,164,34,37,5,164,34,35,10,161,36,1,3,28,17,155,41,27,16,156,41,26,17,156,41,26,16,157,27,4,10,25,16,158,27,6,8,11,2,12,6,2,7,159,27,7,14,3,4,19,6,160,26,8,22,18,5,161,26,8,22,18,4,162,26,8,23,15,4,164,23,11,23,11,7,165,19,17,22,9,6,167,19,22,18,8,3,170,18,25,16,7,1,173,17,28,15,180,17,30,12,181,16,34,6,184,15,225,14,226,13,227,12,228,11,229,10,230,9,231,9,231,9,231,9,231,8,232,8,232,8,232,8,232,8,232,8,232,7,233,7,233,7,233,7,233,8,232,8,232,8,232,9,231,9,231,9,231,10,230,10,230,11,229,13,227,14,226,16,224,17,223,19,221,23,217,31,3,5,201,39,201,39,201,39,201,39,201,39,201,40,200,40,200,41,199,41,199,41,199,22,8,12,198,22,12,8,198,22,14,6,198,22,15,6,197,22,16,5,197,22,17,5,196,22,18,4,196,22,19,4,195,22,19,5,194,22,20,4,194,25,21,1,193,27,213,29,211,30,210,35,6,6,193,49,191,50,190,50,190,51,189,51,189,52,188,53,187,53,187,54,186,54,186,54,186,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,55,185,28,1,26,185,23,11,21,185,20,17,17,186,18,21,15,186,16,23,14,187,14,25,14,187,14,26,12,188,14,28,10,188,14,226,14,226,16,224,17,223,19,221,20,220,22,218,24,18,3,12,3,180,25,10,1,4,6,10,6,178,28,7,12,8,8,177,49,3,12,176,65,175,67,173,69,171,53,3,14,170,37,20,9,4,1,169,36,21,8,175,35,22,7,176,34,23,7,176,34,23,6,177,35,22,6,177,35,22,8,175,35,23,9,173,35,205,36,204,39,201,43,197,48,36,1,155,48,35,3,154,49,33,5,154,48,32,6,155,49,27,10,155,51,24,11,154,54,21,11,155,56,19,11,155,56,18,11,156,56,17,11,157,56,16,12,157,56,14,13,159,56,12,13,160,61,5,14,162,78,165,75,167,73,168,72,170,70,171,69,173,67,176,64,179,61,182,58,183,57,185,54,187,53,188,51,191,49,192,47,195,45,196,43,198,42,199,40,201,38,203,36,205,34,207,32,210,28,213,26,216,22,221,16,228,8,10250
],
"size": [
240,
320
]
}
}
@waspinator: I am not sure how to write them, but they can definitely be read using the standard mask API. So imho there is no reason to store your annotations uncompressed. Just take whatever you have and compress it. Try to play around with these functions, that's all we have from Python.
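Concretely, frPyObjects in pycocotools.mask accepts polygons, bounding boxes, and uncompressed RLE, so compressing an "iscrowd": 1 annotation like the one quoted above can be as short as:

```python
from pycocotools import mask as maskUtils

# uncompressed: an "iscrowd": 1 style RLE like the snippet above,
# i.e. {'counts': [272, 2, 4, ...], 'size': [240, 320]}.
def compress_rle(uncompressed):
    h, w = uncompressed['size']
    return maskUtils.frPyObjects(uncompressed, h, w)  # -> compressed RLE
```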
@nightrome and @dexter1608 How do I convert my own polygon representation to a binary mask, or to the segmentation format? Thanks!
I wrote a library and article to help with creating COCO style datasets.
Hello everyone, I am trying to perform object segmentation on images with very dense objects. The dataset consists of images and the bounding boxes around the objects, but the bounding boxes are not in the format (x,y,w,h); instead all four corner coordinates are provided, i.e. (x1,y1,x2,y2,x3,y3,x4,y4). Can anyone suggest how I can generate the segmentation mask from this, and how I can convert the dataset to the COCO json file format for training Mask R-CNN? @dexter1608 @nightrome @priyanka-chaudhary @raninbowlalala
Thanks in advance
@vsd550 did you try reading the link above? It doesn't cover your exact use case, but you should be able to write a converter for your format using it as a guide. For example, you could use http://scikit-image.org/docs/dev/api/skimage.draw.html#skimage.draw.polygon to convert your polygons to masks, and then use pycococreator as-is to generate the COCO-style dataset.
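A sketch of that first step with skimage.draw.polygon, assuming the eight numbers are the four corners in order around the box (the helper name is made up):

```python
import numpy as np
from skimage.draw import polygon

# corners: (x1, y1, x2, y2, x3, y3, x4, y4) for one box;
# img_shape: (height, width) of the image.
def corners_to_mask(corners, img_shape):
    xs, ys = corners[0::2], corners[1::2]
    mask = np.zeros(img_shape, dtype=np.uint8)
    rr, cc = polygon(ys, xs, shape=img_shape)  # rows from y, columns from x
    mask[rr, cc] = 1
    return mask
```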
https://github.com/nightrome/cocoapi/blob/master/PythonAPI/pycocotools/cocostuffhelper.py#L19 In the above link provided by @nightrome, what are the parameters to pass to the function segmentationToCocoMask(labelMap, labelId)? I have created a binary mask of one object instance and now want the segmentation mask in the COCO format. @priyanka-chaudhary
@waspinator thanks for the answer. After I generate the binary mask I am using the function:

```python
import numpy as np
from PIL import Image, ImageDraw

def polygons_to_mask(img_shape, polygons):
    # Rasterize one polygon, given as a list of (x, y) points,
    # into a boolean mask of shape img_shape[:2].
    mask = np.zeros(img_shape[:2], dtype=np.uint8)
    mask = Image.fromarray(mask)
    xy = list(map(tuple, polygons))
    ImageDraw.Draw(mask).polygon(xy=xy, outline=1, fill=1)
    mask = np.array(mask, dtype=bool)
    return mask
```
This function creates a binary mask from polygon coordinates, one polygon at a time. Now how can I proceed to create the png and use your pycococreator tool? Thanks
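In case it helps, one way to write that boolean mask out as a PNG (whether pycococreator expects 0/255 or 0/1 pixel values is worth checking against its documentation):

```python
import numpy as np
from PIL import Image

# mask: the boolean array returned by polygons_to_mask() above.
def save_mask_png(mask, path):
    Image.fromarray(mask.astype(np.uint8) * 255).save(path)
```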
What is the type of your input polygons?
@nightrome @waspinator hey guys, I am having a strange issue: I get a mask mAP of 0 but correct bounding-box mAP values. Here are the steps I have performed
Thanks a lot @waspinator !!! It helped me separate annotations for crowded and non-crowded regions, as well as cleanly decode the masked data.
A mask from a PNG image's last (alpha) channel is a soft mask, which has good quality at the edges, and I think this is good for training. How can I convert a soft mask into a COCO-format label? @nightrome @raninbowlalala @priyanka-chaudhary
Hi guys, I want to do instance segmentation, and the model I am using expects the data in the COCO 2017 format. Looking at the COCO 2017 json files for segmentation, I have noticed that the segmentations in the json file are written as bounding-box coordinates and not polygon coordinates. I was expecting polygon coordinates that would match up to the points that create the bitmap images. I guess the mask images, coupled with the bounding box that encompasses the mask, are enough information for the model to understand what is going on?
How do I convert the masks to the polygon format? @nightrome @priyanka-chaudhary I have polygon-annotated the files with labelme and got an individual JSON file for each image. Now I want to convert all the JSON files into the COCO format, like the one given here: https://github.com/XinzeLee/PolygonObjectDetection/tree/main/coco20/annotations. So, is that repo valid for this?
hi, I have polygons made using the labelme annotation tool, which are written like this:
"points": [[258.69565217391306, 346.0869565217391], [252.60869565217394, 345.21739130434776], [245.6521739130435, 340.0], [243.04347826086956, 337.3913043478261], [239.56521739130437, 336.52173913043475], [232.60869565217394, 332.17391304347825], [231.73913043478262, 332.17391304347825], [222.17391304347825, 328.695652173913], [216.08695652173913, 320.86956521739125], [209.1304347826087, 319.13043478260863], [205.6521739130435, 318.2608695652174], [197.82608695652175, 315.6521739130435], [185.6521739130435, 315.6521739130435], [162.17391304347825, 308.695652173913]]
and in the COCO 2014 dataset the segmentations are written like this:
{"segmentation": [[239.97,260.24,222.04,270.49,199.84,253.41,213.5,227.79,259.62,200.46,274.13,202.17,277.55,210.71,249.37,253.41,237.41,264.51,242.54,261.95,228.87,271.34]]
I don't know the difference between them. How can I convert my polygons into COCO-style segmentation?
Thank you in advance.
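The difference is only in nesting: labelme stores a polygon as a list of [x, y] pairs, while COCO flattens each polygon into a single list. A minimal sketch of the conversion for the segmentation field (the helper name is made up):

```python
# labelme stores each polygon as [[x, y], [x, y], ...]; COCO stores it as a
# flat list of lists [[x1, y1, x2, y2, ...]].
def labelme_points_to_coco(points):
    return [[coord for point in points for coord in point]]
```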