dbolya / yolact

A simple, fully convolutional model for real-time instance segmentation.
MIT License

Output COCO-style JSON Annotation with decoded segmentation format #759

Open takwaaa opened 2 years ago

takwaaa commented 2 years ago

Hi @dbolya, thanks for such a useful model. I used the inference script eval.py to generate images with drawn masks and bboxes on single images, on folders, and on a custom test dataset. This worked fine. Now that I have the images, I want to use the inference results programmatically, so my goal is to export, alongside the generated images, a COCO-style JSON file of the annotation results (bounding boxes and segmentation).

I checked issue #230 and could use the dataset parameter as suggested, but now I have 2 questions:

After running the script eval.py as follows:

python eval.py --trained_model=weights/yolact_plus_resnet50_generated_dwg_3599_360000.pth --config yolact_resnet50_generated_dwg_config --output_coco_json --score_threshold=0.1 --top_k=30 --dataset=generated_Test_datasetfile

As output I got two files under results/:

  • mask_detections.json
  • bbox_detections.json

  1. Is this the correct output? Is there a way to get all the resulting annotations for the tested dataset in one single file rather than split across two files? (That is, all the annotation results, bbox and segmentation, would be written into the empty annotations array of the custom test dataset's annotations.json file.)

  2. I am getting the segmentation in encoded format:

    segmentation RLE format (encoded)

How can I decode it to get the segmentation in polygon format instead?

Thanks,
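On question 1, one possible approach is to pair the entries of the two result files after the fact. The sketch below is only an illustration under stated assumptions: both files hold one entry per detection in the same order, each entry has image_id, category_id and score keys, and bbox is [x, y, width, height]. The function names and the output path are hypothetical and not part of eval.py.

```python
import json


def merge_detections(bbox_dets, mask_dets, first_id=1):
    """Pair bbox and mask detections by position into COCO-style annotations.

    Assumption: both lists are in the same order, one entry per detection.
    """
    if len(bbox_dets) != len(mask_dets):
        raise ValueError("detection files differ in length")

    annotations = []
    for offset, (box, msk) in enumerate(zip(bbox_dets, mask_dets)):
        same = (box["image_id"] == msk["image_id"]
                and box["category_id"] == msk["category_id"])
        if not same:
            raise ValueError(f"entry {offset} does not describe the same detection")
        x, y, w, h = box["bbox"]
        annotations.append({
            "id": first_id + offset,
            "image_id": box["image_id"],
            "category_id": box["category_id"],
            "bbox": [x, y, w, h],
            "area": w * h,  # box area, not mask area
            "segmentation": msk["segmentation"],  # still RLE at this point
            "iscrowd": 0,
            "score": box["score"],
        })
    return annotations


def write_merged(bbox_path, mask_path, dataset_path, out_path):
    """Fill the 'annotations' array of a COCO dataset file with merged detections."""
    with open(bbox_path) as f:
        bbox_dets = json.load(f)
    with open(mask_path) as f:
        mask_dets = json.load(f)
    with open(dataset_path) as f:
        dataset = json.load(f)
    dataset["annotations"] = merge_detections(bbox_dets, mask_dets)
    with open(out_path, "w") as f:
        json.dump(dataset, f)


# Hypothetical usage (the output file name is made up for this example):
# write_merged("results/bbox_detections.json", "results/mask_detections.json",
#              "annotations.json", "merged_annotations.json")
```

The consistency check on image_id and category_id is there because pairing by position is an assumption; if it ever fails, the two files cannot be merged this way. The segmentation stays in RLE form here, so question 2 still applies to the merged file.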

iwaitu commented 2 years ago

I have the same question.

takwaaa commented 2 years ago

Hello, did you find a solution?

yilifan commented 2 years ago

Hello, did you find a solution?

IvanGarcia7 commented 2 years ago

I looked in eval.py for where the mask is encoded. It happens in these two lines:

    rle = pycocotools.mask.encode(np.asfortranarray(segmentation.astype(np.uint8)))
    rle['counts'] = rle['counts'].decode('ascii') # json.dump doesn't like bytes strings

where 'segmentation' is the numpy array that contains the mask. If you want to generate a JSON in COCO format, you will need to do the following:

    import json
    import numpy as np
    from pycocotools import mask
    from skimage import measure

    # 'segmentation' is the binary H x W mask from eval.py;
    # im_id, cat_id and ann_id must be supplied by the caller.
    ground_truth_binary_mask = segmentation.astype(np.uint8)
    fortran_ground_truth_binary_mask = np.asfortranarray(ground_truth_binary_mask)
    encoded_ground_truth = mask.encode(fortran_ground_truth_binary_mask)
    ground_truth_area = mask.area(encoded_ground_truth)
    ground_truth_bounding_box = mask.toBbox(encoded_ground_truth)
    contours = measure.find_contours(ground_truth_binary_mask, 0.5)

    annotation = {
        "segmentation": [],
        "area": ground_truth_area.tolist(),
        "iscrowd": 0,
        "image_id": im_id,
        "bbox": ground_truth_bounding_box.tolist(),
        "category_id": cat_id,
        "id": ann_id  # renamed from 'id', which shadows the Python builtin
    }

    for contour in contours:
        contour = np.flip(contour, axis=1)  # (row, col) -> (x, y)
        polygon = contour.ravel().tolist()  # do not overwrite the 'segmentation' mask
        annotation["segmentation"].append(polygon)

    print(json.dumps(annotation, indent=4))

I have not tested it but will do it in the following days. I hope that this answer will be helpful.

hafizur-r commented 2 years ago

Hi, I am new here. I am trying to understand your solution. Could you please help me? Do we need to modify these two lines with your provided code? Where should we use your code? Thanks in advance!

    rle = pycocotools.mask.encode(np.asfortranarray(segmentation.astype(np.uint8)))
    rle['counts'] = rle['counts'].decode('ascii') # json.dump doesn't like bytes strings

BinZhou-23 commented 1 year ago

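Hi @dbolya, thanks for such a useful model. I used the inference script eval.py to generate images with drawn masks and bboxes on single images, on folders, and on a custom test dataset. This worked fine. Now that I have the images, I want to use the inference results programmatically, so my goal is to export, alongside the generated images, a COCO-style JSON file of the annotation results (bounding boxes and segmentation).

I checked issue #230 and could use the dataset parameter as suggested, but now I have 2 questions:

After running the script eval.py as follows:

python eval.py --trained_model=weights/yolact_plus_resnet50_generated_dwg_3599_360000.pth --config yolact_resnet50_generated_dwg_config --output_coco_json --score_threshold=0.1 --top_k=30 --dataset=generated_Test_datasetfile

As output I got two files under results/:

  • mask_detections.json
  • bbox_detections.json

  1. Is this the correct output? Is there a way to get all the resulting annotations for the tested dataset in one single file rather than split across two files? (That is, all the annotation results, bbox and segmentation, would be written into the empty annotations array of the custom test dataset's annotations.json file.)
  2. I am getting the segmentation in encoded format:

    segmentation RLE format (encoded)

How can I decode it to get the segmentation in polygon format instead?

Thanks,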

Hi, how did you create your dataset? I have no idea how to go about this.