openvinotoolkit / datumaro

Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.
https://openvinotoolkit.github.io/datumaro/
MIT License

Convert class and instance binary masks to COCO instances format #858

Closed maritum closed 1 year ago

maritum commented 1 year ago

Hi everyone!

I am wondering if there is a way to convert a dataset to the coco-instances format using class and instance masks.

The common_semantic_segmentation format accepts only a single mask, which makes it difficult to distinguish instances of the same class in an image. Is there a way to provide an instance mask together with a dictionary {instance_id: class_id}, or two binary masks, for dataset conversion?

Thank you!

vinnamkim commented 1 year ago

Hi @maritum,

Thanks for your interest in our project!

Sorry for the inconvenience, but there is no high-level API for this functionality. Instead, I created a simple Jupyter notebook example that solves your problem using Datumaro. Please refer to it here: https://github.com/vinnamkim/datumaro/blob/cs/make-coco-instance-mask/notebooks/08_assign_label.ipynb

P.S. The visualization part of the notebook will work after #860 is merged.
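For readers who cannot open the notebook, here is a minimal sketch of the idea raised in the question: splitting one instance-ID mask into per-instance binary masks and attaching the class from an {instance_id: class_id} dictionary. It is not the code from the linked notebook. The helper names split_instances and to_datumaro_masks, the background value 0, and the choice to reuse the instance ID as both id and group are assumptions made for illustration.

```python
import numpy as np


def split_instances(instance_mask, instance_to_class, background_id=0):
    """Yield (instance_id, class_id, binary_mask) for every non-background instance."""
    for instance_id in np.unique(instance_mask):
        instance_id = int(instance_id)
        if instance_id == background_id:
            continue  # assumption: pixels with this value belong to no instance
        class_id = instance_to_class[instance_id]
        binary_mask = (instance_mask == instance_id).astype(np.uint8)
        yield instance_id, class_id, binary_mask


def to_datumaro_masks(instance_mask, instance_to_class):
    """Wrap each instance in a dm.Mask annotation (requires Datumaro)."""
    import datumaro as dm  # imported lazily so split_instances works without it

    return [
        # Assumption: the instance ID is reused as annotation id and group
        dm.Mask(image=binary_mask, label=class_id, id=instance_id, group=instance_id)
        for instance_id, class_id, binary_mask in split_instances(
            instance_mask, instance_to_class
        )
    ]


# Toy instance mask: 0 is background, 1..3 are instance IDs
instance_mask = np.array(
    [
        [0, 1, 1],
        [2, 2, 0],
        [0, 3, 3],
    ]
)
instance_to_class = {1: 5, 2: 5, 3: 7}  # instances 1 and 2 share class 5

instances = list(split_instances(instance_mask, instance_to_class))
for instance_id, class_id, binary_mask in instances:
    print(instance_id, class_id, int(binary_mask.sum()))
```

The resulting annotations could then be attached to dataset items and exported in the coco_instances format. How the label indices map to category names depends on the categories you define for the dataset, which this sketch does not cover.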

maritum commented 1 year ago

Hi @vinnamkim! Thank you again for your help. It works great!

I am curious if you have any recommendations for memory optimization when using dm.Dataset.from_iterable() for large datasets. I am currently processing the data in chunks and saving each chunk as a temporary dataset, which I then merge. Is there built-in functionality for this in Datumaro?

Thank you in advance!
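The chunked workflow described above can be sketched roughly as follows. This is only an illustration of the chunking step using the standard library. The names iter_chunks and make_items are hypothetical, and the Datumaro calls that would build, save, and merge each temporary dataset are left as a comment because the thread does not show them.

```python
from itertools import islice


def iter_chunks(items, chunk_size):
    """Yield lists of at most chunk_size items without materializing the whole input."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def make_items(count):
    """Hypothetical stand-in for a generator that builds dataset items lazily."""
    for index in range(count):
        yield f"item_{index}"


for chunk_index, chunk in enumerate(iter_chunks(make_items(5), chunk_size=2)):
    # In the workflow described above, each chunk would be passed to
    # dm.Dataset.from_iterable(...) and saved as a temporary dataset here.
    print(chunk_index, chunk)
```

Only one chunk is held in memory at a time, so peak memory is bounded by the chunk size rather than by the size of the full dataset.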

vinnamkim commented 1 year ago

Hi @maritum,

Sorry, I don't get your point about memory optimization. Could you give me more details? I guess that if you built your dataset with dm.Dataset.from_iterable(), the main memory bottleneck would be Image or Mask. If you create Image from raw image data (e.g., Image(data=np.array(...))), it is recommended to create it from an image file path instead, e.g., Image(path=<path/to/image>), to save memory. On the other hand, Mask can be compressed with run-length encoding (RLE) as follows.

import numpy as np
import pycocotools.mask as mask_utils
import datumaro as dm

binary_mask = np.array(
    [
        [1, 1, 1],
        [1, 1, 0],
        [1, 0, 0],
    ], dtype=np.uint8, order="F"  # pycocotools expects a Fortran-ordered uint8 array
)

rle_binary_mask = mask_utils.encode(binary_mask)

# You can use dm.RleMask as the compressed version of dm.Mask
mask = dm.RleMask(
    id=0,
    group=0,
    rle=rle_binary_mask,
    label=0
)

This example shows how to use RleMask rather than using Mask.
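To make it clearer why RLE saves memory, here is a small pure-Python sketch of the uncompressed COCO-style run-length layout: the mask is flattened column by column, and the counts alternate between runs of 0s and runs of 1s, always starting with a run of 0s. The functions rle_encode and rle_decode are illustrative helpers, not part of pycocotools or Datumaro. Note that pycocotools' encode() returns the counts as a compressed byte string rather than a plain list like this.

```python
from itertools import groupby


def rle_encode(mask):
    """Return uncompressed COCO-style RLE for a 2D binary mask given as a list of rows."""
    height, width = len(mask), len(mask[0])
    # Flatten column by column (Fortran order)
    flat = [mask[row][col] for col in range(width) for row in range(height)]
    counts = []
    if flat[0] == 1:
        counts.append(0)  # counts always start with a (possibly empty) run of zeros
    counts.extend(len(list(run)) for _, run in groupby(flat))
    return {"size": [height, width], "counts": counts}


def rle_decode(rle):
    """Rebuild the 2D binary mask from the run lengths."""
    height, width = rle["size"]
    flat = []
    value = 0
    for count in rle["counts"]:
        flat.extend([value] * count)
        value = 1 - value  # runs alternate between 0 and 1
    return [[flat[col * height + row] for col in range(width)] for row in range(height)]


binary_mask = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
]
rle = rle_encode(binary_mask)
print(rle)  # {'size': [3, 3], 'counts': [0, 5, 1, 1, 2]}
```

For a mask this small the encoding is not shorter than the raw pixels, but for typical object masks with large uniform regions the number of runs is far smaller than the number of pixels.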

vinnamkim commented 1 year ago

Because there has been no response for a long time, I'll close this ticket.