cvat-ai / cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
https://cvat.ai
MIT License

API Import Annotations as Mask Shape - Polygon bug (Please help!!) #7618

Closed ejboomus closed 6 months ago

ejboomus commented 7 months ago

Is your feature request related to a problem? Please describe.

I am uploading semantic segmentation annotations through the API. There are two problems with this, and neither occurs when the annotations are uploaded through the GUI as masks instead of polygons.

1) There are gaps between labels, so not every pixel has a label. This can also be seen on the bitmap. In addition, the polygon masks do not extend to 3 of the 4 edges of the picture.
2) When uploaded through the API, the polygon masks do not have the "edit" feature, so they cannot easily be fixed with brush labels.

Describe the solution you'd like

I would like to be able to upload annotations as masks through the API (or to know how to do this, if it is already possible).

Describe alternatives you've considered

No response

Additional context

Bitmap of uploaded annotations. There should be no gaps in the annotations, and these gaps do not exist when they are uploaded as masks. You can see the edge of the image is also not covered.

image

zhiltsov-max commented 6 months ago

Hi! If by "uploading" via API you mean that you're uploading annotations in some specific format, then the UI doesn't do anything special here; all the processing is done by the server.

  1. There are gaps between labels, leading to not every pixel having a label. This can also be seen on the bitmap. As well, the polygon masks do not extend to 3/4 edges of the picture.

Basically, it's a natural property of polygons. Unlike masks, polygons can't be guaranteed to cover specific pixels; they are projected onto the image pixels by approximation.

  2. When uploaded through the API, the polygon masks do not have the "edit" feature so they cannot easily be fixed with brush labels

I'm not sure I understand what you mean here. Which "edit" feature are you referring to? Annotations, both masks and polygons, can be edited in the UI after they're uploaded to CVAT, the same way as if they had been imported via the UI.

You can convert polygons to masks and vice-versa, if needed. One way is to use the conversion feature in the UI, and another option is to do this with a tool like Datumaro before the annotations are uploaded into CVAT.

ejboomus commented 6 months ago

@zhiltsov-max Thank you for the help!

Apologies if what I posted was not clear. I am wondering if there is a way to upload annotations directly as masks without using the UI. Too much information is lost when they are uploaded as polygons. Thanks again!

zhiltsov-max commented 6 months ago

Sure, you can upload them in any of the formats supporting masks (CVAT for images/video, Datumaro, COCO, PASCAL VOC, Segmentation Mask, CamVid, ...) or directly as the UI does. You'll need to encode masks as tight bbox RLE. Then you can upload them as in this test.
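To make "tight bbox RLE" more concrete, here is a minimal sketch of a decoder for it. It assumes the layout described later in this thread: run lengths over the cropped bounding box, scanned row by row and starting with a background run, followed by left, top, right, bottom. The function name `cvat_points_to_mask` and the 0/255 output convention are illustrative choices, not part of CVAT.

```python
import numpy as np


def cvat_points_to_mask(points, img_h, img_w):
    """Decode CVAT-style mask points into a full-image uint8 mask (0/255).

    Assumed layout: [run lengths..., left, top, right, bottom], where the
    runs alternate background/foreground, start with background, and scan
    the bounding box row by row.
    """
    *rle, left, top, right, bottom = (int(p) for p in points)
    box_w = right - left + 1
    box_h = bottom - top + 1
    if sum(rle) != box_w * box_h:
        raise ValueError("run lengths do not cover the bounding box")

    # Even-indexed runs are background (0), odd-indexed runs are foreground (1).
    values = np.arange(len(rle)) % 2
    roi = np.repeat(values, rle).reshape(box_h, box_w).astype(np.uint8) * 255

    # Paste the decoded box back into an otherwise empty full-size mask.
    mask = np.zeros((img_h, img_w), dtype=np.uint8)
    mask[top:bottom + 1, left:right + 1] = roi
    return mask


# A 3-wide, 2-high box whose top-left pixel is background, the rest foreground.
example_points = [1, 5, 2, 1, 4, 2]
example_mask = cvat_points_to_mask(example_points, img_h=4, img_w=6)
```

Decoding your own encoded masks this way and comparing them with the source mask is a cheap check before uploading anything.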

ejboomus commented 6 months ago

Posting this in case someone needs it like I did. The context is trying to upload semantic segmentation annotations to CVAT using the API.

The RLE encoding appears to differ between platforms (e.g., CVAT and COCO RLE are slightly different). For the "points" field of the JSON, the value is cvat_rle['rle'] from the function below, followed by the cvat_rle values of left, top, right, and bottom (in that order).

Spent a couple days figuring that one out, hopefully it helps someone.

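from functools import reduce

import numpy as np


def binary_image_mask_to_cvat_rle(image: np.ndarray) -> dict:
    """Convert a COCO-style whole-image mask to a CVAT tight-object RLE."""
    # Foreground pixels are expected to have the value 255.
    istrue = np.argwhere(image == 255).transpose()
    if istrue.size == 0:
        raise ValueError("mask contains no foreground (255) pixels")

    # Tight bounding box around the foreground (inclusive coordinates).
    top = int(istrue[0].min())
    left = int(istrue[1].min())
    bottom = int(istrue[0].max())
    right = int(istrue[1].max())
    roi_mask = image[top:bottom + 1, left:right + 1]

    # Compute RLE values: alternating run lengths in row-major order,
    # starting with the background run (which may be 0).
    def reduce_fn(acc, v):
        if v == acc['val']:
            acc['res'][-1] += 1
        else:
            acc['val'] = v
            acc['res'].append(1)
        return acc

    roi_rle = reduce(
        reduce_fn,
        roi_mask.flat,
        {'res': [0], 'val': False}
    )['res']

    cvat_rle = {
        'rle': roi_rle,
        'top': top,
        'bottom': bottom,
        'left': left,
        'right': right,
        'width': right - left + 1,
        'height': bottom - top + 1,
    }

    return cvat_rle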
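
For anyone wiring this into a request, here is a rough sketch of turning such a dict into a mask shape and a request body. Only the "points" layout comes from this thread. The other field names (`type`, `frame`, `label_id`, `occluded`, `z_order`, `group`, `source`, `attributes`, and the `version`/`tags`/`shapes`/`tracks` wrapper) reflect my understanding of the CVAT annotations schema and should be checked against the API docs for your server version. The helper name `cvat_rle_to_mask_shape`, the frame number, and the label id are placeholders.

```python
import json


def cvat_rle_to_mask_shape(cvat_rle, frame, label_id):
    """Build one mask shape entry from a tight-bbox RLE dict (hypothetical helper)."""
    # "points" = run lengths followed by left, top, right, bottom (in that order).
    points = list(cvat_rle["rle"]) + [
        cvat_rle["left"], cvat_rle["top"], cvat_rle["right"], cvat_rle["bottom"],
    ]
    return {
        "type": "mask",            # assumed shape type name for masks
        "frame": frame,            # placeholder frame index
        "label_id": label_id,      # placeholder; use a real label id from your task
        "points": [int(p) for p in points],  # plain ints so json can serialize them
        "occluded": False,
        "z_order": 0,
        "group": 0,
        "source": "manual",
        "attributes": [],
    }


# Hand-written example: a 3x2 box with one background pixel, then five foreground.
example_rle = {"rle": [1, 5], "left": 2, "top": 1, "right": 4, "bottom": 2,
               "width": 3, "height": 2}
shape = cvat_rle_to_mask_shape(example_rle, frame=0, label_id=1)

# Assumed request body layout for an annotations upload.
body = {"version": 0, "tags": [], "shapes": [shape], "tracks": []}
payload = json.dumps(body)
```

The resulting `payload` string is what would be sent to the task or job annotations endpoint with an authenticated HTTP client; the request itself is left out here because the exact URL and auth depend on your deployment.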