cvat-ai / cvat

dataset export segmentation class foreground/background dependent. #3131

Closed: seblum closed this issue 10 months ago

seblum commented 3 years ago

I have two different labels per frame which overlap each other (bottle / liquid). I export the full dataset of a labeled video using "segmentation mask 1.1". The SegmentationClass folder shows the labeled classes. However, they depend on which label is in the foreground. For example, if I have the labeled bottle in the foreground and the liquid in the background, the segmentation class only shows the bottle, as it overlays the liquid. I can change the foreground/background per frame, but not overall.

Expected Behaviour

It would be good to be able to bring all labels of one type to the foreground or background, or to access this setting via an API.

Current Behaviour

If I have the labeled bottle in the foreground and the liquid in the background, the segmentation class only shows the bottle, as it overlays the liquid. I think this is due to the order in which the objects were labeled.

Context

I have about 5,000 frames, so checking everything manually is a huge amount of work.

nmanovic commented 3 years ago

@seblum, could you please explain your task? The CVAT UI allows you to filter data and validate it online. Why do you export data in the "segmentation mask" format? Another approach is to export data in the Datumaro format and use https://github.com/openvinotoolkit/datumaro to generate masks containing only one of these objects, as in the sketch below.
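
For instance, a minimal sketch of that Datumaro approach, using its XPath-style annotation filtering; the paths and the "bottle" label name are placeholders:

from datumaro.components.dataset import Dataset

# Load the task exported from CVAT in the Datumaro format
dataset = Dataset.import_from('export_dir/', format='datumaro')

# Keep only "bottle" annotations; drop frames left with no annotations
dataset.filter('/item/annotation[label="bottle"]',
               filter_annotations=True, remove_empty=True)

dataset.export('bottle_only/', format='voc_segmentation')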

After you explain your task, I will try to recommend a way to solve the problem.

TheoLiu31 commented 3 years ago

@nmanovic, I think I have the same issue.

When I annotate a video with lots of frames, I need to manually set which label should be in the background or foreground for every frame. If I don't do this, the segmentation masks are not always consistent.

[attached masks: 00000620, 00000625]

As you can see from these two masks, the first one shows the trees and cars correctly, while the second one does not, because the sidewalks have become the foreground. When I export my annotations to a segmentation mask, the trees and cars are partially erased by the sidewalks.

My question: is there a way to set certain labels to the background (roads, sidewalks, etc.) and others to the foreground (persons, cars, etc.) automatically?

Thanks, Théo

seblum commented 3 years ago

@nmanovic We can directly use the segmentation mask .png files for our model training. The issue is the same as the one @TheoLiu31 mentioned. Thank you for adding it to the backlog. Is there a current "hack" to resolve this without Datumaro?

zhiltsov-max commented 3 years ago

My question is there a way to set certain labels to background (roads, sidewalks etc..) and others to foreground (persons, cars, etc..) automatically?

It is not generally applicable because, for example, sometimes trees hide pedestrians and sometimes they do not. But if your task has some general rule, you can try the following approach.

Is there a current "hack" to resolve this without datumaro?

Actually, there is :smile:.

You can export the task in the Datumaro, CVAT, VOC, or COCO format. Then, using the API, you can set the z_order attribute on annotations. Rough code for this is the following:

from datumaro.components.dataset import Dataset
from datumaro.components.extractor import AnnotationType

# Pick the format you actually exported: 'datumaro', 'cvat', 'coco' or 'voc'
dataset = Dataset.import_from('path/', format='datumaro')

# Higher layers are drawn on top of lower ones
label_layers = {
  'background': 0,
  'road': 1,
  'sidewalk': 2,
  'tree': 3,
  # add your list here
}

def label_to_layer(label_id):
  if label_id is None:
    return 0

  label_cat = dataset.categories()[AnnotationType.label]
  label_name = label_cat[label_id].name
  # labels missing from the table fall back to the bottom layer
  return label_layers.get(label_name, 0)

for item in dataset:
  for ann in item.annotations:
    if ann.type in {AnnotationType.polygon, AnnotationType.mask, AnnotationType.bbox}:
      ann.z_order = label_to_layer(ann.label)

# Rasterize polygons, then write a VOC-style segmentation dataset
dataset.transform('polygons_to_masks')
dataset.export('path2/', format='voc_segmentation')
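
When the masks are merged, annotations with a higher z_order should be painted on top of lower ones, so overlapped classes such as the liquid or the trees are no longer erased. As a quick sanity check, you could inspect the class indices present in an exported mask (a sketch; the file name under SegmentationClass/ is hypothetical and depends on your frame names):

from PIL import Image
import numpy as np

# VOC-style exports store one indexed PNG per frame in SegmentationClass/
mask = np.array(Image.open('path2/SegmentationClass/00000620.png'))
print(np.unique(mask))  # indices of all classes that survived the merge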