Closed: cdpath closed this 5 months ago
Kinda busy at work recently. Will update soon.
@cdpath hi! do you have any updates?
@hogepodge please keep tracking this PR.
@cdpath we're trying to make the process for merging community feature requests easier. One thing that would help me a lot in moving this forward would be what we call "acceptance criteria." Essentially, when we hand this off to QA to determine if we can merge it, what is the expected behavior that we can test?
This is a much-requested feature, and we're very grateful for the patch. I want to help us move this along as best as I can.
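One way to phrase acceptance criteria as something QA can run is a structural check on the exported JSON. The sketch below is only an illustration: `check_coco_export` is a hypothetical helper, not part of the converter, and the rules it enforces (known image and category ids, a four-number bbox, a non-empty segmentation) are assumptions about what "expected behavior" could mean for this export.

```python
def check_coco_export(coco):
    """Return a list of problems found in a COCO-style export dict (hypothetical helper)."""
    problems = []
    image_ids = {image["id"] for image in coco.get("images", [])}
    category_ids = {category["id"] for category in coco.get("categories", [])}

    for annotation in coco.get("annotations", []):
        ann_id = annotation.get("id")
        # every annotation must point at an exported image and category
        if annotation.get("image_id") not in image_ids:
            problems.append(f"annotation {ann_id}: unknown image_id")
        if annotation.get("category_id") not in category_ids:
            problems.append(f"annotation {ann_id}: unknown category_id")
        # assumed rule: bbox is [x, y, width, height] with non-negative values
        bbox = annotation.get("bbox", [])
        if len(bbox) != 4 or any(value < 0 for value in bbox):
            problems.append(f"annotation {ann_id}: bbox must be four non-negative numbers")
        # assumed rule: brush and polygon exports carry a non-empty segmentation
        if not annotation.get("segmentation"):
            problems.append(f"annotation {ann_id}: empty segmentation")
    return problems
```

An empty list would mean the export passed these particular checks; the real criteria would still need to be agreed on in the PR.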
@hogepodge Sorry for the delay. I just pushed a small update as a workaround for when pycocotools is not available.
@cdpath Our QA team tried to set up pycocotools on Windows and ran into this problem:
What are your thoughts here? Any ideas? Maybe we can make pycocotools an optional extra (something like pip install label-studio-converter[pycocotools])?
Yeah, that's an option. Another approach might be to create a fork of pycocotools that includes wheels for Windows.
Do you know how to make it an extra package in pip? We have no bandwidth to support forks of pycocotools.
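For reference, an optional install of this kind is usually declared through the `extras_require` argument of setuptools and paired with a guarded import in the code. The sketch below assumes the extra is named `pycocotools`, as in the install command suggested in this thread; the flag name `pycocotools_imported` mirrors the one checked in the converter code posted later in the thread, and everything else is illustrative.

```python
# setup.py fragment (sketch): declare pycocotools as an optional extra so that
# `pip install label-studio-converter[pycocotools]` pulls it in on request.
EXTRAS_REQUIRE = {
    "pycocotools": ["pycocotools"],
}
# In setup.py this dict would be passed to setuptools as
#     setup(extras_require=EXTRAS_REQUIRE)
# alongside the package's existing arguments.

# Module-level guard (sketch): import the optional dependency if it is present
# and record the outcome, so callers can skip brush export when it is missing.
try:
    import pycocotools.mask  # noqa: F401

    pycocotools_imported = True
except ImportError:
    pycocotools_imported = False
```

With this layout a plain `pip install label-studio-converter` never tries to build pycocotools, which would sidestep the Windows setup problem for users who do not need brush export.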
Any updates on this feature?
@makseq I've added an extra package, but I'm not certain whether I've correctly updated _get_supported_formats.
I've looked through the patch, and assuming that we've resolved the Windows issue by making it an optional install, I'd like to move forward with merging this.
Hello,
I'm using the ML backend with the SAM integration, and I need to export to COCO with BrushLabels, RectangleLabels and KeyPointLabels as well. So I took your PR code and made a few small changes. I only changed the files converter.py and brush.py.
This is working well for me. There's still a problem when an annotation has multiple label types, for instance when the user labels some regions with BrushLabels and others with PolygonLabels at the same time.
Here's my code:
...
def convert_to_coco(
    self, input_data, output_dir, output_image_dir=None, is_dir=True
):
    def add_image(images, width, height, image_id, image_path):
        images.append(
            {
                'width': width,
                'height': height,
                'id': image_id,
                'file_name': image_path,
            }
        )
        return images

    self._check_format(Format.COCO)
    ensure_dir(output_dir)
    output_file = os.path.join(output_dir, 'result.json')
    if output_image_dir is not None:
        ensure_dir(output_image_dir)
    else:
        output_image_dir = os.path.join(output_dir, 'images')
        os.makedirs(output_image_dir, exist_ok=True)
    images, categories, annotations = [], [], []
    categories, category_name_to_id = self._get_labels()
    data_key = self._data_keys[0]
    item_iterator = (
        self.iter_from_dir(input_data)
        if is_dir
        else self.iter_from_json_file(input_data)
    )
    for item_idx, item in enumerate(item_iterator):
        image_path = item['input'][data_key]
        image_id = len(images)
        width = None
        height = None
        # download all images of the dataset, including the ones without annotations
        if not os.path.exists(image_path):
            try:
                image_path = download(
                    image_path,
                    output_image_dir,
                    project_dir=self.project_dir,
                    return_relative_path=True,
                    upload_dir=self.upload_dir,
                    download_resources=self.download_resources,
                )
            except Exception:
                logger.info(
                    'Unable to download {image_path}. The image of {item} will be skipped'.format(
                        image_path=image_path, item=item
                    ),
                    exc_info=True,
                )
        # add image to final images list
        try:
            with Image.open(os.path.join(output_dir, image_path)) as img:
                width, height = img.size
            images = add_image(images, width, height, image_id, image_path)
        except Exception:
            logger.info(
                "Unable to open {image_path}, can't extract width and height for COCO export".format(
                    image_path=image_path, item=item
                ),
                exc_info=True,
            )

        # skip tasks without annotations
        if not item['output']:
            # image wasn't loaded and there are no labels
            if not width:
                images = add_image(images, width, height, image_id, image_path)
            logger.warning('No annotations found for item #' + str(item_idx))
            continue

        # concatenate results over all tag names
        labels = []
        for key in item['output']:
            labels += item['output'][key]
        if len(labels) == 0:
            logger.debug(f'Empty bboxes for {item["output"]}')
            continue

        for label in labels:
            # the first non-empty label-type key decides the category
            category_name = None
            for key in [
                'rectanglelabels',
                'polygonlabels',
                'brushlabels',
                'keypointlabels',
                'labels',
            ]:
                if key in label and len(label[key]) > 0:
                    category_name = label[key][0]
                    break
            if category_name is None:
                logger.warning("Unknown label type or labels are empty")
                continue

            # fall back to the size stored in the label if the image could not be opened
            if not height or not width:
                if 'original_width' not in label or 'original_height' not in label:
                    logger.debug(
                        f'original_width or original_height not found in {image_path}'
                    )
                    continue
                width, height = label['original_width'], label['original_height']
                images = add_image(images, width, height, image_id, image_path)

            category_id = category_name_to_id[category_name]
            annotation_id = len(annotations)

            if "polygonlabels" in label:
                if "points" not in label:
                    # nothing to convert without points; skip instead of raising KeyError
                    logger.warning(label)
                    continue
                # points are stored as percentages of the image size
                points_abs = [
                    (x / 100 * width, y / 100 * height) for x, y in label["points"]
                ]
                x, y = zip(*points_abs)
                annotations.append(
                    {
                        'id': annotation_id,
                        'image_id': image_id,
                        'category_id': category_id,
                        'segmentation': [
                            [coord for point in points_abs for coord in point]
                        ],
                        'bbox': get_polygon_bounding_box(x, y),
                        'ignore': 0,
                        'iscrowd': 0,
                        'area': get_polygon_area(x, y),
                    }
                )
            elif 'brushlabels' in label and brush.pycocotools_imported:
                if "rle" not in label:
                    # nothing to convert without an RLE mask; skip instead of raising KeyError
                    logger.warning(label)
                    continue
                coco_rle = brush.ls_rle_to_coco_rle(label["rle"], height, width)
                segmentation = brush.ls_rle_to_polygon(label["rle"], height, width)
                bbox = brush.get_cocomask_bounding_box(coco_rle)
                area = brush.get_cocomask_area(coco_rle)
                annotations.append(
                    {
                        "id": annotation_id,
                        "image_id": image_id,
                        "category_id": category_id,
                        "segmentation": segmentation,
                        "bbox": bbox,
                        'ignore': 0,
                        "iscrowd": 0,
                        "area": area,
                    }
                )
            elif 'rectanglelabels' in label or 'keypointlabels' in label:
                # rectangle and keypoint results share this branch; like the
                # brush branch, it expects an 'rle' field on the label
                if "rle" not in label:
                    logger.warning(label)
                    continue
                coco_rle = brush.ls_rle_to_coco_rle(label["rle"], height, width)
                segmentation = brush.ls_rle_to_polygon(label["rle"], height, width)
                bbox = brush.get_cocomask_bounding_box(coco_rle)
                area = brush.get_cocomask_area(coco_rle)
                annotations.append(
                    {
                        'id': annotation_id,
                        'image_id': image_id,
                        'category_id': category_id,
                        'segmentation': segmentation,
                        'bbox': bbox,
                        'ignore': 0,
                        'iscrowd': 0,
                        'area': area,
                    }
                )
            else:
                raise ValueError("Unknown label type")

            if os.getenv('LABEL_STUDIO_FORCE_ANNOTATOR_EXPORT'):
                annotations[-1].update({'annotator': get_annotator(item)})

    with io.open(output_file, mode='w', encoding='utf8') as fout:
        json.dump(
            {
                'images': images,
                'categories': categories,
                'annotations': annotations,
                'info': {
                    'year': datetime.now().year,
                    'version': '1.0',
                    'description': '',
                    'contributor': 'Label Studio',
                    'url': '',
                    'date_created': str(datetime.now()),
                },
            },
            fout,
            indent=2,
        )
...
...
def ls_rle_to_coco_rle(ls_rle, height, width):
    """from LS rle to compressed coco rle"""
    ls_mask = decode_rle(ls_rle)
    # decoded data is RGBA; keep only the alpha channel as the mask
    ls_mask = np.reshape(ls_mask, [height, width, 4])[:, :, 3]
    ls_mask = np.where(ls_mask > 0, 1, 0)
    binary_mask = np.asfortranarray(ls_mask)
    coco_rle = binary_mask_to_rle(binary_mask)
    result = pycocotools.mask.frPyObjects(coco_rle, *coco_rle.get('size'))
    result["counts"] = result["counts"].decode()
    return result


def ls_rle_to_polygon(ls_rle, height, width):
    """from LS rle to polygons"""
    ls_mask = decode_rle(ls_rle)
    ls_mask = np.reshape(ls_mask, [height, width, 4])[:, :, 3]
    ls_mask = np.where(ls_mask > 0, 1, 0)
    # Find contours from the binary mask
    contours = measure.find_contours(ls_mask, 0.5)
    segmentation = []
    for contour in contours:
        # Flip dimensions then ravel and cast to list
        contour = np.flip(contour, axis=1)
        contour = contour.ravel().tolist()
        segmentation.append(contour)
    return segmentation
...
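The excerpt above calls `binary_mask_to_rle` without showing it. As a rough illustration of what such a step typically does, here is a pure-Python sketch of uncompressed COCO run-length encoding: the mask is read in column-major order and the counts alternate between runs of 0s and runs of 1s, always starting with the 0s. The function name `mask_to_uncompressed_rle` and the list-of-rows input are assumptions made for this example; the helper in the PR may differ.

```python
from itertools import groupby


def mask_to_uncompressed_rle(mask):
    """Encode a binary mask (list of rows of 0/1) as uncompressed COCO-style RLE."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    # COCO RLE walks the mask column by column (Fortran order)
    flat = [mask[row][col] for col in range(width) for row in range(height)]

    counts = []
    for index, (value, run) in enumerate(groupby(flat)):
        # counts must start with a run of zeros, so emit an empty one if needed
        if index == 0 and value == 1:
            counts.append(0)
        counts.append(sum(1 for _ in run))
    return {"counts": counts, "size": [height, width]}
```

A dict of this shape (`counts` plus `size` as height and width) is what the excerpt passes on to `pycocotools.mask.frPyObjects` to obtain the compressed form.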
There is still the issue when an annotation has multiple label types. There is no way to find such annotations directly using the filter section.
Therefore I used:
Filter -> annotationResults contains {label}
to find the problematic annotations.
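The same check can also be done in code before exporting. The sketch below assumes each result is a dict that carries its labels under one of the type keys used in the converter code above (`rectanglelabels`, `polygonlabels`, `brushlabels`, `keypointlabels`, `labels`); the helper names are hypothetical.

```python
LABEL_TYPE_KEYS = (
    "rectanglelabels",
    "polygonlabels",
    "brushlabels",
    "keypointlabels",
    "labels",
)


def label_types_used(results):
    """Return the sorted label-type keys that carry at least one label."""
    used = set()
    for result in results:
        for key in LABEL_TYPE_KEYS:
            if result.get(key):
                used.add(key)
    return sorted(used)


def has_mixed_label_types(results):
    """True when one annotation mixes several label types, e.g. brush and polygon."""
    return len(label_types_used(results)) > 1
```

Running `has_mixed_label_types` over the concatenated results of each task would flag the annotations that need a closer look.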
@hogepodge let's try to take the last comment into account: https://github.com/heartexlabs/label-studio-converter/pull/175#issuecomment-1614720231
Let's talk with @nehalecky about how we can add these changes and eventually deliver this PR.
Any updates on this feature?
After careful consideration, we’ve determined that this is more of an improvement than a critical bug. Additionally, it seems to be an outdated request and hasn’t garnered much interest from the community. For these reasons, we will be closing this issue. We will continue developing the converter library as a part of Label Studio SDK.
We appreciate your understanding and encourage you to submit your feedback, questions and suggestions here: https://github.com/HumanSignal/label-studio-sdk/issues
@cdpath I wanted to check in on this and see if you've had a chance to work on the patch update.