HumanSignal / label-studio-converter

Tools for converting Label Studio annotations into common dataset formats
https://labelstud.io/

Export of OCR task into COCO and YOLO doesn't work #226

Open lindakasabian opened 1 year ago

lindakasabian commented 1 year ago

I'm trying to export an OCR task to COCO or YOLO format using polygon region annotations. Rectangle annotations export fine, but polygon regions (see image below) work with neither COCO nor YOLO. [image] The issue was originally encountered in label-studio, but the traceback suggests that the problem is in the converter.

Traceback for exporting a polygon region OCR task to COCO:

Traceback (most recent call last):
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\rest_framework\views.py", line 506, in dispatch
    response = handler(request, *args, **kwargs)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\django\utils\decorators.py", line 43, in _wrapper
    return bound_method(*args, **kwargs)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\label_studio\data_export\api.py", line 190, in get
    export_stream, content_type, filename = DataExport.generate_export_file(
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\label_studio\data_export\models.py", line 162, in generate_export_file
    converter.convert(input_json, tmp_dir, output_format, is_dir=False)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\label_studio_converter\converter.py", line 212, in convert
    self.convert_to_coco(
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\label_studio_converter\converter.py", line 623, in convert_to_coco
    x, y, w, h = self.rotated_rectangle(label)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\label_studio_converter\converter.py", line 860, in rotated_rectangle
    label["x"],
KeyError: 'x'

Traceback for exporting a polygon region OCR task to YOLO:

Traceback (most recent call last):
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\rest_framework\views.py", line 506, in dispatch
    response = handler(request, *args, **kwargs)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\django\utils\decorators.py", line 43, in _wrapper
    return bound_method(*args, **kwargs)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\label_studio\data_export\api.py", line 190, in get
    export_stream, content_type, filename = DataExport.generate_export_file(
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\label_studio\data_export\models.py", line 162, in generate_export_file
    converter.convert(input_json, tmp_dir, output_format, is_dir=False)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\label_studio_converter\converter.py", line 218, in convert
    self.convert_to_yolo(
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\label_studio_converter\converter.py", line 815, in convert_to_yolo
    x, y, w, h = self.rotated_rectangle(label)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\label_studio_converter\converter.py", line 860, in rotated_rectangle
    label["x"],
KeyError: 'x'

Label Studio and converter install info:

{
  "release": "1.8.0",
  "label-studio-os-package": {
    "version": "1.8.0",
    "short_version": "1.8",
    "latest_version_from_pypi": "1.8.0",
    "latest_version_upload_time": "2023-06-05T23:14:45",
    "current_version_is_outdated": false
  },

  "label-studio-os-backend": {
    "message": "fix: LSDV-5235: Use alternate check for postpone based on presence of  ...",
    "commit": "181c997901d5e4ebfd53747b28ab5cb6537910d1",
    "date": "2023/06/05 13:43:26",
    "branch": "",
    "version": "1.8.0+0.g181c997"
  },

  "label-studio-frontend": {
    "message": "fix: LSDV-5235: Use alternate check for postpone based on presence of  ...",
    "commit": "640d531fa241470d7613b53abccade2c7467a94e",
    "branch": "ls-release/1.8.0",
    "date": "2023/06/05 13:36:43"
  },

  "dm2": {
    "message": "fix: LSDV-5192: Change quick view icon and change hover color (#196)",
    "commit": "6175d9dc27547a3a76a5809880f48774810d5bc2",
    "branch": "ls-release/1.8.0",
    "date": "2023/06/02 08:52:09"
  },

  "label-studio-converter": {
    "version": "0.0.53"
  }
}

jzw0025 commented 1 year ago

I ran into the same issue. I think the output format only holds 4 coordinates while a polygon includes more, which may be where the issue comes from. Thanks.
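
To illustrate the mismatch described above, here is a minimal sketch, not the converter's actual code. It assumes a polygon region's value stores its vertices under a "points" key as percentages of the image size, with no "x", "y", "width" or "height" keys, which would explain the KeyError: 'x' in the tracebacks. The helper name polygon_bbox is hypothetical.

```python
def polygon_bbox(value, image_width, image_height):
    """Return (x, y, w, h) in pixels for a polygon region value.

    Assumes value["points"] is a list of [x, y] pairs given as
    percentages (0-100) of the image width and height.
    """
    points = value["points"]
    xs = [px * image_width / 100.0 for px, _ in points]
    ys = [py * image_height / 100.0 for _, py in points]
    x_min, y_min = min(xs), min(ys)
    return x_min, y_min, max(xs) - x_min, max(ys) - y_min


# A polygon value has no "x" key, so label["x"] cannot succeed on it.
polygon_value = {
    "points": [[10.0, 20.0], [50.0, 20.0], [50.0, 40.0], [10.0, 40.0]]
}
bbox = polygon_bbox(polygon_value, image_width=200, image_height=100)
```

The enclosing box loses the polygon's shape, so it is only a fallback for formats that accept nothing but 4 coordinates.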

lindakasabian commented 1 year ago

> I ran into the same issue. I think the output format only holds 4 coordinates while a polygon includes more, which may be where the issue comes from. Thanks.

My only workaround was to handle the OCR label type differently in the converter.py file. With that change I was able to extract both polygons and bboxes with a text field, similar to the COCO-Text format.
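
The workaround's code was not posted, so the following is only a rough sketch of the idea, under stated assumptions: region geometry is given in percentages of the image size, rectangle rotation is ignored, and the function name region_to_annotation and the output key "text" are hypothetical rather than taken from converter.py or the COCO-Text specification.

```python
def region_to_annotation(value, text, image_width, image_height):
    """Build a COCO-like annotation dict with a bbox, a polygon and a text field.

    value is a region value holding either "points" (polygon) or
    "x", "y", "width", "height" (rectangle), all in percentages.
    """
    def to_px(px, py):
        return px * image_width / 100.0, py * image_height / 100.0

    if "points" in value:
        # Polygon region: use the vertices as they are.
        pts = [to_px(px, py) for px, py in value["points"]]
    else:
        # Rectangle region: expand to four corners (rotation ignored here).
        x, y = value["x"], value["y"]
        w, h = value["width"], value["height"]
        pts = [to_px(x, y), to_px(x + w, y), to_px(x + w, y + h), to_px(x, y + h)]

    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    bbox = [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]
    # COCO stores a polygon as a flat [x1, y1, x2, y2, ...] list.
    segmentation = [[coord for p in pts for coord in p]]
    return {"bbox": bbox, "segmentation": segmentation, "text": text}


rect = {"x": 10.0, "y": 20.0, "width": 40.0, "height": 20.0}
poly = {"points": [[10.0, 20.0], [50.0, 20.0], [50.0, 40.0], [10.0, 40.0]]}
rect_ann = region_to_annotation(rect, "hello", 200, 100)
poly_ann = region_to_annotation(poly, "hello", 200, 100)
```

Treating both region types as a list of vertices means a single code path yields the bbox and the polygon, so neither shape needs the keys of the other.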