HumanSignal / label-studio-converter

Tools for converting Label Studio annotations into common dataset formats
https://labelstud.io/

Export to COCO, Pascal VOC XML, YOLO does not work #186

Open rafi-fauzan opened 1 year ago

rafi-fauzan commented 1 year ago

Hi, I'm unable to export my annotations from a project that contains ~50,000 images and annotations (object detection).

Here's the error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/rest_framework/views.py", line 506, in dispatch
    response = handler(request, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/django/utils/decorators.py", line 43, in _wrapper
    return bound_method(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/label_studio/data_export/api.py", line 183, in get
    export_stream, content_type, filename = DataExport.generate_export_file(
  File "/usr/local/lib/python3.10/dist-packages/label_studio/data_export/models.py", line 161, in generate_export_file
    converter.convert(input_json, tmp_dir, output_format, is_dir=False)
  File "/usr/local/lib/python3.10/dist-packages/label_studio_converter/converter.py", line 187, in convert
    self.convert_to_voc(input_data, output_data, output_image_dir=image_dir, is_dir=is_dir)
  File "/usr/local/lib/python3.10/dist-packages/label_studio_converter/converter.py", line 852, in convert_to_voc
    x = int(bbox['x'] / 100 * width)
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'
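The last frame of the traceback shows a percentage coordinate being scaled to pixels, which fails as soon as `bbox['x']` is None. Below is a minimal sketch of that arithmetic with a guard added. It is not the converter's actual code: the function name `bbox_percent_to_pixels` and its parameters are hypothetical, and it assumes the box stores `x`, `y`, `width`, and `height` as percentages of the image size.

```python
from typing import Optional, Tuple


def bbox_percent_to_pixels(
    bbox: dict, image_width: int, image_height: int
) -> Optional[Tuple[int, int, int, int]]:
    """Scale a percent-based box to pixels (hypothetical helper).

    Returns (x, y, w, h) in pixels, or None when any coordinate is
    missing or None, instead of raising TypeError.
    """
    keys = ("x", "y", "width", "height")
    if any(bbox.get(key) is None for key in keys):
        return None
    x = int(bbox["x"] / 100 * image_width)
    y = int(bbox["y"] / 100 * image_height)
    w = int(bbox["width"] / 100 * image_width)
    h = int(bbox["height"] / 100 * image_height)
    return x, y, w, h
```

A caller could skip any box for which the helper returns None rather than aborting the whole export.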

makseq commented 1 year ago

Hi, it seems like some of the coordinates are corrupted by None values. Which LS version are you using?

rafi-fauzan commented 1 year ago

Hi, it seems like some of the coordinates are corrupted by None values. Which LS version are you using?

Thanks for the response. Yes, it looks like it. Is there any way to ignore the corrupted ones and export only the valid coordinates?

I'm using v1.7.0
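One way to ignore the broken regions, as asked above, is to clean the JSON export before converting it. The sketch below is only an illustration under assumptions: it assumes the export is a list of tasks with `annotations` → `result` → `value` (the layout used by the script later in this thread), it only checks rectangle-style values, and the helper names `is_valid_region`, `drop_broken_regions`, and `clean_export` are hypothetical.

```python
import copy
import json

# Keys a rectangle region is assumed to carry in its "value" dict
RECT_KEYS = ("x", "y", "width", "height")


def is_valid_region(region: dict) -> bool:
    """Return False only for rectangle regions with a missing or None coordinate."""
    value = region.get("value") or {}
    if not any(key in value for key in RECT_KEYS):
        # Not a rectangle-style region (e.g. a choice); pass it through
        return True
    return all(value.get(key) is not None for key in RECT_KEYS)


def drop_broken_regions(tasks: list) -> tuple:
    """Return (cleaned copy of tasks, number of regions removed)."""
    cleaned = copy.deepcopy(tasks)  # leave the caller's data untouched
    removed = 0
    for task in cleaned:
        for annotation in task.get("annotations", []):
            results = annotation.get("result") or []
            kept = [region for region in results if is_valid_region(region)]
            removed += len(results) - len(kept)
            annotation["result"] = kept
    return cleaned, removed


def clean_export(in_path: str, out_path: str) -> int:
    """Read a JSON export, drop broken regions, write the cleaned file."""
    with open(in_path) as f:
        tasks = json.load(f)
    cleaned, removed = drop_broken_regions(tasks)
    with open(out_path, "w") as f:
        json.dump(cleaned, f)
    return removed
```

For example, `clean_export('project-1.json', 'project-1-clean.json')` would write a copy without the broken boxes, which could then be given to label-studio-converter. Dropping regions silently loses labels, so it is worth logging how many were removed.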

makseq commented 1 year ago

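You can either

  • modify the label-studio-converter code
  • or try to re-create the annotations:
      1. export JSON
      2. write a simple Python script to find the broken x coordinate (box['x'] is None)
      3. print the broken task and annotation ids
      4. open the task in the Label Studio quickview, fix the bbox (or remove the annotation), then update/save the annotation
      5. try the YOLO/COCO/VOC export again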

makseq commented 1 year ago

However, I would also try to understand how this happened: which actions broke the bboxes?

rafi-fauzan commented 1 year ago

You can either

  • modify the label-studio-converter code
  • or try to re-create the annotations:
      1. export JSON
      2. write a simple Python script to find the broken x coordinate (box['x'] is None)
      3. print the broken task and annotation ids
      4. open the task in the Label Studio quickview, fix the bbox (or remove the annotation), then update/save the annotation
      5. try the YOLO/COCO/VOC export again

Will do, thanks for the suggestions!
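Steps 2 and 3 of the suggestion above could be sketched roughly as follows. This is an assumption-laden illustration, not code from Label Studio: it assumes the JSON export is a list of tasks with `id`, `annotations`, `result`, and `value` fields, and the function name `find_broken_bboxes` is hypothetical.

```python
def find_broken_bboxes(tasks: list) -> list:
    """Return (task_id, annotation_id, region_id) for each box with a None coordinate."""
    broken = []
    for task in tasks:
        for annotation in task.get("annotations", []):
            for region in annotation.get("result") or []:
                value = region.get("value") or {}
                # Only look at regions that carry an "x" key, i.e. boxes
                if "x" not in value:
                    continue
                coords = [value.get(k) for k in ("x", "y", "width", "height")]
                if any(c is None for c in coords):
                    broken.append(
                        (task.get("id"), annotation.get("id"), region.get("id"))
                    )
    return broken


def report(tasks: list) -> None:
    """Print the ids so the tasks can be opened and fixed in Label Studio."""
    for task_id, annotation_id, region_id in find_broken_bboxes(tasks):
        print(f"task {task_id}, annotation {annotation_id}, region {region_id}")
```

After loading the export with `json.load`, calling `report(data)` lists the tasks to open and repair by hand.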

rafi-fauzan commented 1 year ago

However, I would also try to understand how this happened: which actions broke the bboxes?

I would love to know too, but I'm not the one who labels the images. I'll ask the people who did, and maybe I can find something.

seblful commented 1 year ago

I had the same problem with Label Studio 1.7.3. @makseq you were right, it was a corrupt image, thank you for the hint. Here is a simple script that checks your JSON file for NoneType values in Polygon Labels:

# Read the exported JSON file
import json

with open('project-1.json') as f:
    data = json.load(f)

# Iterate through the tasks (one per image)
for one_image in data:
    # Iterate through the annotations of one image
    for results in one_image['annotations']:
        # Iterate through the labeled regions of one annotation
        for annotation in results['result']:
            # Polygon points are [x, y] pairs; regions without points are skipped
            points = annotation.get('value', {}).get('points') or []
            for x, y in points:
                # Report the image if either coordinate is None
                if x is None or y is None:
                    print(f"Corrupt image id is {one_image['id']}")

seblful commented 1 year ago

My guess is that one polygon label disappeared after some unknown action (maybe an error while it was being written, due to a server disconnect, or something related to the ML backend). The label wasn't shown in the frontend app, but it remained in the JSON file as a None value.