ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Do python scripts that convert Pascal VOC annotations to YOLOv5 annotations affect the accuracy of the annotations at all? #6300

Closed by ib124 2 years ago

ib124 commented 2 years ago

Question

Hi @glenn-jocher, I have more of an abstract question. I'm doing a study comparing various computer vision algorithms, and some of the algorithms I am using require Pascal VOC annotations. I therefore found the following Python script to convert my annotations to the YOLO format:

import glob
import os
import xml.etree.ElementTree as ET
from os import getcwd

dirs = ['train', 'val']
classes = ['class1', 'class2']

def getImagesInDir(dir_path):
    image_list = []
    for filename in glob.glob(dir_path + '/*.jpg'):
        image_list.append(filename)

    return image_list

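def convert(size, box):
    # size is (image_width, image_height); box is (xmin, xmax, ymin, ymax) in pixels.
    dw = 1./(size[0])
    dh = 1./(size[1])
    # Box centre; the -1 shifts it by one pixel, presumably because VOC pixel coordinates are 1-based.
    x = (box[0] + box[1])/2.0 - 1
    y = (box[2] + box[3])/2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    # Normalize by the image size so every value is a fraction of the image.
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)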

def convert_annotation(dir_path, output_path, image_path):
    basename = os.path.basename(image_path)
    basename_no_ext = os.path.splitext(basename)[0]

    # Parse the VOC XML file that shares the image's base name.
    tree = ET.parse(dir_path + '/' + basename_no_ext + '.xml')
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    # The with-block closes the label file even if an object entry is malformed.
    with open(output_path + basename_no_ext + '.txt', 'w') as out_file:
        for obj in root.iter('object'):
            difficult = obj.find('difficult').text
            cls = obj.find('name').text
            # Skip unknown classes and objects flagged as difficult.
            if cls not in classes or int(difficult)==1:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
            bb = convert((w,h), b)
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

cwd = getcwd()

for dir_path in dirs:
    full_dir_path = cwd + '/' + dir_path
    output_path = full_dir_path +'/yolo/'

    if not os.path.exists(output_path):
        os.makedirs(output_path)

    image_paths = getImagesInDir(full_dir_path)
    list_file = open(full_dir_path + '.txt', 'w')

    for image_path in image_paths:
        list_file.write(image_path + '\n')
        convert_annotation(full_dir_path, output_path, image_path)
    list_file.close()

    print("Finished processing: " + dir_path)

After running the script, the YOLO annotations come out looking like this:

0 0.3949652777777778 0.6449652777777778 0.1232638888888889 0.11701388888888889
0 0.5939236111111111 0.7368055555555556 0.02951388888888889 0.03263888888888889
1 0.5270833333333333 0.9144097222222223 0.17152777777777778 0.17048611111111112
1 0.4809027777777778 0.7753472222222223 0.21180555555555555 0.10833333333333334
1 0.5454861111111111 0.6774305555555555 0.061111111111111116 0.05763888888888889
1 0.49739583333333337 0.6444444444444445 0.08784722222222223 0.03194444444444445
1 0.47222222222222227 0.4821180555555556 0.23125 0.21354166666666669
1 0.4574652777777778 0.2701388888888889 0.18506944444444445 0.20625000000000002
0 0.35399305555555555 0.21597222222222223 0.02326388888888889 0.025
0 0.3307291666666667 0.36996527777777777 0.014930555555555556 0.019097222222222224
1 0.14270833333333333 0.02309027777777778 0.17708333333333334 0.04618055555555556
1 0.1467013888888889 0.16597222222222222 0.13020833333333334 0.12361111111111112
1 0.15156250000000002 0.29704861111111114 0.15868055555555557 0.12743055555555555
1 0.18715277777777778 0.4217013888888889 0.18194444444444446 0.16770833333333335
1 0.17083333333333334 0.5854166666666667 0.25416666666666665 0.20277777777777778
1 0.2 0.8451388888888889 0.26319444444444445 0.3090277777777778

Whereas, when labeling images using a tool such as LabelImg, there are not as many decimal places. For example, here is a LabelImg .txt file output:

0 0.587153 0.509549 0.071528 0.067708
0 0.575521 0.731771 0.044792 0.025347
0 0.522917 0.670139 0.047222 0.040972
0 0.522743 0.623611 0.046181 0.036806
0 0.589410 0.579861 0.046875 0.030556
0 0.578125 0.619444 0.026389 0.026389
0 0.565972 0.679340 0.022917 0.024653
0 0.596181 0.666319 0.052083 0.063889
0 0.644097 0.643750 0.035417 0.041667
0 0.688889 0.648090 0.031250 0.029514
0 0.591319 0.805382 0.029861 0.035069
0 0.621528 0.789757 0.023611 0.030903
0 0.611806 0.814931 0.013889 0.016667
0 0.684201 0.759549 0.023958 0.025347
0 0.679688 0.710590 0.014931 0.014931
0 0.753646 0.734722 0.014931 0.020833
0 0.726215 0.691493 0.014931 0.009375
0 0.797569 0.638715 0.014583 0.011458
0 0.768403 0.657813 0.013889 0.014236
0 0.689236 0.559549 0.021528 0.026042
0 0.704340 0.437847 0.017014 0.016667
0 0.846007 0.449306 0.025347 0.013889
0 0.619792 0.422917 0.020139 0.022917
0 0.426910 0.365451 0.030903 0.023264
0 0.435069 0.277604 0.012500 0.020486
0 0.483160 0.383333 0.028125 0.022222

The difference is that when using the python script, more decimals are added to the .txt file. Would this affect the results of the algorithm at all? I would hypothesize that it wouldn't but I just thought I would ask you to be sure.

Thanks!!
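
One way to put a number on the decimal-places question is to measure how far a coordinate moves, in pixels, when it is rounded to six decimals. The sketch below is only an illustration under stated assumptions: `rounding_error_px` is a hypothetical helper, and the 2880-pixel image side is an assumed value, not something stated in the question. The four input numbers are the first label line produced by the script above.

```python
def rounding_error_px(value, image_side, decimals=6):
    """Pixel displacement caused by rounding a normalized coordinate."""
    return abs(round(value, decimals) - value) * image_side


IMAGE_SIDE = 2880  # assumed image width/height in pixels (hypothetical)

# First label line from the script output, without the class id.
full_precision = [
    0.3949652777777778,
    0.6449652777777778,
    0.1232638888888889,
    0.11701388888888889,
]

errors = [rounding_error_px(v, IMAGE_SIDE) for v in full_precision]
worst = max(errors)

# Rounding to 6 decimals moves a value by at most 0.5e-6 (normalized units).
bound = 0.5 * 10 ** -6 * IMAGE_SIDE

print(f"worst rounding error: {worst:.6f} px (bound {bound:.6f} px)")
```

Under these assumptions the six-decimal and full-precision forms differ by a small fraction of a pixel, which is consistent with the hypothesis that the extra decimals do not matter in practice.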

glenn-jocher commented 2 years ago

@ib124 training YOLOv5 on VOC is very simple: the dataset is downloaded automatically and the labels are converted automatically:

python train.py --data VOC.yaml

See VOC.yaml for conversion details: https://github.com/ultralytics/yolov5/blob/a1a9c6884c5cfda4c972f4087ad4d4b9c3da6518/data/VOC.yaml#L1-L80

engrjav commented 2 years ago

@ib124 I am also confused about the same thing. Did you learn anything?

@glenn-jocher thank you for the response. Actually, the data is not the Pascal VOC dataset; it is only in VOC format. Do you recommend any particular script to convert it?

ib124 commented 2 years ago

> @ib124 I am also confused about the same thing. Did you learn anything?

Hi @engrjav, I found that I had no difficulty with my detections after running that script to convert Pascal to YOLO.
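
A quick sanity check on the converted label files can catch format problems before training. The sketch below assumes the usual YOLO label layout of `class x_center y_center width height`, with the four box values normalized to the 0-1 range; `parse_yolo_line` is a hypothetical helper written for this illustration and is not part of YOLOv5.

```python
def parse_yolo_line(line):
    """Parse one label line and raise ValueError if it looks malformed."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")

    cls_id = int(parts[0])  # int() raises ValueError for a non-integer class id
    if cls_id < 0:
        raise ValueError(f"negative class id: {line!r}")

    box = tuple(float(p) for p in parts[1:])
    if not all(0.0 <= v <= 1.0 for v in box):
        raise ValueError(f"box value outside [0, 1]: {line!r}")

    return cls_id, box


# One line from the conversion script and one from LabelImg, both taken from the question.
samples = [
    "0 0.3949652777777778 0.6449652777777778 0.1232638888888889 0.11701388888888889",
    "0 0.587153 0.509549 0.071528 0.067708",
]

for sample in samples:
    cls_id, box = parse_yolo_line(sample)
    print(cls_id, box)
```

Both sample lines pass the same checks, regardless of how many decimals they carry.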

glenn-jocher commented 2 years ago

@engrjav @ib124 conversion script in https://github.com/ultralytics/yolov5/issues/6300#issuecomment-1013444521 is exact to numerical precision to the best of my knowledge (I implemented it myself based on a user's PR). Obviously raise a bug report if you discover any problems.
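
A round trip is one way to check a conversion of this kind for exactness. The sketch below is a minimal illustration, not the Ultralytics implementation: `voc_to_yolo` repeats the arithmetic of the `convert` function from the question, while `yolo_to_voc`, the 640x480 image size, and the sample box are hypothetical values introduced here.

```python
def voc_to_yolo(size, box):
    """Same arithmetic as convert() in the question.

    size is (width, height); box is (xmin, xmax, ymin, ymax) in pixels.
    """
    dw, dh = 1.0 / size[0], 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    return (x * dw, y * dh, w * dw, h * dh)


def yolo_to_voc(size, yolo_box):
    """Hypothetical inverse: normalized (x, y, w, h) back to pixel corners."""
    img_w, img_h = size
    x, y, w, h = yolo_box
    cx = x * img_w + 1  # undo the -1 shift applied in voc_to_yolo
    cy = y * img_h + 1
    bw, bh = w * img_w, h * img_h
    return (cx - bw / 2, cx + bw / 2, cy - bh / 2, cy + bh / 2)


size = (640, 480)                   # assumed image size
box = (100.0, 300.0, 50.0, 250.0)   # assumed (xmin, xmax, ymin, ymax)

yolo_box = voc_to_yolo(size, box)
recovered = yolo_to_voc(size, yolo_box)
max_err = max(abs(a - b) for a, b in zip(box, recovered))

print("yolo box:", yolo_box)
print("largest round-trip error in pixels:", max_err)
```

The round trip recovers the original corners up to floating-point error. Note that the `- 1` in the conversion places the stored centre one pixel before the plain midpoint of the corners, so the inverse has to add it back.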