PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
Apache License 2.0
38.99k stars 7.32k forks

"list index out of range" error when training the detection model on a dataset I annotated myself with paddlelabel #12032

Open Gpy23 opened 2 weeks ago

Gpy23 commented 2 weeks ago

Please provide the following information to quickly locate the problem

Gpy23 commented 2 weeks ago

distance is negative, and I don't yet know why.

UserWangZz commented 1 week ago

"distance is negative, and I don't yet know why."

This problem does currently exist. We will report it and investigate.

nilyang commented 1 week ago

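I'm hitting a similar error. In my case the generated train.txt, val.txt, and test.txt contained blank lines. Removing the blank lines with the script below fixed it.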


import os
from shutil import copyfile


def clearBlankLine(file_path: str):
    """Rewrite file_path in place with all blank lines removed."""
    tmp_path = file_path + '.tmp'
    copyfile(file_path, tmp_path)  # temporary copy of the original file
    # Read from the copy and write the cleaned content back to file_path.
    with open(tmp_path, 'r', encoding='utf-8') as src, \
            open(file_path, 'w', encoding='utf-8') as dst:
        for line in src:
            # Skip empty and whitespace-only lines; keep everything else.
            if line.strip():
                dst.write(line)
    # Only reached on success, so the copy survives as a backup on failure.
    os.remove(tmp_path)


if __name__ == '__main__':
    for x in ['train_data/det/train.txt', 'train_data/det/val.txt', 'train_data/det/test.txt']:
        clearBlankLine(x)
    for x in ['train_data/rec/train.txt', 'train_data/rec/val.txt', 'train_data/rec/test.txt']:
        clearBlankLine(x)
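
Before rewriting any files, it can help to see which lines are actually the problem. The following is a rough sketch, not part of PaddleOCR: it assumes each record in a detection label file is an image path, a tab, and then a JSON list of boxes. The function name find_bad_label_lines and the sample data are made up for illustration. It does not apply to the rec label files, whose second field is plain text.

```python
import json


def find_bad_label_lines(lines):
    """Return (line_number, reason) pairs for lines that look malformed.

    Assumes each record is "<image path>\t<JSON list of boxes>".
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if not text.strip():
            problems.append((number, "blank line"))
            continue
        parts = text.split("\t")
        if len(parts) != 2:
            problems.append((number, "expected exactly one tab separator"))
            continue
        try:
            boxes = json.loads(parts[1])
        except json.JSONDecodeError:
            problems.append((number, "annotation is not valid JSON"))
            continue
        if not isinstance(boxes, list):
            problems.append((number, "annotation is not a JSON list"))
    return problems


# Hypothetical sample: one good record, one blank line, one record with no annotation.
sample = [
    'imgs/a.jpg\t[{"transcription": "hi", "points": [[0, 0], [9, 0], [9, 9], [0, 9]]}]\n',
    "\n",
    "imgs/b.jpg\n",
]
for number, reason in find_bad_label_lines(sample):
    print(f"line {number}: {reason}")
```

To check a real file, pass the open file object instead of the sample list, for example find_bad_label_lines(open('train_data/det/train.txt', encoding='utf-8')).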
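
UserWangZz commented 1 week ago

The negative distance currently shows up in both the pre-processing and post-processing stages. We have logged the problem and are trying to track it down. In the post-processing stage, the negative value may be related to abnormally shaped detection boxes output by the model.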
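
While the cause is still being investigated, one way to screen boxes for abnormal shapes is sketched below. This is not PaddleOCR's code and not a confirmed explanation of the negative distance. It assumes a box is a list of [x, y] points and uses the shoelace formula to compute a signed area. The names signed_area and describe_box and the min_area threshold are hypothetical.

```python
def signed_area(points):
    """Shoelace formula: the sign depends on the winding order of the points."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def describe_box(points, min_area=1.0):
    """Return a short reason string if the box looks abnormal, else None."""
    if len(points) < 3:
        return "fewer than 3 points"
    if abs(signed_area(points)) < min_area:
        return "area is (almost) zero"
    return None


# Hypothetical boxes: a normal square, three collinear points, a two-point box.
boxes = [
    [[0, 0], [10, 0], [10, 10], [0, 10]],
    [[0, 0], [5, 5], [10, 10]],
    [[0, 0], [1, 1]],
]
for box in boxes:
    print(describe_box(box))
```

A box whose points are listed in a crossing order, such as [[0, 0], [10, 10], [10, 0], [0, 10]], also has a signed area of zero under this formula, so it is flagged by the same check.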