Need LabelImg to VOC conversion and dataset splitting - Githubissues

PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.

Apache License 2.0

12.62k stars 2.87k forks

Need LabelImg to VOC conversion and dataset splitting #7343

Closed monkeycc closed 1 year ago

monkeycc commented 1 year ago

Search before asking

[X] I have searched the issues and found no similar feature requests.

Feature Description

Current dataset layout:

dataset/xxx/
├── annotations
│   ├── xxx1.xml
│   ├── xxx2.xml
│   ├── xxx3.xml
│   |   ...
├── images
│   ├── xxx1.jpg
│   ├── xxx2.jpg
│   ├── xxx3.jpg

I need to split the dataset as described in https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/tutorials/data/PrepareDetDataSet.md

# Generate the label_list.txt file
>>echo -e "speedlimit\ncrosswalk\ntrafficlight\nstop" > label_list.txt

This command fails on Windows (PowerShell):

Write-Output : Cannot process parameter because the parameter name 'e' is ambiguous. Possible matches include: -ErrorAction -ErrorVariable.
At line:1 char:6
+ echo -e "F:\00Biaozhu\Annotations" > ...
+      ~~
    + CategoryInfo          : InvalidArgument: (:) [Write-Output], ParameterBindingException
    + FullyQualifiedErrorId : AmbiguousParameter,Microsoft.PowerShell.Commands.WriteOutputCommand
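
The error arises because `echo` in PowerShell is an alias for `Write-Output`, which has no `-e` flag. One way around the shell difference is to write the file from Python, which PaddleDetection requires anyway. The following is a minimal sketch, not part of the official tooling; the helper name `write_label_list` is hypothetical, and the label names are the ones from the tutorial command above.

```python
def write_label_list(labels, path="label_list.txt"):
    """Write one class name per line, using LF line endings on every OS."""
    # newline="\n" stops Windows from translating "\n" into "\r\n".
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for label in labels:
            f.write(label + "\n")


if __name__ == "__main__":
    # Same four labels as in the tutorial's echo command.
    write_label_list(["speedlimit", "crosswalk", "trafficlight", "stop"])
```

Running the script in the dataset directory should produce the same `label_list.txt` that the `echo -e` command creates on Linux.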

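The repository ships tools/x2coco.py for converting labelme data to COCO format, but there is no script for converting LabelImg annotations to VOC and splitting the dataset.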
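
For illustration, the kind of split script being requested could look like the sketch below. It is not an official PaddleDetection tool. It assumes the layout shown above (`annotations/*.xml` paired with `images/*.jpg` by file stem) and assumes that each line of the list files has the form `./images/xxx.jpg ./annotations/xxx.xml`, as in the linked tutorial. The function name `split_voc_dataset`, the output names `train.txt` and `valid.txt`, and the 80/20 default ratio are all hypothetical choices.

```python
import random
from pathlib import Path


def split_voc_dataset(root, train_ratio=0.8, seed=0):
    """Write train.txt and valid.txt under `root`; return the line counts."""
    root = Path(root)
    pairs = []
    for xml in sorted((root / "annotations").glob("*.xml")):
        img = root / "images" / (xml.stem + ".jpg")
        # Skip annotations whose image is missing.
        if img.exists():
            pairs.append(f"./images/{img.name} ./annotations/{xml.name}")

    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(pairs)
    n_train = int(len(pairs) * train_ratio)
    splits = {"train.txt": pairs[:n_train], "valid.txt": pairs[n_train:]}

    for name, lines in splits.items():
        with open(root / name, "w", encoding="utf-8", newline="\n") as f:
            f.write("".join(line + "\n" for line in lines))
    return {name: len(lines) for name, lines in splits.items()}
```

Calling `split_voc_dataset("dataset/xxx")` would write both list files next to the `annotations` and `images` folders. Images with other extensions such as `.png` would need the suffix check adjusted.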

Are you willing to submit a PR?

[ ] Yes I'd like to help by submitting a PR!

wangxinxin08 commented 1 year ago

We will evaluate this request. In the meantime, you can search online for a suitable processing script and handle it yourself.