TexasInstruments / edgeai-yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite. Forked from https://ultralytics.com/yolov5
https://github.com/TexasInstruments/edgeai
GNU General Public License v3.0
654 stars, 120 forks

Some questions about the converted label format? #76

Open Hezhexi2002 opened 2 years ago

Hezhexi2002 commented 2 years ago

❔Question

In the regular YOLO label format there are 5 columns in total: CLASS plus xywh (normalized). I previously trained yolov5-face, which appends 8 normalized coordinates for the facial keypoints at the end of each row. So I want to figure out why there are 56 columns in your converted labels.

Additional context

I assume the 56 columns include the 34 coordinates of the 17 keypoints plus the original CLASS + xywh (normalized), but that still leaves 56-(34+5)=17 columns. So my guess is that each of the 17 keypoints carries not only x and y but also a z value, which would give 17x3+5=56 columns, the same number mentioned in your code. I'm not sure this guess is right because I haven't studied human pose estimation before. I need to know because I want to train on my custom dataset, which has only 4 keypoints but more classes than just person. I'd appreciate any advice.

lxiaoxiaoxing commented 2 years ago

Each keypoint's coordinates are followed by a category value: 0 means not present, 1 means occluded, and 2 means present.

Hezhexi2002 commented 2 years ago

Oh, I see, thank you! So that really does come to 17*3+5=56 columns. I'll go take another look at the code :-)
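The column arithmetic above can be checked with a small sketch. It assumes the layout described in this thread: class index, normalized xywh, then an (x, y, v) triple per keypoint. The names `expected_columns`, `parse_pose_label`, and `PoseLabel` are hypothetical helpers for illustration and are not part of the repository.

```python
from typing import List, NamedTuple, Tuple


class PoseLabel(NamedTuple):
    class_index: int
    box: Tuple[float, ...]                     # center_x, center_y, width, height (normalized)
    keypoints: List[Tuple[float, float, int]]  # (x, y, v) per keypoint


def expected_columns(num_keypoints: int) -> int:
    """Class index + 4 box values + 3 values (x, y, v) per keypoint."""
    return 5 + 3 * num_keypoints


def parse_pose_label(line: str, num_keypoints: int = 17) -> PoseLabel:
    """Split one label row into class, box, and keypoint triples."""
    fields = line.split()
    if len(fields) != expected_columns(num_keypoints):
        raise ValueError(
            f"expected {expected_columns(num_keypoints)} columns, got {len(fields)}"
        )
    class_index = int(fields[0])
    box = tuple(float(value) for value in fields[1:5])
    keypoints = []
    for i in range(num_keypoints):
        x, y, v = fields[5 + 3 * i : 8 + 3 * i]
        # The flag may be written as "2" or "2.000000"; both are accepted here.
        keypoints.append((float(x), float(y), int(float(v))))
    return PoseLabel(class_index, box, keypoints)
```

Under the same layout assumption, a custom dataset with 4 keypoints would have `expected_columns(4)`, that is 17, columns per row. Whether the training code accepts a different keypoint count without changes is not confirmed in this thread.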

ZhangLe-fighting commented 2 years ago

Hello, I ran into the following problem when exporting the ONNX model. The output is:

Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
[W shape_type_inference.cpp:425] Warning: Constant folding in symbolic shape inference fails: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select) (function ComputeConstantFolding)
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
ONNX: export failure: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)

Do you know what causes this? I have searched many tutorials, but none of them solved my problem. Looking forward to your reply!

ryouchinsa commented 1 year ago

RectLabel is an offline image annotation tool for object detection and segmentation. With RectLabel, you can import the COCO keypoints format and export to the YOLO keypoints format.

class_index center_x center_y width height x1 y1 v1 x2 y2 v2 x3 y3 v3 ...
0 0.545230 0.616880 0.298794 0.766239 0.522073 0.309332 2 0.540170 0.293193 2 0.499589 0.296503 2 ...

The visibility flag v is defined as follows: v=0 means not labeled (in which case x=y=0), v=1 means labeled but not visible, and v=2 means labeled and visible. https://cocodataset.org/#format-data
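A minimal sketch of the conversion described above, assuming COCO conventions: `bbox` is `[x_min, y_min, width, height]` in pixels and `keypoints` is a flat `[x1, y1, v1, x2, y2, v2, ...]` list in pixels. The function name `coco_to_yolo_keypoints` and the default `class_index` are hypothetical. This is not RectLabel's or this repository's converter, only an illustration of the row layout.

```python
from typing import Sequence


def coco_to_yolo_keypoints(
    bbox: Sequence[float],
    keypoints: Sequence[float],
    img_w: int,
    img_h: int,
    class_index: int = 0,
) -> str:
    """Build one YOLO keypoints row from a single COCO annotation."""
    x_min, y_min, w, h = bbox
    # COCO stores the top-left corner; YOLO expects the normalized box center.
    box = [
        (x_min + w / 2) / img_w,
        (y_min + h / 2) / img_h,
        w / img_w,
        h / img_h,
    ]
    parts = [str(class_index)] + [f"{value:.6f}" for value in box]
    # Keypoints come in (x, y, v) triples; v is kept as an integer flag.
    for i in range(0, len(keypoints), 3):
        x, y, v = keypoints[i : i + 3]
        parts += [f"{x / img_w:.6f}", f"{y / img_h:.6f}", str(int(v))]
    return " ".join(parts)
```

An unlabeled keypoint (v=0, x=y=0) stays at `0.000000 0.000000 0` after normalization, which matches the COCO definition quoted above.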
