Closed — chenscottus closed this issue 1 year ago
If I understand your question correctly, the output should be decoded as
cls, bbox = output[:6] # len 6
keypoints = output[6:] # len 399
where bbox includes the critical x, y values plus the score, and keypoints corresponds to the first 399 datapoints in data_format.md.
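As a minimal sketch of that split, assuming each detection row is 405 values laid out as described above (6 bbox/class values followed by 133 keypoints as (x, y, conf) triplets — the function name and exact field meanings here are my assumptions, not the repo's code):

```python
import numpy as np

def split_detection(row):
    """Split one 405-value detection row into bbox info and keypoints.

    Assumed layout (per the description above):
      row[:6] -> cx, cy, w, h, box confidence, class score
      row[6:] -> 133 keypoints flattened as (x, y, conf) triplets (399 values)
    """
    row = np.asarray(row, dtype=float)
    assert row.shape[-1] == 405, "expected 405 values per detection"
    bbox = row[:6]                       # cx, cy, w, h, conf, cls
    keypoints = row[6:].reshape(133, 3)  # one (x, y, conf) triplet per keypoint
    return bbox, keypoints

# Example with a dummy row of sequential values:
bbox, kpts = split_detection(np.arange(405, dtype=float))
print(bbox.shape, kpts.shape)  # (6,) (133, 3)
```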
Thanks!
I tried to run: python3 detect.py --weights yolov7_pose_whole_body_tiny_baseline.pt --source onnx_inference/img.png --save-crop --view-img
Here are the errors:
Namespace(agnostic_nms=False, augment=False, classes=None, conf_thres=0.25, device='', exist_ok=False, hide_conf=False, hide_labels=False, img_size=640, iou_thres=0.45, kpt_label=False, line_thickness=3, name='exp', nosave=False, project='runs/detect', save_bin=False, save_conf=False, save_crop=True, save_txt=False, save_txt_tidl=False, source='onnx_inference/img.png', update=False, view_img=True, weights=['yolov7_pose_whole_body_tiny_baseline.pt'])
YOLOv5 🚀 2023-4-4 torch 1.12.0+cu116 CUDA:0 (NVIDIA GeForce RTX 3090 Ti, 24563.375MB)
Fusing layers...
/home/dev/.local/lib/python3.8/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2894.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Model Summary: 314 layers, 8862259 parameters, 0 gradients, 19.8 GFLOPS
image 1/1 /mnt/d/Workspace2021/models/yolov7_pose_whole_body/yolov7-pose-whole-body-main/onnx_inference/img.png: tensor(0.83398, device='cuda:0')
Traceback (most recent call last):
File "detect.py", line 202, in
But if I run this, it works: python3 detect.py --weight yolov7_pose_whole_body_tiny_baseline.pt --kpt-label --hide-labels --hide-conf --source onnx_inference/img.png --view-img
But the resulting image looks like this:
I will try and recreate this issue later today.
Hi Scott,
So first, lowering the conf threshold improves the results a lot. I think the threshold of 0.5 is way too high. Changing it to 0.1 allows most of the keypoints to be drawn (this change has been pushed).
In terms of the mapping, there is still an issue. It probably stems from line 106, in the call to scale_coords that scales the keypoint predictions back to the original image. Also, it is important to specify the image size during inference to match the resolution used during training. I've uploaded a couple of inference results with varying input sizes into onnx_inference. I have some other tasks to take care of today, but I hope the inference results give you some insight into how to debug the scaling. If I have more time tomorrow, I can revisit this and try to debug the issue.
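For reference, the rescaling that scale_coords should perform for keypoints can be sketched as follows. This is a minimal sketch of the usual YOLO letterbox inverse (undo padding, then undo the resize gain); the function name and the symmetric-padding assumption are mine, not necessarily what line 106 does:

```python
import numpy as np

def scale_kpts(kpts, model_shape, orig_shape):
    """Rescale (x, y, conf) keypoints from the letterboxed model input
    back to the original image.

    model_shape, orig_shape: (height, width) tuples.
    Assumes symmetric letterbox padding, as in the standard YOLO pipeline.
    """
    kpts = np.array(kpts, dtype=float)          # (N, 3): x, y, conf
    gain = min(model_shape[0] / orig_shape[0],  # height ratio
               model_shape[1] / orig_shape[1])  # width ratio
    pad_w = (model_shape[1] - orig_shape[1] * gain) / 2
    pad_h = (model_shape[0] - orig_shape[0] * gain) / 2
    kpts[:, 0] = (kpts[:, 0] - pad_w) / gain    # undo pad, then gain (x)
    kpts[:, 1] = (kpts[:, 1] - pad_h) / gain    # same for y
    return kpts

# A 480x640 image letterboxed into 640x640 gets 80px of vertical padding,
# so a keypoint at (320, 400) in model space maps back to (320, 320):
print(scale_kpts([[320.0, 400.0, 0.9]], (640, 640), (480, 640)))
```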
Many thanks for your efforts! -S
Have you tried the plot_images function in test.py? It seems to perform the mapping correctly.
Hi,
There are 405 outputs per line. From https://github.com/fan23j/yolov7-pose-whole-body/blob/main/data/data_format.md, the COCO-WholeBody Annotation File Format shows:

```
// 405 values total
// 133 x 3 = 399 keypoint values:
// "keypoints"       : list([x, y, v] * 17),
// "foot_kpts"       : list([x, y, v] * 6),
// "face_kpts"       : list([x, y, v] * 68),
// "lefthand_kpts"   : list([x, y, v] * 21),
// "righthand_kpts"  : list([x, y, v] * 21),
//
// "score"           : float,  // 400
// "foot_score"      : float,  // 401
// "face_score"      : float,  // 402
// "lefthand_score"  : float,  // 403
// "righthand_score" : float,  // 404
// "wholebody_score" : float,  // 405
```
But when I try to decode the output like that, it doesn't work.
If I decode the output the way yolov7-w6-pose does (which I have been using since last year), taking either the first 57 values or the first 5 values (step_width_coco_pose = 57; // num_class + 5 + 2 * num_lines_coco_pose + 2 * kp_face_pose), it doesn't work either.
Could you please show me how to decode the output?
Many thanks!
-Scott