WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

About the training code of YOLOv7-pose #234

[Open] Truthonlyone opened this issue 2 years ago

Truthonlyone commented 2 years ago

I am interested in the YOLOv7-pose model which was posted yesterday, and I wonder if the training code will be released?

WongKinYiu commented 2 years ago

For keypoint detection: merge this into this and modify this for training.

HiiHongFe commented 2 years ago

> For keypoint detection: merge this into this and modify this for training.

Thank you for your contribution! But does the current project not yet support pose training? When I try to train, I get many errors.

vshesh commented 2 years ago

> For keypoint detection: merge this into this and modify this for training.

How does one modify the model yaml for training? Is it as simple as adding nkpt: ___ to the top of the file and adding nkpt to the detect line? (That's the main difference I see between the linked yolov5-pose repo and the ultralytics yolov5 base.) Can you also explain what the data format would be in that case? Is it the same as in the yolov5-pose repo? I see class x y width height... there, followed by three numbers per keypoint. What does the third number represent? It seems to take values from 0-2.

WongKinYiu commented 2 years ago

Training code is released.

maketo97 commented 2 years ago

@WongKinYiu Does the training code support custom keypoint-labelled datasets?

WongKinYiu commented 2 years ago

Yes, but it does not yet support multiple classes.

Truthonlyone commented 2 years ago

> Training code is released.

Wow! Thx~

maketo97 commented 2 years ago

@WongKinYiu If I want to train the model to detect more keypoints, which part of the code should I modify? Thanks

vshesh commented 2 years ago

> @WongKinYiu If I want to train the model to detect more keypoints, which part of the code should I modify? Thanks

You would change nkpt in the cfg/yolov7-w6-pose.yaml file to whatever number of keypoints you want.
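For illustration, the relevant keys sit at the top of that cfg file. A minimal sketch of the change (the key names follow the pose branch; the comments are mine):

```yaml
# top of cfg/yolov7-w6-pose.yaml
nc: 1      # number of classes (multi-class pose is not yet supported, per above)
nkpt: 17   # number of keypoints; change this for a custom skeleton
```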

maketo97 commented 2 years ago

@vshesh No need to change the anchors in the cfg file to fit the new number of keypoints?

vshesh commented 2 years ago

> @vshesh No need to change the anchors in the cfg file to fit the new number of keypoints?

I'm by no means an expert, but comparing https://github.com/WongKinYiu/yolov7/blob/pose/cfg/yolov7-w6-pose.yaml with the base cfg/training/yolov7-w6.yaml, the anchors are the same. Diffing those two files should show the changes: https://diffonline.net/ljGvoRTYNz

It seems to be just adding the keypoint count, adding the dw_conv_kpt line (which I don't know the purpose of), and changing the detection logic at the bottom of the network.

maketo97 commented 2 years ago

@vshesh I have tried to change nkpt, but training fails because the code is hardcoded for 17 keypoints.

vshesh commented 2 years ago

> @vshesh I have tried to change nkpt, but training fails because the code is hardcoded for 17 keypoints.

Ok yeah, I'm seeing the same error now:

```
train: WARNING: Ignoring corrupted image and/or label scratch/train/images/img10431471441771711261.png: labels require 56 columns each
```

Then it thinks 100% of the data is corrupted and fails to train because there's no data in the cache:

```
Traceback (most recent call last):
  File "train.py", line 562, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 204, in train
    image_weights=opt.image_weights, quad=opt.quad, prefix=colorstr('train: '), kpt_label=kpt_label)
  File "/content/yolov7/utils/datasets.py", line 74, in create_dataloader
    kpt_label=kpt_label)
  File "/content/yolov7/utils/datasets.py", line 414, in __init__
    labels, shapes, self.segments = zip(*cache.values())
ValueError: not enough values to unpack (expected 3, got 0)
```

https://github.com/WongKinYiu/yolov7/blob/pose/utils/datasets.py#L496 @maketo97 here's the hardcoding; you can try to change it.
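That 56 comes from 5 box/class columns plus 3 values per keypoint for the default 17 keypoints. A minimal sketch of the generalized column check (the names here are mine, not the repo's exact code):

```python
import numpy as np

nkpt = 17                # keypoints per instance; change for a custom skeleton
ncols = 5 + 3 * nkpt     # class, x, y, w, h + (x, y, visibility) per keypoint

labels = np.loadtxt('sample_label.txt', ndmin=2)   # one row per instance
assert labels.shape[1] == ncols, f'labels require {ncols} columns each'
```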

vshesh commented 2 years ago

I got as far as fixing the data loading process to start the training, but then ran into this error which I can't solve:


```
autoanchor: Analyzing anchors... anchors/target = 7.60, Best Possible Recall (BPR) = 1.0000
Image sizes 960 train, 960 test
Using 2 dataloader workers
Logging results to runs/train/yolov7-w6-pose10
Starting training for 300 epochs...

     Epoch   gpu_mem       box       obj       cls       kpt      kptv     total    labels  img_size
  0% 0/3 [00:05<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 563, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 319, in train
    loss, loss_items = compute_loss(pred, targets.to(device))  # loss scaled by batch_size
  File "/content/yolov7/utils/loss.py", line 120, in __call__
    tcls, tbox, tkpt, indices, anchors = self.build_targets(p, targets)  # targets
  File "/content/yolov7/utils/loss.py", line 207, in build_targets
    t = targets * gain
RuntimeError: The size of tensor a (31) must match the size of tensor b (41) at non-singleton dimension 2
```

So I'll just rely on the author of the repo to help now.

maketo97 commented 2 years ago

@vshesh I also hit the same issue as you. No idea how to modify the gain or targets to match the anchors in the cfg file.

vshesh commented 2 years ago

OK, I fixed that problem: I forgot to change nkpt in the model file when I reloaded the Colab space. Now I have a new issue:

```
Traceback (most recent call last):
  File "train.py", line 563, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 319, in train
    loss, loss_items = compute_loss(pred, targets.to(device))  # loss scaled by batch_size
  File "/content/yolov7/utils/loss.py", line 119, in __call__
    tcls, tbox, tkpt, indices, anchors = self.build_targets(p, targets)  # targets
  File "/content/yolov7/utils/loss.py", line 234, in build_targets
    indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1)))  # image, anchor, grid indices
RuntimeError: result type Float can't be cast to the desired output type long int
```

@maketo97 Do you know more about what the torch.clamp_ function is and why this is happening? I'm surprised by this one, because how does the regular training process work?

maketo97 commented 2 years ago

@vshesh Hmm, you mean changing nkpt in yolov7-w6-pose.yaml? If yes, may I know what number of keypoints you are using currently?

I'm also new to the torch.clamp_ function, but from the log it looks like the result type doesn't fit the desired output type. Maybe try changing the result type from Float to long integer.

By the way, maybe we can exchange contacts for discussion?

maketo97 commented 2 years ago

@vshesh You need to change the original line

```python
indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1)))  # image, anchor, grid indices
```

to this new line to solve the error:

```python
indices.append((b, a, gj.clamp_(0, gain[3] - 1).long(), gi.clamp_(0, gain[2] - 1).long()))  # image, anchor, grid indices
```

vshesh commented 2 years ago

> @vshesh You need to change the original line `indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1)))` to `indices.append((b, a, gj.clamp_(0, gain[3] - 1).long(), gi.clamp_(0, gain[2] - 1).long()))` to solve the error.

ok cool, does training work for you after that?

maketo97 commented 2 years ago

@vshesh Training works, but there is an error with my foot keypoint connections. Sometimes they are drawn properly (Figure 1); sometimes they are misaligned (Figure 2). The last step is to train the model with the correct keypoint combinations, but I have no idea how to solve this error.

[Figure 1: train_batch1]

[Figure 2: train_batch25]

vshesh commented 2 years ago

@maketo97 how did you deal with the sigmas in the loss function?

```python
sigmas = torch.tensor([.26, .25, .25, .35, .35, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89], device=device) / 10.0
```

They are hardcoded to be 17 elements long, and I'm not sure what to replace them with, or why there are different sigmas for different keypoints. Does 0.25 for each work?

maketo97 commented 2 years ago

> @maketo97 how did you deal with the sigmas in the loss function?
>
> `sigmas = torch.tensor([.26, .25, .25, .35, .35, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89], device=device) / 10.0`
>
> They are hardcoded to be 17 elements long, and I'm not sure what to replace them with, or why there are different sigmas for different keypoints. Does 0.25 for each work?

@vshesh For the moment, you can just use 1 for every keypoint. Say you want 18 keypoints: repeat 1 eighteen times.
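For context, those hardcoded values are the per-keypoint sigmas from the COCO OKS metric (facial keypoints are localized more precisely than hips, hence the different values). A minimal sketch of the uniform-sigma workaround suggested above (nkpt and device here are placeholders):

```python
import torch

nkpt = 18        # your custom keypoint count
device = 'cpu'   # placeholder; the repo's loss uses its own device

# Uniform sigmas, one per keypoint, replacing the hardcoded COCO-17 list:
sigmas = torch.ones(nkpt, device=device) / 10.0
```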

YaoBeiji commented 2 years ago

@maketo97 I am also working on body + foot, but YOLO's label format is different from COCO's. How do I generate body+foot labels for YOLO?

maketo97 commented 2 years ago

@YaoBeiji You can use this repository to generate the labels of body + foot dataset for yolo.

nguyenanhtuan1008 commented 2 years ago

> For keypoint detection: merge this into this and modify this for training.
>
> How does one modify the model yaml for training? Is it as simple as adding nkpt: ___ to the top of the file and adding nkpt to the detect line? (That's the main difference I see between the linked yolov5-pose repo and the ultralytics yolov5 base.) Can you also explain what the data format would be in that case? Is it the same as in the yolov5-pose repo? I see class x y width height... there, followed by three numbers per keypoint. What does the third number represent? It seems to take values from 0-2.

@vshesh @maketo97 @WongKinYiu What is the meaning of the third number for each keypoint? 0.000000, 1.000000, 2.000000: could you explain this number in the .txt files?

InnovArul commented 2 years ago

From the line below, it seems [0.0000, 1.0000, 2.0000] describes occlusion. But they are removed from the labels anyway, so I think they don't have any significance in YOLO training?

https://github.com/WongKinYiu/yolov7/blob/cad7acac832fcd4a9c2e09e773050a57761e22b9/utils/datasets.py#L502
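For reference, these flags follow the COCO keypoint convention: 0 = not labeled, 1 = labeled but not visible, 2 = labeled and visible. A sketch of one label row and where the flags sit in it (made-up values, 2 keypoints for brevity):

```python
#       class  xc    yc    w     h     x1    y1   v1  x2    y2   v2
line = "0     0.51  0.43  0.21  0.65  0.50  0.30  2   0.52  0.31  1"
vals = [float(v) for v in line.split()]
cls_id, box, kpts = int(vals[0]), vals[1:5], vals[5:]
vis = kpts[2::3]   # visibility flags: 0 not labeled, 1 occluded, 2 visible
```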

nguyenanhtuan1008 commented 2 years ago

> From the line below, it seems [0.0000, 1.0000, 2.0000] describes occlusion. But they are removed from the labels anyway, so I think they don't have any significance in YOLO training?
>
> https://github.com/WongKinYiu/yolov7/blob/cad7acac832fcd4a9c2e09e773050a57761e22b9/utils/datasets.py#L502

Thank you so much for your reply.

nguyenanhtuan1008 commented 2 years ago

> @vshesh You need to change the original line `indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1)))` to `indices.append((b, a, gj.clamp_(0, gain[3] - 1).long(), gi.clamp_(0, gain[2] - 1).long()))` to solve the error.

@vshesh @maketo97 @InnovArul I have another issue while trying to train on the COCO dataset. I got this error:

```
Exception has occurred: RuntimeError
result type Float can't be cast to the desired output type __int64
  File "F:\yolov7_pose\utils\loss.py", line 236, in build_targets
    indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1)))  # image, anchor, grid indices
  File "F:\yolov7_pose\utils\loss.py", line 120, in __call__
    tcls, tbox, tkpt, indices, anchors = self.build_targets(p, targets)  # targets
  File "F:\yolov7_pose\train.py", line 306, in train
    loss, loss_items = compute_loss(pred, targets.to(device))  # loss scaled by batch_size
  File "F:\yolov7_pose\train.py", line 550, in <module>
    train(hyp, opt, device, tb_writer)
```

I then changed

```python
indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1)))  # image, anchor, grid indices
```

to:

```python
indices.append((b, a, gj.clamp_(0, gain[3] - 1).long(), gi.clamp_(0, gain[2] - 1).long()))  # image, anchor, grid indices
```
But then I got another error:

```
Exception has occurred: RuntimeError
result type Float can't be cast to the desired output type __int64
  File "F:\yolov7_pose\utils\loss.py", line 236, in build_targets
    indices.append((b, a, gj.clamp_(0, gain[3] - 1).long(), gi.clamp_(0, gain[2] - 1)).long())  # image, anchor, grid indices
  File "F:\yolov7_pose\utils\loss.py", line 120, in __call__
    tcls, tbox, tkpt, indices, anchors = self.build_targets(p, targets)  # targets
  File "F:\yolov7_pose\train.py", line 306, in train
    loss, loss_items = compute_loss(pred, targets.to(device))  # loss scaled by batch_size
  File "F:\yolov7_pose\train.py", line 550, in <module>
    train(hyp, opt, device, tb_writer)
```

Any advice?

YaoBeiji commented 2 years ago

[image] @maketo97 Have you solved this problem?

maketo97 commented 2 years ago

@nguyenanhtuan1008

1. Remove the change that was made earlier, i.e. change

   ```python
   indices.append((b, a, gj.clamp_(0, gain[3] - 1).long(), gi.clamp_(0, gain[2] - 1).long()))  # image, anchor, grid indices
   ```

   back to

   ```python
   indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1)))  # image, anchor, grid indices
   ```

2. Add an extra line (`gain = gain.long()`) before line 207, so it reads:

   ```python
   # Match targets to anchors
   gain = gain.long()
   t = targets * gain
   ```
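If it helps to see why this error fires at all: the grid indices gi/gj are int64 tensors while gain stays float, and recent PyTorch refuses the implicit float-to-long cast in in-place ops. A minimal reproduction (assuming PyTorch 1.9+):

```python
import torch

gj = torch.tensor([3, 7, 12])    # int64 grid indices, as in build_targets
bound = torch.tensor(80.0) - 1   # 0-dim float tensor, like gain[3] - 1

# gj.clamp_(0, bound)            # RuntimeError: result type Float can't be
#                                # cast to the desired output type Long
gj.clamp_(0, bound.long())       # fine once the bound is integral, which is
                                 # what gain = gain.long() achieves upstream
```
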
maketo97 commented 2 years ago

> [image] @maketo97 Have you solved this problem?

Still haven't figured out a way...

YaoBeiji commented 2 years ago

@maketo97 I think I know: in datasets.py line 365, set self.flip_index to the left/right correspondence of the keypoints. I have trained with this setting, but the confidence of the foot keypoints is too low, only 0.2 or less, so the positions are shifted. [image]
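For reference, flip_index gives, for each keypoint, the index of its left/right counterpart, so horizontal-flip augmentation can swap them. A sketch for the standard COCO-17 layout (extending to body+foot means appending the swapped indices of each added left/right foot pair):

```python
# COCO-17 order: nose, eyes, ears, shoulders, elbows, wrists, hips, knees,
# ankles (left listed before right in each pair).
flip_index = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
# e.g. index 1 (left eye) swaps with 2 (right eye) when an image is mirrored.
```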

maketo97 commented 2 years ago

@YaoBeiji I think it is possible; I'm not sure whether the low-confidence issue is related to the architecture or not. By the way, could you share your contact via this email address? hii707066@gmail.com. That would make discussion easier.

lufanma commented 2 years ago

> https://github.com/WongKinYiu/yolov7/blob/pose/utils/datasets.py#L496

@vshesh Hi~ I'm also studying how to train yolov7-pose on a customized pose dataset (single-category car keypoint detection), and I'm wondering how to change this repo.

Could you tell me what needs to change besides the yaml file? And how did you set the loss weight for each point?

Thank you very much!!!

tctco commented 2 years ago

Hi, I'm using the model to do customized pose estimation. I've modified the code and the program runs smoothly. The bounding box is okay, but the keypoint preds are not good.

[image]

What could cause the problem?

KAWAKO-in-GAYHUB commented 1 year ago

> https://github.com/WongKinYiu/yolov7/blob/pose/utils/datasets.py#L496
>
> @vshesh Hi~ I'm also studying how to train yolov7-pose on a customized pose dataset (single-category car keypoint detection), and I'm wondering how to change this repo.
>
> Could you tell me what needs to change besides the yaml file? And how did you set the loss weight for each point?
>
> Thank you very much!!!

I have the same question. Have you solved it?

aoqiangma commented 1 year ago

> @vshesh I have tried to change nkpt, but training fails because the code is hardcoded for 17 keypoints.
>
> Ok yeah, I'm seeing the same error now:
>
> `train: WARNING: Ignoring corrupted image and/or label scratch/train/images/img10431471441771711261.png: labels require 56 columns each`
>
> Then it thinks 100% of the data is corrupted and fails to train because there's no data in the cache, with `ValueError: not enough values to unpack (expected 3, got 0)` from utils/datasets.py line 414.
>
> https://github.com/WongKinYiu/yolov7/blob/pose/utils/datasets.py#L496 @maketo97 here's the hardcoding; you can try to change it.

I've been plagued by this problem for a week and haven't solved it. If anyone has solved it, could you share the solution in the comments?

changbaishanwwg commented 1 year ago

> Hi, I'm using the model to do customized pose estimation. I've modified the code and the program runs smoothly. The bounding box is okay, but the keypoint preds are not good.
>
> [image]
>
> What could cause the problem?

@tctco Hello, could you please tell me how to modify the code so that I can train on a custom keypoint-labelled dataset, i.e. with a custom number of keypoints?

tctco commented 1 year ago

@changbaishanwwg It has been some time since I conducted the experiment and I can no longer remember the details, and it is also quite difficult to spell out modifications to the source code here :( I suggest you start from datasets.py, loss.py, plots.py, and also the config file. There may be other places that need customized modifications, and you can find them from the errors raised during training.

Hope you find the information helpful :)

changbaishanwwg commented 1 year ago

@changbaishanwwg It has been some time since I conducted the experiment and I can no longer remember the details, and it is also quite difficult to elaborate modifications to the source code here :( I suggest you start from datasets.py, loss.py, plots.py, and also the config file. There may be some other places where you need some customized modifications, and you could further do that according to errors raised during training.

Hope you find the information helpful :)

@tctco I'm sorry I didn't reply to you in time. Thank you for your kind reply. I will try my best to modify the code according to your suggestions. By the way, did you ever solve the problem of the keypoints falling outside the bounding box? If so, could you please share your thoughts?

tctco commented 1 year ago

@changbaishanwwg It seems the shifted keypoints have been an issue for quite some time and are related to the loss function (the impact of the OKS loss function is somewhat bizarre...). You can find a more detailed discussion in the original repo.

Hope this answer is helpful :)

changbaishanwwg commented 1 year ago

@tctco Your reply has given me great help. Thank you very much.

Wish you success in your future research :)

nomaad42 commented 1 year ago

Hi! @vshesh @maketo97

Have you figured out how to find proper sigma values?

schemaphicGopa commented 10 months ago

```
     Epoch   gpu_mem       box       obj       cls       kpt      kptv     total    labels  img_size
      0/99     1.91G   0.07855     1.293         0    0.9307   0.01172     2.314         2       640:   0%|                          | 0/5 [00:07<?, ?it/s]

Traceback (most recent call last):
  File "D:\project\yolov7-pose-custom-pose\train.py", line 568, in <module>
    train(hyp, opt, device, tb_writer)
  File "D:\project\yolov7-pose-custom-pose\train.py", line 351, in train
    plot_images(imgs, targets, paths, f, kpt_label=kpt_label, nkpt=nkpt)
  File "D:\project\yolov7-pose-custom-pose\utils\plots.py", line 258, in plot_images
    plot_one_box(box, mosaic, label=label, color=color, line_thickness=tl, kpt_label=kpt_label, kpts=kpts[:,j], steps=steps, orig_shape=orig_shape)
  File "D:\project\yolov7-pose-custom-pose\utils\plots.py", line 84, in plot_one_box
    plot_skeleton_kpts(im, kpts, steps, orig_shape=orig_shape)
  File "D:\project\yolov7-pose-custom-pose\utils\plots.py", line 104, in plot_skeleton_kpts
    pose_kpt_color = palette[list(range(num_kpts))]
IndexError: index 4 is out of bounds for axis 0 with size 4
```

How do I solve this?

kangombec commented 9 months ago

> _(same training log and traceback as above, ending in `IndexError: index 4 is out of bounds for axis 0 with size 4` in plot_skeleton_kpts)_
>
> How do I solve this?

Hey did you manage to solve this issue and if so what was the solution?

daddywithaphatty commented 5 months ago

> _(same training log and traceback as above, ending in `IndexError: index 4 is out of bounds for axis 0 with size 4` in plot_skeleton_kpts)_
>
> How do I solve this?
>
> Hey did you manage to solve this issue and if so what was the solution?

4 months late, but I came across a similar issue when I was attempting to create mine. I almost just gave up on it, but I saw these comments and decided to "fuck it and figure that shit out." Even if you've solved your issue, I hope someone else comes across this.

(I recommend you also read the end part of this response as it may save you a lot of pain)

Here are the things I attempted in order to fix it:

  1. Went to the problematic line, found where num_kpts is defined, and added %4+1 to it so it wouldn't go above 4 and end up out of bounds. This doesn't work, because it forces the first 4 keypoints to actually be the last keypoint.
  2. When I realised how awful that solution was, I tried to sort out another way to fix the problem. The issue is the fact that the original python script is designed to handle FOUR keypoints, not more. I modified mine (utils/plots.py) to look like this:

```python
def plot_skeleton_kpts(im, kpts, steps, orig_shape=None,
                       path=r"D:\YOLOv7-POSE-on-Custom-Dataset\final_dataset\skeleton.txt"):
    """Plot the skeleton and keypoints for a custom dataset."""
    # Read the limb list from skeleton.txt (one "a b" pair per line).
    with open(path, 'r') as f:
        text = [line for line in f.read().split('\n') if line.strip()]  # skip blank lines

    skeleton = []
    for i in text:
        j = i.split(' ')
        skeleton.append(j)

    for i in range(len(skeleton)):
        for j in range(len(skeleton[i])):
            skeleton[i][j] = int(skeleton[i][j])

    num_kpts = len(kpts) // steps

    # One colour per keypoint, so the palette can never be indexed out of bounds.
    pal = [[255, 128, 0]]
    for i in range(num_kpts):
        pal.append([255, 128, 0])
    palette = np.array(pal)

    # Original hardcoded COCO palettes, superseded by the generated ones below:
    # pose_limb_color = palette[[9, 9, 9, 9, 7, 7, 7, 0, 0, 0, 0, 0, 16, 16, 16, 16, 16, 16, 16]]
    pose_limb_color = palette[[limb[0] - 1 for limb in skeleton]]
    # pose_kpt_color = palette[[16, 16, 16, 16, 16, 0, 0, 0, 0, 0, 0, 9, 9, 9, 9, 9, 9]]
    # pose_kpt_color = palette[[16, 0, 9, 1]]

    radius = 4
    min_conf = 0.2
    pose_kpt_color = palette[list(range(num_kpts))]

    # Draw the keypoints.
    for kid in range(num_kpts):
        r, g, b = pose_kpt_color[kid]
        x_coord, y_coord = kpts[steps * kid], kpts[steps * kid + 1]
        if not (x_coord % 640 == 0 or y_coord % 640 == 0):
            if steps == 3:
                conf = kpts[steps * kid + 2]
                if conf < min_conf:
                    r, g, b = [255, 0, 0]
                    continue
            cv2.circle(im, (int(x_coord), int(y_coord)), radius,
                       (int(r), int(g), int(b)), -1)

    # Draw the limbs.
    for sk_id, sk in enumerate(skeleton):
        try:
            r, g, b = pose_limb_color[sk_id]
            pos1 = (int(kpts[(sk[0] - 1) * steps]), int(kpts[(sk[0] - 1) * steps + 1]))
            pos2 = (int(kpts[(sk[1] - 1) * steps]), int(kpts[(sk[1] - 1) * steps + 1]))
            if steps == 3:
                conf1 = kpts[(sk[0] - 1) * steps + 2]
                conf2 = kpts[(sk[1] - 1) * steps + 2]
                if conf1 < min_conf or conf2 < min_conf:
                    continue
            if pos1[0] % 640 == 0 or pos1[1] % 640 == 0 or pos1[0] < 0 or pos1[1] < 0:
                continue
            if pos2[0] % 640 == 0 or pos2[1] % 640 == 0 or pos2[0] < 0 or pos2[1] < 0:
                continue
            cv2.line(im, pos1, pos2, (int(r), int(g), int(b)), thickness=2)
        except:
            pass
```

It is important for me to note the following things in this code:

  1. The path argument in the function definition is something I added; it is not in the original plots.py. It simply points to the txt file I use as part of the solution.
  2. The pal variable is simply the colour, in (R, G, B) form, that you want the skeleton lines to be. In the original plots.py it is 4 different colours, forcefully selected. Truthfully, if you wanted, you could have numpy generate a random colour for each joint, but only if you want unicorn-vomit-coloured skeletons. I could have made this a little more optimal, but I didn't want to change the original code too much, so I just append one colour per keypoint to ensure every line in the skeleton has a colour and doesn't throw an error. This point is what solves the issue you have right now, but there are a couple more errors that can happen, so you should read the rest.
  3. The original skeleton list is [[1, 2], [2, 3], [3, 4]]. It specifies which keypoints should be connected: 1 connects to 2, 2 connects to 3, and so on. This is only good if you have 4 keypoints and want them connected sequentially, which most of us don't. To get around that, I created a system to read the keypoint pairs from the skeleton.txt file specified in the function. Each line of skeleton.txt is one limb, space-separated:

     ```
     1 2
     2 3
     5 6
     7 2
     ```

     This will add four instances of a colour to the palette list and ensure the limbs are drawn correctly.
  4. There is a try/except around the skeleton enumeration. This is because it throws some error which, truthfully, I was too lazy to find a good workaround for. A try/except/pass makes the error go away, and it doesn't seem to negatively affect anything.

Now, this is the end part that I said you should definitely read. This will 100% solve your issue. Whichever tutorial you used for your train.py command line will have specified which arguments to use. I only realised this after I did my "solution hunting", which is kind of infuriating, but it's good to know. The error you are facing isn't a problematic error; in fact, it's purely cosmetic. You know when you open your experiment folders and you get to see the batch_0, 1, 2, 3 etc. images? All the function does is plot the keypoints so that they show on those images. You don't actually need it to train the model to detect the keypoints; it exists just so you can see them in the training folder.

If you don't want to read all that: remove the --kpt-label argument from your training command. That forces it not to use any of that skeleton bullshit and solves it. If you want to see the skeleton in the training folder, read everything I mentioned above and follow what I did.
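For what it's worth, a pose-branch training invocation then looks something like this (the dataset yaml, weight path, and sizes are placeholders, and the flag names are reconstructed from this thread and the pose branch, so double-check them against your copy of train.py):

```
python train.py --data data/custom_kpts.yaml --cfg cfg/yolov7-w6-pose.yaml \
    --weights weights/yolov7-w6-person.pt --img-size 960 --batch-size 16 --kpt-label
```

Per the advice above, drop --kpt-label if you only want to silence the plotting error.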