Closed wanghlsara closed 1 year ago
Hello,
Apologies. If you look at the other open issues, it seems this is a recurring problem that I have yet to solve. I can allocate some time this weekend to look into the issue further. Could you potentially share more frames from your test video? At first I thought the issue was primarily in the plotting functions from `detect.py`, since I usually obtain good predictions from `test.py`, which uses `plot_mosaic`. But if your test sample suggests instability in the predictions, the mapping may not be the only issue.
I am a bit busy right now, but will definitely revisit this issue this weekend. Thanks.
Thanks for your reply! Hope you find the following information useful.

1. The source video to be detected: https://drive.google.com/file/d/1jcEPNzSPXloFTOYpLRISRwue-pHvzUrp/view?usp=sharing
Hello. Currently still in the debugging process. I think the issue may be in the augmentations phase, but I am still working on it. If your task is a bit more time sensitive, I would recommend this repo: https://github.com/qinggangwu/yolov7-pose_Npoint_Ncla/tree/master. Perhaps you can take the YOLO conversion script from my repo and use it in the other repo. Will update this thread as I continue to debug.
Thanks for your reply. I found some hard-coded values in `utils\general.py`; for example, the magic numbers 56 and 57 may only work with the COCO 17 body-keypoint model:
```python
if nc is None:
    nc = prediction.shape[2] - 5 if not kptlabel else prediction.shape[2] - 56  # number of classes
output = [torch.zeros((0, 57), device=prediction.device)] * prediction.shape[0]
```
After I replaced these numbers to match the 133 whole-body keypoints, the results seem more accurate and more stable.
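For reference, the constants can be derived instead of hard-coded: each keypoint contributes an (x, y, conf) triple, so 56 = 5 (box + objectness) + 3 × 17, and 57 adds one class column. A minimal sketch of that arithmetic is below; the helper name `nms_dims` and its parameters are my own illustration, not an API from this repo.

```python
def nms_dims(nkpt: int, box_obj: int = 5):
    """Return (kpt_offset, out_width) for a pose head with `nkpt` keypoints.

    The prediction row is box(4) + objectness(1) + 3 * nkpt keypoint values
    (plus class scores); the NMS output row adds one class column.
    """
    kpt_offset = box_obj + 3 * nkpt  # 56 when nkpt == 17
    out_width = kpt_offset + 1       # 57 when nkpt == 17
    return kpt_offset, out_width

print(nms_dims(17))   # COCO 17 body keypoints -> (56, 57)
print(nms_dims(133))  # COCO-WholeBody 133 keypoints -> (404, 405)
```

Computing the offsets this way would let the same NMS code serve both the 17-keypoint and 133-keypoint models.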
Thanks for the catch!
Good job! When I ran `detect.py` using your pretrained model file `yolov7-tiny-baseline.pt`, I got weird results, as shown in the following picture. In addition, the degree to which the keypoints deviate from the ground truth varies from frame to frame in my test video; that is, the keypoint results are not stable. What could be the problem?