Closed xzxorb closed 2 months ago
Hello @xzxorb, thank you for your interest in Ultralytics YOLOv8! We recommend a visit to the Docs for new users, where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.
If this is a Bug Report, please provide a minimum reproducible example to help us debug it.
If this is a custom training Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.
Join the vibrant Ultralytics Discord community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.
Pip install the ultralytics package, including all requirements, in a Python>=3.8 environment with PyTorch>=1.8:
pip install ultralytics
YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.
Hello!
Thanks for reaching out with your question! In YOLOv8 pose estimation, handling multiple classes with varying numbers of keypoints typically involves a workaround, as the model architecture expects a single fixed number of keypoints shared by all classes.
One approach is to modify the model to dynamically adjust the number of output keypoints based on the class detected. This would involve significant changes to the model's architecture and possibly the training pipeline.
Alternatively, you can standardize the number of keypoints across all classes by defining a maximum number of keypoints that covers every class. Then, for classes with fewer keypoints, you can pad the remaining slots with neutral or background keypoints, ensuring they do not influence the loss during training. Adjusting the loss calculation to ignore these padded keypoints should help improve the model's performance.
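On the labeling side, one way to standardize is to pad each instance's keypoint list up to the shared maximum before training, marking padded entries as not visible. A minimal sketch; `pad_keypoints` and the `(x, y, visibility)` tuple layout are illustrative, not an Ultralytics API:

```python
def pad_keypoints(kpts, max_kpts):
    """Pad an instance's (x, y, visibility) keypoints up to max_kpts.

    Padded entries get visibility 0, so a masked loss can skip them.
    """
    padded = list(kpts)
    while len(padded) < max_kpts:
        padded.append((0.0, 0.0, 0))  # padded keypoint: zero coords, not visible
    return padded

# A class with a single real keypoint, padded to a 3-keypoint maximum
print(pad_keypoints([(0.5, 0.5, 2)], 3))
```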
Here's a snippet idea on how you might consider adjusting the loss function to handle the padding better:

```python
import torch

def custom_loss(y_pred, y_true):
    # Assuming y_true uses 0s for padded keypoints
    mask = (y_true != 0).float()
    loss = (y_pred - y_true) ** 2 * mask  # only compute loss for real keypoints
    return loss.sum() / mask.sum().clamp(min=1)  # guard against an all-padded batch
```
Remember, these suggestions require fine-tuning and testing. Let us know how it goes or if you need more specific guidance!
Good luck with your project!
Thanks for your help, but I still don't know how to modify the loss function in the code. I think the original loss function looks like this, which suggests a mask is already applied inside the loss:
```python
if masks.any():
    gt_kpt = selected_keypoints[masks]
    area = xyxy2xywh(target_bboxes[masks])[:, 2:].prod(1, keepdim=True)
    pred_kpt = pred_kpts[masks]
    kpt_mask = gt_kpt[..., 2] != 0 if gt_kpt.shape[-1] == 3 else torch.full_like(gt_kpt[..., 0], True)
    kpts_loss = self.keypoint_loss(pred_kpt, gt_kpt, kpt_mask, area)  # pose loss

    if pred_kpt.shape[-1] == 3:
        kpts_obj_loss = self.bce_pose(pred_kpt[..., 2], kpt_mask.float())  # keypoint obj loss

return kpts_loss, kpts_obj_loss
```
```python
class KeypointLoss(nn.Module):
    def __init__(self, sigmas) -> None:
        super().__init__()
        self.sigmas = sigmas

    def forward(self, pred_kpts, gt_kpts, kpt_mask, area):
        d = (pred_kpts[..., 0] - gt_kpts[..., 0]).pow(2) + (pred_kpts[..., 1] - gt_kpts[..., 1]).pow(2)
        kpt_loss_factor = kpt_mask.shape[1] / (torch.sum(kpt_mask != 0, dim=1) + 1e-9)
        # e = d / (2 * (area * self.sigmas) ** 2 + 1e-9)  # from formula
        e = d / ((2 * self.sigmas).pow(2) * (area + 1e-9) * 2)  # from cocoeval
        return (kpt_loss_factor.view(-1, 1) * ((1 - torch.exp(-e)) * kpt_mask)).mean()
```
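To see how kpt_mask already suppresses padded keypoints in this loss, the class can be exercised standalone on dummy tensors. The sigma values, shapes, and mask below are illustrative only, not COCO defaults:

```python
import torch
import torch.nn as nn

class KeypointLoss(nn.Module):
    """OKS-style keypoint loss, as in the excerpt above."""

    def __init__(self, sigmas) -> None:
        super().__init__()
        self.sigmas = sigmas

    def forward(self, pred_kpts, gt_kpts, kpt_mask, area):
        d = (pred_kpts[..., 0] - gt_kpts[..., 0]).pow(2) + (pred_kpts[..., 1] - gt_kpts[..., 1]).pow(2)
        kpt_loss_factor = kpt_mask.shape[1] / (torch.sum(kpt_mask != 0, dim=1) + 1e-9)
        e = d / ((2 * self.sigmas).pow(2) * (area + 1e-9) * 2)  # from cocoeval
        return (kpt_loss_factor.view(-1, 1) * ((1 - torch.exp(-e)) * kpt_mask)).mean()

# One instance with 4 keypoint slots; the last slot is padding (mask = 0)
loss_fn = KeypointLoss(sigmas=torch.full((4,), 0.05))
gt = torch.tensor([[[0.2, 0.3], [0.5, 0.5], [0.7, 0.6], [0.0, 0.0]]])  # (1, 4, 2)
mask = torch.tensor([[1.0, 1.0, 1.0, 0.0]])
area = torch.tensor([[0.5]])

pred = gt.clone()
pred[0, 3] += 10.0  # large error, but only on the masked (padded) keypoint
print(loss_fn(pred, gt, mask, area).item())  # → 0.0: the padded slot is ignored
```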
Hello!
Thanks for sharing the details of your current loss function implementation. From what I see, you're already employing a masking strategy to selectively adjust the loss computation based on the presence of keypoints (defined by kpt_mask).
If you need to further customize the behavior for varying numbers of keypoints across different classes, consider adjusting how kpt_mask is computed. For instance, if you have a set maximum number of keypoints and certain keypoints aren't applicable for some classes, ensure those locations in kpt_mask are always zeroed out, so they don't contribute to the loss.
Here's a brief idea you might try:

```python
kpt_mask = create_dynamic_mask(gt_kpt, max_keypoints, class_ids)
```

In this hypothetical function create_dynamic_mask, you'd dynamically generate masks based on the class, ensuring keypoints that aren't applicable are ignored in the loss computation.
The goal here is to make sure the loss calculation focuses only on relevant keypoints for each class. Let me know if this helps, or if there's a specific area in the code modification that is unclear!
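For completeness, here is one way create_dynamic_mask could be sketched. Both this helper and the KPTS_PER_CLASS table are hypothetical, not part of Ultralytics:

```python
import torch

# Hypothetical per-class keypoint counts, indexed by class id (illustrative)
KPTS_PER_CLASS = {0: 17, 1: 9}

def create_dynamic_mask(gt_kpt, max_keypoints, class_ids):
    """Build an (N, max_keypoints) float mask: 1 for a class's real keypoint
    slots, 0 for padded slots beyond that class's true keypoint count.

    Slots whose visibility flag is 0 are also zeroed, mirroring kpt_mask."""
    n = gt_kpt.shape[0]
    counts = torch.tensor([KPTS_PER_CLASS[int(c)] for c in class_ids])
    slot = torch.arange(max_keypoints).unsqueeze(0).expand(n, -1)  # (N, max_keypoints)
    mask = (slot < counts.unsqueeze(1)).float()
    if gt_kpt.shape[-1] == 3:  # labels carry a visibility channel
        mask = mask * (gt_kpt[..., 2] != 0).float()
    return mask

# Two instances padded to 17 slots: class 0 keeps all 17, class 1 keeps 9
gt_kpt = torch.ones(2, 17, 3)
mask = create_dynamic_mask(gt_kpt, 17, class_ids=[0, 1])
print(mask.sum(dim=1))  # per-instance count of unmasked keypoints
```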
Thank you, I will try this later.
You're welcome! If you have any more questions or need further assistance after trying it out, feel free to reach out. Happy coding!
Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO and Vision AI!
Search before asking
Question
Hi! I am using the yolov8-pose model for keypoint detection. I want to detect multiple classes where each class has a different number of keypoints, but the model requires the number of keypoints to be the same for all classes. I have tried padding with 0 for classes with fewer keypoints, but the results are very bad. How can I modify the code to support a different number of keypoints per class and process them respectively? Could you give me some advice? Thank you very much.
Additional
No response