Open stiansel opened 2 days ago
Hi,
Thanks for raising this issue! I'm also trying to figure out why convergence is slower than in the stable version. Yesterday, I made some changes in commit fd5413f77d03f91b48eebba7dc1b98582bee93ad. If you have time, feel free to take a look.
Here’s a summary of the changes:
I believe some of these changes align with what you mentioned in this issue. Moving forward, I'll add no_grad to CIoU and implement logic to filter out extremely small bounding boxes. I’ll also retrain the model to check if the convergence issue still persists.
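A minimal sketch of the small-box filter mentioned above, in plain Python rather than tensor code. The function name `filter_small_boxes`, the `(x1, y1, x2, y2)` box layout, and the `min_size` threshold are all assumptions for illustration and are not taken from the repository.

```python
def filter_small_boxes(boxes, min_size=2.0):
    """Drop boxes narrower or shorter than min_size (hypothetical threshold, in pixels).

    boxes: list of (x1, y1, x2, y2) tuples (assumed layout).
    Returns the kept boxes and their original indices, so that class
    labels can be filtered with the same indices.
    """
    kept_boxes, kept_indices = [], []
    for index, (x1, y1, x2, y2) in enumerate(boxes):
        width, height = x2 - x1, y2 - y1
        if width >= min_size and height >= min_size:
            kept_boxes.append((x1, y1, x2, y2))
            kept_indices.append(index)
    return kept_boxes, kept_indices


boxes = [(0, 0, 10, 10), (5, 5, 5.5, 20), (3, 3, 3, 3)]
kept_boxes, kept_indices = filter_small_boxes(boxes)
```

Only the first box survives here: the second is too narrow and the third is degenerate. If every box is filtered out, the sample would need to go through the same path as the no-box case.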
The changes so far changed the loss by only about 0.001, so I suspect the issue may still exist. Let’s continue troubleshooting together!
Best regards, Henry Tsui
I saw the note in the readme about slower convergence and thought I'd try to help. These are the potential issues I've seen, though there may be others as well.
BoxMatcher:
When align_cls is normalized, neither target_matrix nor iou_mat has been adjusted to account for the topk mask or for which GTs actually won the duplicate-resolution step. As a result, some assignments get different align_cls values than in other YOLO implementations.
Duplicate assignments in YOLO MIT are filtered based on the full cost matrix. Duplicate resolution in other single-shot detector variants appears to use the (C)IoU cost only. I'm not sure how this affects training.
While the no-box case was fixed in #88, there might still be a rare issue if there are target boxes but all of them are too small to overlap with any anchor. This may not matter unless you use custom datasets or loaders.
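A minimal sketch of the matcher points above, in plain Python lists rather than the repository's tensor code. The names target_matrix and iou_mat follow the issue text; the function names, the `[gt][anchor]` layout, and the exact normalization formula are assumptions for illustration, not the YOLO MIT implementation.

```python
def resolve_duplicates_by_iou(mask, iou_mat):
    """For each anchor claimed by several GTs, keep only the GT with the highest IoU."""
    num_gt = len(mask)
    num_anchors = len(mask[0]) if mask else 0
    final_mask = [[False] * num_anchors for _ in range(num_gt)]
    for a in range(num_anchors):
        candidates = [g for g in range(num_gt) if mask[g][a]]
        if candidates:
            winner = max(candidates, key=lambda g: iou_mat[g][a])
            final_mask[winner][a] = True
    return final_mask


def masked_normalize(target_matrix, iou_mat, final_mask, eps=1e-9):
    """Per-anchor alignment score, with maxima taken over final positives only."""
    num_anchors = len(final_mask[0]) if final_mask else 0
    scores = [0.0] * num_anchors
    for g, row in enumerate(final_mask):
        positives = [a for a in range(num_anchors) if row[a]]
        if not positives:
            # This GT won no anchors (for example, too small to overlap any).
            continue
        max_target = max(target_matrix[g][a] for a in positives)
        max_iou = max(iou_mat[g][a] for a in positives)
        for a in positives:
            scores[a] = target_matrix[g][a] / (max_target + eps) * max_iou
    return scores


# Two GTs, three anchors; both GTs pass the topk step for anchors 0 and 1.
iou_mat = [[0.8, 0.6, 0.0], [0.5, 0.7, 0.0]]
target_matrix = [[0.4, 0.3, 0.0], [0.45, 0.35, 0.0]]
topk_mask = [[True, True, False], [True, True, False]]

final_mask = resolve_duplicates_by_iou(topk_mask, iou_mat)
scores = masked_normalize(target_matrix, iou_mat, final_mask)
```

In this toy input, anchor 0 goes to GT 0 when duplicates are resolved by IoU, whereas picking the largest target_matrix entry would give it to GT 1, so the two rules can produce different assignments. A GT that ends up with no positive anchors is skipped, and its anchors keep a score of zero instead of dividing by an empty maximum.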
Loss: