Open CangHaiQingYue opened 2 months ago
Thanks for your attention to our work!
To use YOLOV in a detector with multiple anchors, you need to rewrite the feature selection function (https://github.com/YuHengsss/YOLOV/blob/2ea4eb90a44cb3db791c1ac5aac38685ebbc297c/yolox/models/yolovp_msa.py#L307). Concretely, find the foreground proposals and their corresponding features. Note that multiple foreground proposals may correspond to one feature point; in that case, the same feature is repeated multiple times. Because of this, we chose an anchor-free detector for our experiments. However, our strategy should also work with anchor-based detectors. Any attempt is appreciated and we look forward to hearing about your success!
Hi @YuHengsss, thanks for your answer. I solved this by mapping [0, 8040 * 3 - 1] to [0, 8040 - 1], because 3 represents the number of anchors, which means each point is repeated three times.
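The index mapping described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code from the YOLOV repository: the helper names anchor_idx_to_point_idx and gather_features are hypothetical, and it assumes that the n_anchors predictions belonging to one feature point are stored contiguously (point-major order). If the predictions are laid out anchor-major instead, the mapping would be idx % n_points rather than idx // n_anchors.

```python
def anchor_idx_to_point_idx(pred_idx, n_anchors):
    """Map indices over (n_points * n_anchors) predictions to feature-point indices.

    Assumption (not confirmed by the thread): the n_anchors predictions of
    one feature point are contiguous, so integer division recovers the point.
    """
    return [i // n_anchors for i in pred_idx]


def gather_features(features, point_idx):
    """Pick one feature row per selected proposal; the same row may repeat."""
    return [features[i] for i in point_idx]


# Toy example: 4 feature points with 2-dim features, 3 anchors per point.
n_anchors = 3
features = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]
pred_idx = [0, 2, 5, 11]  # indices in [0, 4 * 3 - 1]

point_idx = anchor_idx_to_point_idx(pred_idx, n_anchors)
selected = gather_features(features, point_idx)

print(point_idx)  # [0, 0, 1, 3]
print(selected)   # feature 0 appears twice: two proposals share one point
```

With PyTorch tensors, the same idea would presumably be torch.div(pred_idx, n_anchors, rounding_mode="floor") followed by indexing the feature tensor with the result, so no explicit repetition of the full feature map is needed.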
However, I ran into another problem concerning ref_loss. I didn't see this part in the paper. Could you explain it in detail?
This is a classification refinement loss for video object detection; if you use YOLOV++, there is also an IoU score refinement loss. Both are intended to optimize the classification and confidence of the object after the temporal refinement block. The assignment strategy used in YOLOV for the classification part is here: https://github.com/YuHengsss/YOLOV/blob/2ea4eb90a44cb3db791c1ac5aac38685ebbc297c/yolox/models/yolovp_msa.py#L590C21-L605C1
YOLOV++ updated the label assignment strategy to get better performance. It is somewhat more complex; you can find it at https://github.com/YuHengsss/YOLOV/blob/2ea4eb90a44cb3db791c1ac5aac38685ebbc297c/yolox/models/v_plus_head.py#L1052 and https://github.com/YuHengsss/YOLOV/blob/2ea4eb90a44cb3db791c1ac5aac38685ebbc297c/yolox/models/v_plus_head.py#L447
Hello, thanks for your great job! YOLOX is based on an anchor-free algorithm, and I would like to use YOLOX's ideas in the anchor-based algorithm. Now I have a question that I would like to ask:
Currently, self.n_anchors = 1. For an image with an input size of 1x3x640x640, the shape of features_cls is 1x8040x192, and the values in pred_idx lie in [0, 8039]. This is okay. However, when self.n_anchors = 3, the shape of features_cls is still 1x8040x192, but the values in pred_idx lie in [0, 8040 * 3 - 1], so an error is reported in the function self.find_feature_store. So I would like to ask how to resolve this conflict. It seems impractical to simply repeat features_cls three times. https://github.com/YuHengsss/YOLOV/blob/2ea4eb90a44cb3db791c1ac5aac38685ebbc297c/yolox/models/yolovp_msa.py#L290-L311