ATSS - Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection - Common method +2% mAP@.5

Kyuuki93 commented 4 years ago

Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection

paper https://arxiv.org/pdf/1912.02424.pdf
code https://github.com/sfzhang15/ATSS

Main idea: RetinaNet (anchor-based) and FCOS (anchor-free) get nearly the same results when they use the same strategy to define positive and negative samples.

So, instead of IoU thresholds (anchor-based) or scale ranges (anchor-free), the paper proposes the ATSS strategy.

In short:

select the k anchor boxes whose centers are closest to the center of g (the ground truth) by L2 distance on each pyramid level; for an L-level pyramid this gives k x L candidates
compute the IoU between these candidates and the ground truth g as Dg
compute the mean and standard deviation of Dg as mg and vg
use tg = mg + vg as the threshold for this g instead of a fixed IoU threshold (does this correspond to ignore_thresh = 0.7?)
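
The four steps above can be sketched in plain Python. This is only an illustration of the summary in this comment, not the authors' code: the names `iou`, `center` and `atss_select` are hypothetical, boxes are assumed to be (x1, y1, x2, y2) tuples, and the population standard deviation is assumed for vg (a sample standard deviation would give a slightly higher threshold).

```python
import math
import statistics

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def center(box):
    """Center point of a box given as (x1, y1, x2, y2)."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)

def atss_select(gt, levels, k):
    """Hypothetical helper: pick positives for one ground-truth box.

    gt     -- ground-truth box g
    levels -- list of box lists, one list per pyramid level
    k      -- number of closest boxes taken from each level
    """
    gc = center(gt)
    candidates = []
    for boxes in levels:
        # Step 1: the k boxes whose centers are closest to g (L2 distance).
        by_distance = sorted(boxes, key=lambda b: math.dist(center(b), gc))
        candidates.extend(by_distance[:k])
    # Step 2: IoU between every candidate and g (Dg).
    ious = [iou(gt, c) for c in candidates]
    # Steps 3 and 4: tg = mg + vg (population std is an assumption here).
    tg = statistics.mean(ious) + statistics.pstdev(ious)
    positives = [c for c, v in zip(candidates, ious) if v >= tg]
    return candidates, positives, tg
```

Because tg is recomputed for every ground truth, a box with many well-matching candidates gets a high threshold and a box with few gets a lower one, which is the adaptive part of the method.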

For the number of k, the baseline without ATSS was 37.8%.

Results on other methods

That could be useful, @AlexeyAB

But I just checked the yolov3 paper and code: yolo uses a single thresh to define positive and negative samples, and with truth_thresh = 1.0 there are actually no positive samples at all, am I right? If yes, does the thresh from ATSS correspond to a pair of ignore_thresh and truth_thresh with the same value?

Kyuuki93 commented 4 years ago

There is another paper that tries to deal with the anchor selection problem:

Multiple Anchor Learning for Visual Object Detection
https://arxiv.org/abs/1912.02252

AlexeyAB commented 4 years ago

@Kyuuki93

select the k anchor boxes whose centers are closest to the center of g (the ground truth) by L2 distance on each pyramid level; for an L-level pyramid this gives k x L candidates

As I understand it, this should be "select k detection boxes" instead of "select k anchor boxes", since an anchor has only WxH and doesn't have a center.

But I just checked the yolov3 paper and code: yolo uses a single thresh to define positive and negative samples, and with truth_thresh = 1.0 there are actually no positive samples at all, am I right? If yes, does the thresh from ATSS correspond to a pair of ignore_thresh and truth_thresh with the same value?

For Detections (takes into account x,y,w,h for IoU calculation):

if IoU < ignore_thresh - then objectness will be decreased
if IoU > ignore_thresh - then objectness will not be decreased
if IoU > truth_thresh - then objectness & class_probability will be increased, and x,y,w,h will be adjusted
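
The three rules for detections can be written as a small sketch. The function `detection_update` and its dictionary keys are hypothetical names for illustration only, and `iou` is assumed to be the IoU of one predicted box with a ground truth; this is not the darknet code.

```python
def detection_update(iou, ignore_thresh, truth_thresh):
    """Hypothetical helper: which updates one predicted box receives."""
    return {
        # Rule 1: low-IoU detections are pushed towards "no object".
        # Rule 2 is the complement: above ignore_thresh nothing is decreased.
        "decrease_objectness": iou < ignore_thresh,
        # Rule 3: objectness and class probability go up, x,y,w,h are fitted.
        "train_as_positive": iou > truth_thresh,
    }

# IoU can never exceed 1.0, so with truth_thresh = 1.0 rule 3 never applies.
print(detection_update(0.5, ignore_thresh=0.7, truth_thresh=1.0))
print(detection_update(0.99, ignore_thresh=0.7, truth_thresh=1.0))
```

A detection with IoU between the two thresholds gets neither update, so it is simply ignored by these rules.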

For Anchors (takes into account only w,h for IoU calculation - we consider that the x,y-coordinates are equal between Anchor and Truth):

if IoU > iou_thresh - then objectness & class_probability will be increased, and x,y,w,h will be adjusted
if iou_thresh == 0 (0 by default, which behaves the same as the original repo, where this parameter does not exist) - then only the 1 anchor with the highest IoU (best_iou) will be used
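
A minimal sketch of the anchor rules, read literally from the two lines above. It assumes the anchor and the truth share the same center, so the IoU reduces to a width/height comparison; `wh_iou` and `matched_anchors` are hypothetical names, the anchor sizes in the example are made up, and the real darknet implementation may differ in details.

```python
def wh_iou(w1, h1, w2, h2):
    """IoU of two boxes that share the same center (only w,h differ)."""
    inter = min(w1, w2) * min(h1, h2)
    return inter / (w1 * h1 + w2 * h2 - inter)

def matched_anchors(truth_wh, anchors, iou_thresh=0.0):
    """Hypothetical helper: indices of the anchors trained for one truth."""
    ious = [wh_iou(*truth_wh, *a) for a in anchors]
    if iou_thresh == 0:
        # Only the single anchor with the highest IoU (best_iou) is used.
        best = max(range(len(anchors)), key=ious.__getitem__)
        return [best]
    # Otherwise every anchor above the threshold is used.
    return [i for i, v in enumerate(ious) if v > iou_thresh]

anchors = [(50, 50), (100, 120), (200, 200)]  # made-up sizes in pixels
print(matched_anchors((100, 100), anchors))                  # single best anchor
print(matched_anchors((100, 100), anchors, iou_thresh=0.2))  # several anchors
```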

Kyuuki93 commented 4 years ago

@AlexeyAB

As I understand it, this should be "select k detection boxes" instead of "select k anchor boxes", since an anchor has only WxH and doesn't have a center.

You are right, that was my misunderstanding.

For Detections:

if IoU < ignore_thresh - then objectness will be decreased

if IoU > ignore_thresh - then objectness will not be decreased

if IoU > truth_thresh - then objectness & class_probability will be increased, and x,y,w,h will be adjusted

If truth_thresh = 1.0, the 3rd rule never actually applies, is that right? The yolov3 paper also mentions that dual thresholds didn't give better results in yolo. It seems some experiments are needed.

For Anchors (we consider that the x,y-coordinates are equal between Anchor and Truth):

if IoU > iou_thresh - then objectness & class_probability will be increased, and x,y,w,h will be adjusted

So, by adding iou_thresh, yolo is able to learn from the class_probability and x,y,w,h losses? For example, in yolov3 with mse loss and truth_thresh=1.0 but without iou_thresh, were the class_probability and x,y,w,h losses always 0?

AlexeyAB commented 4 years ago

If truth_thresh = 1.0, the 3rd rule never actually applies, is that right?

Yes.

So, by adding iou_thresh, yolo is able to learn from the class_probability and x,y,w,h losses? For example, in yolov3 with mse loss and truth_thresh=1.0 but without iou_thresh, were the class_probability and x,y,w,h losses always 0?

I supplemented my answer and added the iou_thresh==0 case: https://github.com/AlexeyAB/darknet/issues/4500#issuecomment-564681330

Without iou_thresh, Yolo can correct x,y,w,h,prob,objectness only for the 1 anchor with the best IoU (more than 1 anchor can be corrected only if truth_thresh < 1, by taking into account x,y,w,h for the IoU calculation)
With iou_thresh, Yolo can correct x,y,w,h,prob,objectness for many anchors: all anchors in this layer with IoU > iou_thresh (with x,y treated as equal between anchor and truth for the IoU calculation)
