Closed ys0823 closed 4 years ago
Thanks for your question. Sorry for the late reply; I have been a little busy these days.
- Using the mean offsets of `x` & `y` is just one simple choice. From Figure 2 in the paper, the distributions of `w` & `h` are also similar, so we just use `x` & `y` for simplicity.
- The definition of `regression error` is the offset from the candidate box to the target ground truth box, so I am not sure where the problem is.
- `KE` can be viewed as a percentage ratio of the positives. For example, if `KI=75`, selecting `KE=15` means we record the regression error of the positive sample at `20%`. So choosing `KE` as `8` & `10` means the percentage ratio is around `10%`/`13%`.

I hope this can solve your problem.
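As a rough illustration of the `KE` arithmetic above, here is a minimal sketch (not the repository's code). The helper names `ke_ratio` and `kth_smallest` are hypothetical, and it assumes that "the positive sample at 20%" means the `KE`-th smallest value among roughly `KI` positives.

```python
def ke_ratio(ke: int, ki: int) -> float:
    """Fraction of the positives that the KE-th smallest sample corresponds to."""
    return ke / ki


def kth_smallest(values, k):
    """Return the k-th smallest value (1-indexed), clamped to the list length.

    Assumption: this mirrors 'record the value of the positive sample at
    KE/KI', it is not taken from the DynamicRCNN source.
    """
    ordered = sorted(values)
    k = max(1, min(k, len(ordered)))
    return ordered[k - 1]


KI = 75  # value used in the example above
for ke in (8, 10, 15):
    print(f"KE={ke}: {ke_ratio(ke, KI):.1%} of the positives")
```

With `KI=75`, the ratios are 8/75 ≈ 10.7%, 10/75 ≈ 13.3% and 15/75 = 20%, which matches the "around `10%`/`13%`" and `20%` figures quoted above.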
Thanks for your reply. But I am still puzzled about the definition of regression error.
In the paper `DynamicRCNN`, `Fig.4` analyzes the relationship between `regression error` and `gradient`. The abscissa is `regression error`, which denotes the `x` in `SmoothL1`. But look at the implementation of `SmoothL1` in
https://github.com/hkzhang95/DynamicRCNN/blob/de62c3a4c3c131da9679dbda689d37edbdaaa5a1/dynamic_rcnn/det_opr/loss.py#L15
As I understand it, we compute the `abs` of the distance between `input` and `target`. So the abscissa `regression error` should be the error between the predicted offset and the target offset, not the offset from the candidate box to the target ground truth box.
Otherwise, if we use the offset from the candidate box to the target ground truth box as `regression error`, it won't actually change. It is always a fixed number, because `the offset from the candidate box to the target ground truth box` doesn't change for the same image; it is always related to the target offsets, not to the predicted offsets. The dynamic `beta` won't be really dynamic: it is just the target offsets, whose values are settled beforehand. So, during training, only the different image combinations in each batch produce a different `beta`.
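To make the point about `x` concrete, here is a minimal scalar sketch of the standard Smooth L1 loss with a `beta` parameter. It is written from the usual textbook definition, not copied from the linked `loss.py`, and the function names are hypothetical.

```python
def smooth_l1(pred: float, target: float, beta: float = 1.0) -> float:
    """Standard Smooth L1 loss for one coordinate.

    x is the absolute difference between the predicted offset and the
    target offset, i.e. the quantity on the abscissa of the loss curve.
    """
    x = abs(pred - target)
    if x < beta:
        return 0.5 * x * x / beta  # quadratic region near zero
    return x - 0.5 * beta          # linear region for larger errors


def smooth_l1_grad_magnitude(pred: float, target: float, beta: float = 1.0) -> float:
    """Magnitude of d(loss)/d(pred): x / beta inside the quadratic region, else 1."""
    x = abs(pred - target)
    return x / beta if x < beta else 1.0


# The same error of 0.5 gets a larger gradient once beta shrinks below it.
print(smooth_l1_grad_magnitude(0.5, 0.0, beta=1.0))
print(smooth_l1_grad_magnitude(0.5, 0.0, beta=0.25))
```

In this form the loss only ever sees `abs(pred - target)`, which is why the question distinguishes it from the offset between the candidate box and the ground truth box.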
Sorry for the wrong comment. Actually, I had noticed the misleading wording and changed the expression from `regression error` to `regression label` in Figure 2. However, I forgot to change some other similar expressions in the paper; I will fix this in the next version.
Just for clarification (we use `regression label` to update `beta`):
- `regression error` is the error between the predicted offset and the target offset
- `regression label` is the target offset (termed `regression target` in some papers), which can reflect the localization accuracy

Moreover, if we use `the offset from the candidate box to the target ground truth box` to update `beta`, it is also dynamic, since in the second stage the `candidate box` is the proposal, which is not fixed.
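The distinction can be sketched in a few lines. This is an illustration only: it assumes the common R-CNN box encoding on `(cx, cy, w, h)` boxes without the normalization constants a real implementation may apply, and the box values, the `predicted` tuple, and the helpers `encode` and `xy_statistic` are all made up for the example.

```python
import math


def encode(proposal, gt):
    """Regression label: offset from a proposal to its ground truth box.

    Boxes are (cx, cy, w, h). Assumed standard R-CNN encoding, without
    any mean/std normalization.
    """
    px, py, pw, ph = proposal
    gx, gy, gw, gh = gt
    return ((gx - px) / pw, (gy - py) / ph, math.log(gw / pw), math.log(gh / ph))


def xy_statistic(label):
    """Mean of |dx| and |dy|, one simple way to summarize a label with x & y only."""
    return (abs(label[0]) + abs(label[1])) / 2


gt = (50.0, 50.0, 20.0, 20.0)
proposal_a = (46.0, 52.0, 20.0, 20.0)  # a looser proposal
proposal_b = (49.0, 50.5, 20.0, 20.0)  # a tighter proposal for the same object

label_a = encode(proposal_a, gt)
label_b = encode(proposal_b, gt)

# Regression error: predicted offset vs. target offset (hypothetical prediction).
predicted = (0.15, -0.1, 0.0, 0.0)
error_a = tuple(abs(p - t) for p, t in zip(predicted, label_a))

print(xy_statistic(label_a), xy_statistic(label_b), error_a)
```

The same ground truth box yields different labels for the two proposals, so a statistic computed from the labels moves as the proposals change, even though it never looks at the prediction.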
Thanks for the work DynamicRCNN. I have read the paper "Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training". For the Dynamic SmoothL1 Loss, when calculating the `regression errors`, the source code is
https://github.com/hkzhang95/DynamicRCNN/blob/de62c3a4c3c131da9679dbda689d37edbdaaa5a1/models/zhanghongkai/dynamic_rcnn/coco/dynamic_rcnn_r50_fpn_1x/network.py#L166
But I have some doubts. First, why does it use the mean offsets of `x` and `y` instead of the mean offsets of `x`, `y`, `w`, `h` to calculate the `regression errors`? What's more, the DynamicRCNN paper says it uses the regression error to update `beta`, but when I read the source code, it just uses the ground truth offsets, which seems unreasonable.
Second, the `hyper-parameter` `MODEL.DYNAMIC_RCNN.KE` is set to `8`, `10`, or `15`. Is there a reason for these values? Does it relate to the anchor number or some other parameters? @hkzhang95