| | means the stop-gradient operation rather than abs. Why not check the code yourself, so you can understand it quickly?
For a vector u, |u| is a scalar: it retains only the magnitude and carries no direction.
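For readers following along, this is roughly the shape of the operation under discussion; a minimal sketch in TensorFlow, assuming the loss takes the form u / sg(|u|) · |f(IoU)| with f(IoU) = -log(IoU) as mentioned later in the thread (function names and the epsilon are mine, not the repo's):

```python
import tensorflow as tf

def iou_smooth_l1(pred, target, iou):
    # u: ordinary smooth L1 regression term, per coordinate
    diff = pred - target
    u = tf.where(tf.abs(diff) < 1.0, 0.5 * tf.square(diff), tf.abs(diff) - 0.5)
    # f(IoU) = -log(IoU); assumed already broadcastable to u's shape
    f_iou = -tf.math.log(iou)
    # "|u|" in the paper's notation = stop_gradient(u): it appears in the
    # forward value but is treated as a constant in backward.
    return (u / tf.stop_gradient(u + 1e-9)) * tf.stop_gradient(f_iou)
```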
Thank you for your response.
But the code is quite different from what the paper says, and there may be a misunderstanding. My understanding is that the IoU term should act as the step size when updating the weights, while u/|u| provides the direction.
However, the gradient looks a little weird when I calculate it.
You see, if |u| is regarded as a scalar, then in the gradient shown in the picture below, |u| affects the step size. In other words, the gradient of the u/|u| part is actually not 1 or -1.
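Concretely, writing sg(·) for stop-gradient and assuming the loss has the form discussed here:

$$
L = \frac{u}{\operatorname{sg}(|u|)}\,\big|f(\mathrm{IoU})\big|
\qquad\Rightarrow\qquad
\frac{\partial L}{\partial w} = \frac{\big|f(\mathrm{IoU})\big|}{|u|}\,\frac{\partial u}{\partial w},
$$

so the u/|u| factor contributes 1/|u| in backward rather than ±1, and the effective step is |f(IoU)|/|u| times the plain regression gradient (treating the IoU factor as a constant in backward).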
In any case, my core idea is to let u determine only the direction of the backpropagated gradient, with |f(IoU)|/|u| serving to eliminate the discontinuity of the loss.
It still seems a little weird: a bigger |u| means a bigger loss, but once |u| becomes the denominator, the roles of |u| and |f(IoU)| contradict each other.
For example, when a box is far away from the ground-truth box, both |f(IoU)| (e.g. |-log(IoU)|) and |u| become bigger.
This seems strange to me; what do you think?
Please allow me to explain in Chinese.
Your situation is very rare, because when we select positive samples we only take samples with IoU > 0.5. In the non-boundary case, shouldn't |f(IoU)| and |u| naturally follow the same trend anyway?
It's great to be able to use Chinese!
I was in fact considering the normal case (sorry, I hadn't thought about it from the boundary angle); it's precisely because the two trends coincide that it seems strange.
When the box's offset from the ground truth grows, we should want the update step to grow as well. But when the two trends are the same, they work against each other: as the offset grows, the step size grows slowly, and in some cases it may even shrink. In other words, this scale is not stable.
What do you think?
Then you still have to take the quantity u into account, so an L2 loss might be more suitable: when differentiating, the 1/|u| cancels out, leaving only |f(IoU)|. My original intent was to cancel out the magnitude of u in the boundary case, but as you point out, smooth L1 may not be a good fit, since for large values smooth L1 is just L1.
It seems I've been reasoning purely from the gradient angle; I need some time to understand your intent in designing this function from the boundary angle.
so an L2 loss might be more suitable...
But L2 is also unsuitable from the backprop angle, because it leaves |y-y'| in the denominator. I think simply dropping |u| would do: use an L1 loss, so the backward gradient is just 1 or -1; or introduce a sign function in the backward pass, so that whatever loss is chosen, the backpropagated gradient is ±1.
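A sketch of that sign-function idea (names are mine; `f_iou` stands for whatever forward value is wanted):

```python
import tensorflow as tf

def sign_backward_loss(pred, target, f_iou):
    diff = pred - target
    u = tf.abs(diff)  # L1 term: d u / d pred = sign(diff), i.e. +/-1
    # Straight-through trick: the forward value equals f_iou, but the
    # backward gradient flows only through u, so it is exactly +/-1
    # per coordinate regardless of the loss value.
    return u + tf.stop_gradient(f_iou - u)
```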
@yangxue0827 I do have a question in this regard also:
From my understanding, tf.stop_gradient() prevents the gradient from being calculated. Doesn't that also prevent the weights from being updated?
Best regards and thank you!
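A minimal check of this point (my own sketch): stop_gradient blocks only the path it wraps, so gradients, and hence weight updates, still flow through any unwrapped path:

```python
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x * tf.stop_gradient(x)  # forward: 9.0
g = tape.gradient(y, x)
print(g)  # 3.0: the wrapped factor contributes nothing, the outer x still does
```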
Here's the relevant link.
In the link I argue that the backward gradient will always be 0.
From another angle, if we make |u| non-differentiable (stop its gradient), the gradient will not be 0. But then the gradient of u/|u| is not 1 anymore. @yangxue0827 Could you please help me out? Many thanks!
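A quick numerical check of those two cases (my own sketch, not from the repo):

```python
import tensorflow as tf

u = tf.Variable(2.0)
with tf.GradientTape(persistent=True) as tape:
    a = u / tf.abs(u)                    # plain sign(u): piecewise constant
    b = u / tf.stop_gradient(tf.abs(u))  # |u| treated as a constant in backward

print(tape.gradient(a, u))  # 0.0 -> the "gradient is 0 eternally" case
print(tape.gradient(b, u))  # 0.5 -> 1/|u|, not +/-1
```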