CasiaFan / tensorflow_retinanet

RetinaNet with Focal Loss implemented by Tensorflow

Some questions about Focal Loss (alpha_t) #2

Open yzldw333 opened 6 years ago

yzldw333 commented 6 years ago

alpha_t = tf.where(tf.equal(onehot_labels, 1.0), alpha_t, 1 - alpha_t)

Hi, I read the Focal Loss paper today and didn't find any explicit formula for alpha_t, but I see you compute it as above. In my opinion this line has no effect, because the entries where alpha_t becomes 1 - alpha_t only ever multiply loss values that are already zero. Am I right?

CasiaFan commented 6 years ago

@yzldw333 In Section 3.1 of the focal loss paper, the authors say:

For notational convenience, we define αt analogously to how we defined pt

Before that line, alpha_t = tf.scalar_mul(alpha, tf.ones_like(onehot_labels, dtype=tf.float32)) is executed first, so alpha_t starts as a tensor filled with alpha (0.25 by default), and 1 - alpha_t is 0.75. The tf.where then keeps alpha at the positive (one-hot == 1) positions and substitutes 1 - alpha everywhere else, which is the alpha_t the paper defines.
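To make the whole construction concrete, here is a minimal self-contained sketch of how those two lines combine with the rest of the loss, assuming the per-class sigmoid formulation from the paper (the function name and the final sum reduction are my own choices, not necessarily this repo's exact code):

```python
import tensorflow as tf

def focal_loss(onehot_labels, logits, alpha=0.25, gamma=2.0):
    # p_t: the model's probability for the ground-truth outcome at each
    # position, using the per-class sigmoid formulation of the paper.
    probs = tf.sigmoid(logits)
    p_t = tf.where(tf.equal(onehot_labels, 1.0), probs, 1.0 - probs)
    # alpha_t: alpha at positive positions, 1 - alpha elsewhere (Sec. 3.1).
    alpha_t = tf.scalar_mul(alpha, tf.ones_like(onehot_labels, dtype=tf.float32))
    alpha_t = tf.where(tf.equal(onehot_labels, 1.0), alpha_t, 1.0 - alpha_t)
    # FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t)
    p_t = tf.clip_by_value(p_t, 1e-8, 1.0)
    return tf.reduce_sum(-alpha_t * tf.pow(1.0 - p_t, gamma) * tf.math.log(p_t))
```

In this formulation the loss is nonzero at every position, so both branches of the tf.where genuinely contribute.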

BTW, this project is still under development, so there may be bugs that haven't been found yet. Issue reports are very welcome!

yzldw333 commented 6 years ago

I'll switch to Chinese. I tried adding this loss to the training of a 3D-convolution video model and found that convergence seems to have sped up.

CasiaFan commented 6 years ago

@yzldw333 This loss is very useful when the ratio of easy to hard training examples is badly skewed: it amplifies the influence of the hard examples, so convergence does indeed speed up.
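As a quick numeric illustration of that effect, here is just the modulating factor (1 - p_t)^gamma from the paper, with its default gamma = 2 (the values of p_t are made up for illustration):

```python
gamma = 2.0
for p_t in (0.9, 0.5, 0.1):
    print(f"p_t={p_t}: modulating factor {(1 - p_t) ** gamma:.2f}")
# p_t=0.9: modulating factor 0.01  -> easy example, loss cut ~100x
# p_t=0.5: modulating factor 0.25
# p_t=0.1: modulating factor 0.81  -> hard example, nearly full weight
```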

dongdongrj commented 6 years ago

@CasiaFan @yzldw333 I'll use Chinese too. Can this demo run on a CPU?

CasiaFan commented 6 years ago

@dongdongrj This demo still has bugs at the moment, so for now you may want to use another library such as keras-retinanet instead.

Also, I don't recommend running it on a CPU; a detection model computed on CPU alone is quite slow.

dongdongrj commented 6 years ago

@CasiaFan Thanks. Could we add each other on WeChat to exchange ideas and learn? WeChat ID: huangdd_521

zzzzzz0407 commented 6 years ago

I also feel that the alpha_t setting doesn't actually do anything. The purpose of (1 - alpha_t) is to enlarge the coefficient in front of the negative samples. But in your case, after the matrix operations, the loss tensor that gets multiplied by this coefficient only holds a value at the position corresponding to each sample's one-hot label; every other position is already zero. So after multiplying by your coefficient, every value at the one-hot positions is necessarily scaled by alpha_t, and there is no way to distinguish the positives' alpha_t from the negatives' 1 - alpha_t.
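A toy sketch of the point being made here, assuming the base loss is softmax cross entropy masked by the one-hot labels (the tensors are made-up values for illustration, not taken from this repo):

```python
import tensorflow as tf

onehot = tf.constant([[0.0, 1.0, 0.0]])
logits = tf.constant([[2.0, 0.5, -1.0]])
alpha = 0.25

# Cross entropy is nonzero only at the one-hot position.
ce = -onehot * tf.math.log(tf.nn.softmax(logits))
alpha_t = tf.where(tf.equal(onehot, 1.0),
                   alpha * tf.ones_like(onehot),
                   (1.0 - alpha) * tf.ones_like(onehot))
# The (1 - alpha) branch only ever multiplies entries that are already
# zero, so every surviving loss value is scaled by alpha -- positives
# and negatives end up weighted identically.
print((alpha_t * ce).numpy())
```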