CasiaFan / tensorflow_retinanet

RetinaNet with Focal Loss implemented by Tensorflow

Some questions about Focal Loss (alpha_t) #2

Open yzldw333 opened 6 years ago

yzldw333 commented 6 years ago

alpha_t = tf.where(tf.equal(onehot_labels, 1.0), alpha_t, 1 - alpha_t)

Hi, I read the Focal Loss paper today and didn't find any explicit formula for alpha_t, but I see you compute it as above. In my opinion this line has no effect, because the entries where alpha_t becomes 1 - alpha_t only ever multiply loss values that are already zero. Am I right?

CasiaFan commented 6 years ago

@yzldw333 In Section 3.1 of the focal loss paper, the authors say:

For notational convenience, we define αt analogously to how we defined pt

Before that line, alpha_t = tf.scalar_mul(alpha, tf.ones_like(onehot_labels, dtype=tf.float32)) is executed first, so alpha_t starts as a tensor filled with alpha (0.25 by default), and 1 - alpha_t is 0.75. The tf.where then keeps alpha at the positive (one-hot == 1) positions and substitutes 1 - alpha everywhere else, which is the alpha_t the paper defines.
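To make the whole construction concrete, here is a minimal self-contained sketch of how those two lines combine with the rest of the loss, assuming the per-class sigmoid formulation from the paper (the function name and the final sum reduction are my own choices, not necessarily this repo's exact code):

```python
import tensorflow as tf

def focal_loss(onehot_labels, logits, alpha=0.25, gamma=2.0):
    # p_t: the model's probability for the ground-truth outcome at each
    # position, using the per-class sigmoid formulation of the paper.
    probs = tf.sigmoid(logits)
    p_t = tf.where(tf.equal(onehot_labels, 1.0), probs, 1.0 - probs)
    # alpha_t: alpha at positive positions, 1 - alpha elsewhere (Sec. 3.1).
    alpha_t = tf.scalar_mul(alpha, tf.ones_like(onehot_labels, dtype=tf.float32))
    alpha_t = tf.where(tf.equal(onehot_labels, 1.0), alpha_t, 1.0 - alpha_t)
    # FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t)
    p_t = tf.clip_by_value(p_t, 1e-8, 1.0)
    return tf.reduce_sum(-alpha_t * tf.pow(1.0 - p_t, gamma) * tf.math.log(p_t))
```

In this formulation the loss is nonzero at every position, so both branches of the tf.where genuinely contribute.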

BTW, this project is still under development, so there may be bugs that haven't been found yet. Issue reports are very welcome!

yzldw333 commented 6 years ago

I'll switch to Chinese. I tried adding this loss to the training of a 3D-convolution video model and found that convergence seems to have sped up.

CasiaFan commented 6 years ago

@yzldw333 This loss is very useful when the ratio of easy to hard training examples is badly skewed: it amplifies the influence of the hard examples, so convergence does indeed speed up.
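As a quick numeric illustration of that effect, here is just the modulating factor (1 - p_t)^gamma from the paper, with its default gamma = 2 (the values of p_t are made up for illustration):

```python
gamma = 2.0
for p_t in (0.9, 0.5, 0.1):
    print(f"p_t={p_t}: modulating factor {(1 - p_t) ** gamma:.2f}")
# p_t=0.9: modulating factor 0.01  -> easy example, loss cut ~100x
# p_t=0.5: modulating factor 0.25
# p_t=0.1: modulating factor 0.81  -> hard example, nearly full weight
```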

dongdongrj commented 6 years ago

@CasiaFan @yzldw333 I'll use Chinese too. Can this demo run on a CPU?

CasiaFan commented 6 years ago

@dongdongrj This demo still has bugs at the moment, so for now you may want to use another library such as keras-retinanet instead.

Also, I don't recommend running it on a CPU; a detection model computed on CPU alone is quite slow.

dongdongrj commented 6 years ago

@CasiaFan Thanks. Could we add each other on WeChat to exchange ideas and learn? WeChat ID: huangdd_521

zzzzzz0407 commented 6 years ago

I also feel that the alpha_t setting doesn't actually do anything. The purpose of (1 - alpha_t) is to enlarge the coefficient in front of the negative samples. But in your case, after the matrix operations, the loss tensor that gets multiplied by this coefficient only holds a value at the position corresponding to each sample's one-hot label; every other position is already zero. So after multiplying by your coefficient, every value at the one-hot positions is necessarily scaled by alpha_t, and there is no way to distinguish the positives' alpha_t from the negatives' 1 - alpha_t.
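A toy sketch of the point being made here, assuming the base loss is softmax cross entropy masked by the one-hot labels (the tensors are made-up values for illustration, not taken from this repo):

```python
import tensorflow as tf

onehot = tf.constant([[0.0, 1.0, 0.0]])
logits = tf.constant([[2.0, 0.5, -1.0]])
alpha = 0.25

# Cross entropy is nonzero only at the one-hot position.
ce = -onehot * tf.math.log(tf.nn.softmax(logits))
alpha_t = tf.where(tf.equal(onehot, 1.0),
                   alpha * tf.ones_like(onehot),
                   (1.0 - alpha) * tf.ones_like(onehot))
# The (1 - alpha) branch only ever multiplies entries that are already
# zero, so every surviving loss value is scaled by alpha -- positives
# and negatives end up weighted identically.
print((alpha_t * ce).numpy())
```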