xiaoboCASIA / SV-X-Softmax


do i need to modify softmax function? #3

Open maryhh opened 5 years ago

maryhh commented 5 years ago

Hi, I modified the fc7 layer following ArcFace. Do I still need to modify the softmax function?

new_zy = mx.sym.where(cond, new_zy, zy_keep)  # > 0: s*cos(theta+m); < 0: cut
diff = new_zy - zy  # used to replace the ground-truth (gt) position with new_zy
diff = mx.sym.expand_dims(diff, 1)

new_zy = mx.sym.expand_dims(new_zy, 1)
gt_one_hot_down = mx.sym.one_hot(gt_label, depth = args.num_classes, on_value = 0.0, off_value = 1.0)
gt_zy = mx.sym.broadcast_mul(gt_one_hot_down,fc7)  # cos values of the non-gt classes

gt_greater = mx.sym.broadcast_greater(gt_zy,new_zy)  # label indices where the non-gt value is greater than the gt value
gt_lesser_than = mx.sym.broadcast_lesser_equal(gt_zy,new_zy)  # label indices where the non-gt value is less than or equal to the gt value
gt_greater_mul = gt_greater*t  # non-gt positions * t
gt_greater_mul = mx.sym.broadcast_add(gt_greater_mul,gt_lesser_than)  # then add gt_lesser_than, so it can be multiplied with gt_zy
fc7 = mx.sym.broadcast_mul(fc7,gt_greater_mul)  # multiply the corresponding positions by t
gt_greater_add = gt_greater*(t-1)  # the corresponding positions become t-1
fc7 = mx.sym.broadcast_add(fc7,gt_greater_add)  # add t-1 at the corresponding positions

gt_one_hot = mx.sym.one_hot(gt_label, depth = args.num_classes, on_value = 1.0, off_value = 0.0)
body = mx.sym.broadcast_mul(gt_one_hot, diff)
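fc7 = fc7+body

david-di commented 5 years ago

What do you mean by "do I still need to modify softmax"? Do you mean the SoftmaxOutput after fc7? That does not need to change. But does the implementation you posted above run correctly? I'm new to MXNet, and what I wrote is a bit different from yours.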

maryhh commented 5 years ago

OK. It runs, but the results are not good. Why don't you post yours so we can take a look? I'm pretty new to this too.

david-di commented 5 years ago

Mine has only just started training. So far the test-set accuracy trend looks right, at least no worse than ArcFace. Let me check the results first and post once I'm sure it works, so I don't mislead anyone. I added 7 lines directly after fc7=fc7+body; I probably didn't fully understand your code, which is why it looked different to me.

david-di commented 5 years ago

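@maryhh The code you posted seems to follow this idea: for the support-vector part, t cos(\theta) + (t-1); for the rest, cos(\theta). You first multiply by t and then add t-1. There seems to be a bug, though I'm not sure whether I've misunderstood or it really is one. The full formula should be s (t * cos(\theta) + (t-1)). fc7 has already been multiplied by s, so the multiply-by-t part is fine; but when adding (t-1), did you forget to multiply by s? As a result, hard samples can end up with a negative boost.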
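
To make the point about the missing s concrete, here is a minimal plain-Python sketch that compares the two variants for a single hard (support-vector) class. The function name `sv_logit` and the values s = 64, t = 1.2 and cos(\theta) = -0.1 are illustrative assumptions, not taken from the code above. Algebraically, the unscaled variant differs from the plain logit s cos(\theta) by (t-1)(s cos(\theta) + 1), which is negative whenever cos(\theta) < -1/s.

```python
def sv_logit(cos_theta, s, t, scale_offset=True):
    """Re-weighted logit for one hard (support-vector) class.

    scale_offset=True  -> s * (t * cos_theta + (t - 1))
    scale_offset=False -> s * t * cos_theta + (t - 1)   (offset not scaled by s)
    """
    offset = (t - 1.0) * (s if scale_offset else 1.0)
    return s * t * cos_theta + offset


# Illustrative values only (assumptions for this sketch).
s, t = 64.0, 1.2
cos_theta = -0.1

plain = s * cos_theta                                      # logit without re-weighting
unscaled = sv_logit(cos_theta, s, t, scale_offset=False)   # (t - 1) added without s
scaled = sv_logit(cos_theta, s, t, scale_offset=True)      # full formula

print("plain:", plain)
print("unscaled offset:", unscaled)
print("scaled offset:", scaled)
```

With these values the unscaled variant lands below the plain logit, so the hard class is de-emphasized rather than emphasized, while the scaled variant lands above it.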

haoxintong commented 5 years ago

I implemented it with MXNet Gluon; feel free to try it: code link. It supports SV-Sphere/Arc/AM.

ysc703 commented 5 years ago

@haoxintong Nice work! It seems that it has been merged into gluon-face!

haoxintong commented 5 years ago

@ysc703 This is just part of gluon-face's routine updates. I haven't had GPU resources lately, though, so I haven't run a full training to verify it.

fuxuliu commented 4 years ago

@david-di Hi, I implemented a version, but it seems to have some bugs. If you don't mind, could you share your MXNet version of the SV-X-Softmax code here? Thank you so much.

david-di commented 4 years ago

@Gary-Deeplearning Sorry, I cannot release any source code on the internet because of my company's regulations. Maybe you could paste your code here, and I will try to review it.

fuxuliu commented 4 years ago

@david-di Oh, sure. Could you share your email address here? I will try to contact you by email.

fuxuliu commented 4 years ago

@david-di Hi, I have just pasted my code here. Please check whether there are any errors.

......
fc7 = fc7+body
cos_theta_m = mx.sym.pick(fc7, gt_label, axis=1)
cos_theta_m = mx.sym.expand_dims(cos_theta_m, 1)
cos_theta_m = mx.sym.broadcast_mul(mx.sym.ones_like(fc7), cos_theta_m)
mask = fc7 < cos_theta_m
hard_vector = mask * fc7

hard_vector = (temp + 1.0) * hard_vector + temp

hard_vector = temp * hard_vector + s*(temp - 1.0)
hard_vector = hard_vector * mask
fc7 = fc7 * gt_one_hot + hard_vector

david-di commented 4 years ago

@Gary-Deeplearning I think the last line of your code has a mistake: gt_one_hot is the indicator of the ground truth, whereas what you need is an indicator of all non-hard instances. I think the last two lines could be replaced with the following line.

fc7 = fc7 * (1 - mask) + hard_vector * mask

By the way, your code could be optimized to reduce the computational load.
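
As a rough illustration of the suggested blend `fc7 * (1 - mask) + hard_vector * mask`, here is a plain-Python sketch for a single sample. The helper name `reweight_row` and the numeric values are hypothetical. It also assumes that a class counts as hard when its logit is greater than the ground-truth logit, following the `broadcast_greater` call in the first post, which is the opposite direction to the `fc7 < cos_theta_m` mask in the pasted snippet.

```python
def reweight_row(logits, gt_index, s, t):
    """Blend re-weighted hard logits with untouched ones for one sample.

    logits: list of s * cos(theta_k), with the margin already applied at
            gt_index (i.e. one row of fc7 after `fc7 = fc7 + body`).
    Assumption for this sketch: a class is "hard" when its logit is
    strictly greater than the ground-truth logit.
    """
    gt_logit = logits[gt_index]
    out = []
    for z in logits:
        mask = 1.0 if z > gt_logit else 0.0          # the gt itself is never hard
        hard = t * z + s * (t - 1.0)                 # s * (t * cos + (t - 1))
        out.append(z * (1.0 - mask) + hard * mask)   # same blend as the one-liner
    return out


# Illustrative values only (assumptions for this sketch).
row = [32.0, 40.0, -10.0]   # class 1 scores above the ground truth (class 0)
result = reweight_row(row, gt_index=0, s=64.0, t=1.2)
print(result)
```

Only the logit of class 1 changes here; the ground-truth logit and the easy class pass through untouched, which is what the `(1 - mask)` term is for.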

fuxuliu commented 4 years ago

@david-di Oh my, I see it now; that was an error. I will try your fix later. Thank you. And yes, the code is messy; I wrote it in just a few minutes.

konioyxgq commented 4 years ago

I think you are wrong. Can you post the code above "fc7 = fc7+body", so that I can confirm or rule out my idea? @Gary-Deeplearning @david-di

konioyxgq commented 4 years ago

I'm not sure we both have the same code above "fc7 = fc7+body". @Gary-Deeplearning @david-di