Question about the code implementation

SmartLi8 / stella

text embedding

Apache License 2.0


Open amulil opened 11 months ago

amulil commented 11 months ago

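What is the purpose of `* in_batch_ratio` here? The cross_entropy result would not change. My understanding is that the author originally wanted to scale the negatives, but here the positives and negatives are scaled together, which amounts to no scaling at all?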

DunZhang commented 10 months ago

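@amulil Hi, in_batch_ratio scales the scores so that the gap between them widens; without it, training is hard to converge. Based on the literature and my own experience, the larger the bsz (batch size), the larger in_batch_ratio needs to be. See the E5 report for reference.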
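
A minimal sketch of the point above, not the repository's code: softmax cross-entropy is unchanged when a constant is added to every score, but it does change when every score is multiplied by the same factor. The sketch assumes the scores are cosine similarities (bounded in [-1, 1]); the similarity values, the `ratio` values, and the helper `cross_entropy` are all hypothetical and chosen only for illustration.

```python
import math


def cross_entropy(scores, target):
    """Softmax cross-entropy for one query; `target` is the index of the positive."""
    m = max(scores)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[target]


# Hypothetical cosine similarities: index 0 is the positive,
# the remaining entries stand in for in-batch negatives.
sims = [0.8, 0.6, 0.5, 0.3]

# Multiplying all scores by the same factor changes the loss.
for ratio in (1.0, 10.0, 50.0):  # illustrative values, not from the repository
    scaled = [ratio * s for s in sims]
    print(f"ratio={ratio:>5}: loss={cross_entropy(scaled, 0):.6f}")

# Adding the same constant to all scores does not change the loss.
shifted = [s + 5.0 for s in sims]
assert abs(cross_entropy(shifted, 0) - cross_entropy(sims, 0)) < 1e-9
```

Under the cosine-similarity assumption, unscaled scores differ by at most 2, so with N candidates the loss cannot fall below log(1 + (N - 1) * e^-2) even for a perfect ranking. That floor grows with the batch size, which is consistent with the advice to raise the scale as bsz increases.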

HuXiLiFeng commented 3 weeks ago

Does it play a role similar to a temperature coefficient?
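
If in_batch_ratio multiplies every similarity score before the softmax cross-entropy, as the thread describes, the relation to a temperature can be written out directly. The symbols below are notation introduced here, not taken from the repository: $s_{ij}$ is the score between query $i$ and candidate $j$, $r$ is in_batch_ratio, $N$ is the number of candidates, and $\tau$ is the temperature.

```latex
\mathcal{L}_i
  = -\log \frac{\exp(r\, s_{ii})}{\sum_{j=1}^{N} \exp(r\, s_{ij})}
  = -\log \frac{\exp(s_{ii} / \tau)}{\sum_{j=1}^{N} \exp(s_{ij} / \tau)},
\qquad \tau = \frac{1}{r}.
```

Under that reading, multiplying by $r$ is the same as dividing by a temperature of $1/r$, so a larger in_batch_ratio corresponds to a lower temperature and a sharper softmax.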