shenweichen / DeepCTR

Easy-to-use, modular and extendible package of deep-learning based CTR models.
https://deepctr-doc.readthedocs.io/en/latest/index.html
Apache License 2.0

sequence.py transformer: duplicate parameter names across multiple layer_norm instances #496

Closed BlackcOVER closed 1 year ago

BlackcOVER commented 1 year ago

The transformer uses layer_norm in several places: `self.att_ln_q = LayerNormalization()`, `self.att_ln_k = LayerNormalization()`, `self.ln = LayerNormalization()`. With the defaults `center=True, scale=True`, will these end up reusing the same set of gamma and beta parameters? Is that a problem? Shouldn't each layer_norm learn its own parameters?
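For context, a minimal sketch of the Keras semantics the question hinges on, using `tf.keras.layers.LayerNormalization` as a stand-in for DeepCTR's custom `LayerNormalization`: trainable weights belong to a layer *instance*, and parameters are only shared when the same instance is called more than once.

```python
import tensorflow as tf

x = tf.zeros((2, 4, 8))

# Separate instances (the pattern quoted above): each owns its own gamma/beta.
ln_a = tf.keras.layers.LayerNormalization()
ln_b = tf.keras.layers.LayerNormalization()
_ = ln_a(x)
_ = ln_b(x)
print(len(ln_a.weights + ln_b.weights))  # 4 -> two independent (gamma, beta) pairs

# Actual sharing only happens when the *same* instance is called repeatedly.
shared = tf.keras.layers.LayerNormalization()
_ = shared(x)
_ = shared(x)
print(len(shared.weights))  # 2 -> one (gamma, beta) pair reused across both calls
```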

shenweichen commented 1 year ago

These layers are separate, independent instances, so there is no parameter sharing between them.
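The same holds even when a custom layer's `build()` registers its weights under repeated literal names such as `'gamma'` and `'beta'`; Keras scopes each variable to its own layer instance. A minimal sketch with a simplified stand-in (not DeepCTR's actual `LayerNormalization` implementation):

```python
import tensorflow as tf

class SimpleLayerNorm(tf.keras.layers.Layer):
    """Simplified stand-in; the real class lives in deepctr/layers/normalization.py."""
    def __init__(self, eps=1e-9, **kwargs):
        super(SimpleLayerNorm, self).__init__(**kwargs)
        self.eps = eps

    def build(self, input_shape):
        shape = (int(input_shape[-1]),)
        # Every instance uses the literal names 'gamma' and 'beta',
        # but each call to add_weight creates a new variable owned by this instance.
        self.gamma = self.add_weight(name='gamma', shape=shape, initializer='ones')
        self.beta = self.add_weight(name='beta', shape=shape, initializer='zeros')
        super(SimpleLayerNorm, self).build(input_shape)

    def call(self, x):
        mean = tf.reduce_mean(x, axis=-1, keepdims=True)
        var = tf.reduce_mean(tf.square(x - mean), axis=-1, keepdims=True)
        return self.gamma * (x - mean) / tf.sqrt(var + self.eps) + self.beta

# Mirrors att_ln_q / att_ln_k / ln in the Transformer layer.
att_ln_q, att_ln_k, ln = SimpleLayerNorm(), SimpleLayerNorm(), SimpleLayerNorm()
x = tf.zeros((2, 4, 8))
_ = ln(att_ln_q(x) + att_ln_k(x))  # triggers build() on all three instances

weights = att_ln_q.weights + att_ln_k.weights + ln.weights
print(len({id(w) for w in weights}))  # 6 distinct variables -> no parameter sharing
```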