shenweichen / DeepCTR

Easy-to-use, modular and extendible package of deep-learning based CTR models.
https://deepctr-doc.readthedocs.io/en/latest/index.html
Apache License 2.0

Is the values computation in layers.sequence.Transformer correct? (The computation of V in the Transformer seems slightly off) #490

Closed darryyoung closed 1 year ago

darryyoung commented 2 years ago

As we know, the major part of a transformer is Q, K, V, which are always generated from the queries' input and the sequence's input. I'm wondering whether the following two lines in deepctr.layers.sequence.Transformer (lines 534 and 535) are right:

keys = tf.tensordot(keys, self.W_key, axes=(-1, 0))
values = tf.tensordot(keys, self.W_Value, axes=(-1, 0))

After the tensordot of keys with self.W_key is assigned back to keys, keys has already been overwritten (it was the layer's input before this line). Then, when keys is used to compute values, values becomes input · self.W_key · self.W_Value, but I think what we actually need is input · self.W_Value.

Should we swap the order of these two lines?

values = tf.tensordot(keys, self.W_Value, axes=(-1, 0))
keys = tf.tensordot(keys, self.W_key, axes=(-1, 0))
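To make the difference concrete, here is a minimal NumPy sketch mirroring the tf.tensordot calls (the shapes and weight matrices are hypothetical, chosen only for illustration). It shows that with the original line order, values picks up an extra W_key projection:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy shapes: batch=2, seq_len=3, embedding=4 (hypothetical sizes)
inputs = rng.standard_normal((2, 3, 4))
W_key = rng.standard_normal((4, 4))
W_Value = rng.standard_normal((4, 4))

# Original order: keys is overwritten before values is computed,
# so values ends up as inputs @ W_key @ W_Value.
keys = inputs
keys = np.tensordot(keys, W_key, axes=(-1, 0))
values_buggy = np.tensordot(keys, W_Value, axes=(-1, 0))

# Intended computation: values projected straight from the layer input.
values_intended = np.tensordot(inputs, W_Value, axes=(-1, 0))

# The buggy values equal inputs @ W_key @ W_Value, not inputs @ W_Value.
double_projection = np.tensordot(
    np.tensordot(inputs, W_key, axes=(-1, 0)), W_Value, axes=(-1, 0)
)
assert np.allclose(values_buggy, double_projection)
assert not np.allclose(values_buggy, values_intended)
```

With the swapped order, values is computed from the untouched input before keys is reassigned, which matches the standard V = X · W_V projection.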

If this is a deliberate trick for CTR prediction, or my understanding is wrong, please advise.