tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0

Absolute Position Encoding: Why are the two tensors not alternately merged? #1925

Closed davinca closed 1 year ago

davinca commented 1 year ago

https://github.com/tensorflow/tensor2tensor/blob/bafdc1b67730430d38d6ab802cbd51f9d053ba2e/tensor2tensor/layers/common_attention.py#L453

In the original paper, the positional embedding interleaves the two functions channel by channel: [..., sin i, cos i, ...]. The linked code instead concatenates all the sin channels first, followed by all the cos channels.
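
For concreteness, here is a minimal NumPy sketch of the two layouts (the function names are illustrative, not the tensor2tensor API, and both use the paper's 10000^(2i/d) frequency schedule, which differs slightly from the log-spaced timescales in the linked code):

```python
import numpy as np

def interleaved_timing_signal(length, channels):
    """Paper ordering: channels alternate [sin, cos, sin, cos, ...]."""
    positions = np.arange(length)[:, None].astype(float)  # (length, 1)
    dims = np.arange(channels // 2)[None, :]               # (1, channels // 2)
    angles = positions / np.power(10000.0, 2.0 * dims / channels)
    signal = np.empty((length, channels))
    signal[:, 0::2] = np.sin(angles)  # even channels: sin
    signal[:, 1::2] = np.cos(angles)  # odd channels: cos
    return signal

def concatenated_timing_signal(length, channels):
    """tensor2tensor-style ordering: all sin channels, then all cos channels."""
    positions = np.arange(length)[:, None].astype(float)
    dims = np.arange(channels // 2)[None, :]
    angles = positions / np.power(10000.0, 2.0 * dims / channels)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)
```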

martinpopel commented 1 year ago

See #177 and #1591 (and #1677).

davinca commented 1 year ago

They are just different orderings of the same set of channels, so the two variants are theoretically equivalent: any learned projection applied to the embedding can absorb a fixed permutation of its input channels.
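
A quick check (reusing the illustrative functions from the sketch above) confirms that one layout is just a fixed permutation of the other's channels:

```python
length, channels = 8, 16
a = interleaved_timing_signal(length, channels)
b = concatenated_timing_signal(length, channels)
# Moving all even (sin) channels in front of all odd (cos) channels
# turns the interleaved layout into the concatenated one.
perm = np.concatenate([np.arange(0, channels, 2), np.arange(1, channels, 2)])
assert np.allclose(a[:, perm], b)
```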