maxjcohen / transformer

Implementation of the Transformer model (from "Attention Is All You Need") applied to time series.
https://timeseriestransformer.readthedocs.io/en/latest/
GNU General Public License v3.0

Position Encoding #62

Closed · Sharp-rookie closed this issue 1 year ago

Sharp-rookie commented 1 year ago

The exponent in the second (cosine) term should start at 0, and the base 1000 should be changed to 10000.
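For reference, the paper defines both terms with the same exponent (starting at 0) and base 10000:

PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))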

tst/utils.py

```python
def generate_original_PE(length: int, d_model: int) -> torch.Tensor:
    ...
    pos = torch.arange(length).unsqueeze(1)
    PE[:, 0::2] = torch.sin(pos / torch.pow(1000, torch.arange(0, d_model, 2, dtype=torch.float32)/d_model))
    PE[:, 1::2] = torch.cos(pos / torch.pow(1000, torch.arange(1, d_model, 2, dtype=torch.float32)/d_model))
```
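A minimal corrected sketch with those two changes applied, assuming the elided setup allocates `PE` as a zero tensor of shape `(length, d_model)` and that `d_model` is even:

```python
import torch

def generate_original_PE(length: int, d_model: int) -> torch.Tensor:
    # Assumed setup for the elided part: pre-allocate the encoding matrix.
    PE = torch.zeros((length, d_model))
    pos = torch.arange(length, dtype=torch.float32).unsqueeze(1)
    # Same exponent (starting at 0) and base 10000 for both sin and cos,
    # matching the formulation in "Attention Is All You Need".
    div = torch.pow(10000, torch.arange(0, d_model, 2, dtype=torch.float32) / d_model)
    PE[:, 0::2] = torch.sin(pos / div)
    PE[:, 1::2] = torch.cos(pos / div)
    return PE
```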
maxjcohen commented 1 year ago

Hi, thanks for noticing, I've fixed the implementation. Note that these positional encodings from the original paper were not actually used in our experiments.