Hi, as you can see in this line, the decoder layer takes as input the current decoding vector (`decoding`), as well as the latent vector (here called `encoding`), which is the output of the encoder layers.
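For illustration, here is a minimal sketch of that flow. It is not the repository's exact code: the module and argument names are illustrative, assuming a PyTorch implementation.

```python
import torch
import torch.nn as nn

class DecoderLayer(nn.Module):
    """Illustrative decoder layer consuming both `decoding` and `encoding`."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.ReLU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, decoding: torch.Tensor, encoding: torch.Tensor) -> torch.Tensor:
        # Self-attention over the current decoding vector.
        decoding = decoding + self.self_attn(decoding, decoding, decoding)[0]
        # Cross-attention: queries come from the decoder, keys/values from
        # the encoder's latent vector (`encoding`).
        decoding = decoding + self.cross_attn(decoding, encoding, encoding)[0]
        # Position-wise feed-forward block.
        return decoding + self.ff(decoding)
```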
Thank you for your answer, but I can't find the embedding layer in your decoder layer. I can only find the embedding of the encoder layer, in `encoding = self._embedding(x)`.
I have omitted the output embedding layer, as we are not working with language models. Its addition and use in the original paper is inspired by Ofir 2017; I considered it irrelevant for our time series problem.
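To make the asymmetry concrete, here is a hedged sketch (again illustrative, not the repository's code, with hypothetical dimensions): for a time series, the input "embedding" is just a linear projection of the raw features, and the model ends with a plain regression head rather than the tied output embedding a language model would use.

```python
import torch.nn as nn

d_input, d_model, d_output = 8, 64, 1  # hypothetical dimensions

# Input "embedding": a linear projection of the raw time series features.
input_embedding = nn.Linear(d_input, d_model)

# Plain regression head; there is no softmax over a vocabulary, which is
# why the output embedding of Ofir 2017 is omitted here.
output_head = nn.Linear(d_model, d_output)
```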
Hi @maxjcohen, my understanding is that since we are working with time series forecasting here, the input to the decoder layer (here called `decoding`) will be the same input that was fed to the encoder, along with the latent vector (here called `encoding`), which is the output of the encoder layers?
This Transformer was built as a traditional Encoder-Decoder model: the encoder takes as input data for times `t=1:T` and produces a latent vector with dimensions `(T, d_emb)`, which is fed to the decoder. The outputs of the decoder are predictions for times `t=1:T`. This is not a forecasting problem so much as a regression problem for time series.
To answer your question, the output of the encoder is the `encoding` tensor, which is fed directly to the decoder.
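As a shape walkthrough of the description above (stand-in modules and hypothetical sizes, just to make the tensors concrete):

```python
import torch
import torch.nn as nn

B, T, d_input, d_emb, d_output = 32, 100, 8, 64, 1  # hypothetical sizes

# Stand-ins for the encoder/decoder stacks, only to make the shapes concrete.
encoder = nn.Linear(d_input, d_emb)
decoder = nn.Linear(d_emb, d_output)

x = torch.randn(B, T, d_input)  # observations for times t = 1:T
encoding = encoder(x)           # latent vector, shape (B, T, d_emb)
y_hat = decoder(encoding)       # predictions for the same times t = 1:T
print(encoding.shape, y_hat.shape)  # torch.Size([32, 100, 64]) torch.Size([32, 100, 1])
```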
Thank you @maxjcohen for the quick response. This is helpful 😄
Hi, here I come again. The decoder layer of the original Transformer takes as input both the encoder output and `y` (the target sequence). But in your Transformer, I only find that the decoder layer takes the encoder output as input. Why?