Hi, as you can see in this line, the decoder layer takes as input the current decoding vector (`decoding`), as well as the latent vector (here called `encoding`), which is the output of the encoder layers.
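For illustration, here is a minimal sketch of that flow. It is not the repository's exact code: the module and argument names are illustrative, assuming a PyTorch implementation.

```python
import torch
import torch.nn as nn

class DecoderLayer(nn.Module):
    """Illustrative decoder layer consuming both `decoding` and `encoding`."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.ReLU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, decoding: torch.Tensor, encoding: torch.Tensor) -> torch.Tensor:
        # Self-attention over the current decoding vector.
        decoding = decoding + self.self_attn(decoding, decoding, decoding)[0]
        # Cross-attention: queries come from the decoder, keys/values from
        # the encoder's latent vector (`encoding`).
        decoding = decoding + self.cross_attn(decoding, encoding, encoding)[0]
        # Position-wise feed-forward block.
        return decoding + self.ff(decoding)
```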
Thank you for your answer, but I can't find the embedding layer in your decoder layer. I can only find the embedding of the encoder layer, in `encoding = self._embedding(x)`.
I have omitted the output embedding layer, as we are not working with language models. Its addition and use in the original paper is inspired by Ofir 2017; I considered it irrelevant for our time series problem.
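To make the asymmetry concrete, here is a hedged sketch (again illustrative, not the repository's code, with hypothetical dimensions): for a time series, the input "embedding" is just a linear projection of the raw features, and the model ends with a plain regression head rather than the tied output embedding a language model would use.

```python
import torch.nn as nn

d_input, d_model, d_output = 8, 64, 1  # hypothetical dimensions

# Input "embedding": a linear projection of the raw time series features.
input_embedding = nn.Linear(d_input, d_model)

# Plain regression head; there is no softmax over a vocabulary, which is
# why the output embedding of Ofir 2017 is omitted here.
output_head = nn.Linear(d_model, d_output)
```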
Hi @maxjcohen, my understanding is that since we are working with time series forecasting here, the input to the decoder layer (here called `decoding`) will be the same input that was fed to the encoder, along with the latent vector (here called `encoding`), which is the output of the encoder layers?
This Transformer was built as a traditional Encoder-Decoder model: the encoder takes as input data for times `t=1:T` and produces a latent vector with dimensions `(T, d_emb)`, which is fed to the decoder. The outputs of the decoder are predictions for times `t=1:T`. This is not a forecasting problem so much as a regression problem for time series.
To answer your question, the output of the encoder is the `encoding` tensor, which is fed directly to the decoder.
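As a shape walkthrough of the description above (stand-in modules and hypothetical sizes, just to make the tensors concrete):

```python
import torch
import torch.nn as nn

B, T, d_input, d_emb, d_output = 32, 100, 8, 64, 1  # hypothetical sizes

# Stand-ins for the encoder/decoder stacks, only to make the shapes concrete.
encoder = nn.Linear(d_input, d_emb)
decoder = nn.Linear(d_emb, d_output)

x = torch.randn(B, T, d_input)  # observations for times t = 1:T
encoding = encoder(x)           # latent vector, shape (B, T, d_emb)
y_hat = decoder(encoding)       # predictions for the same times t = 1:T
print(encoding.shape, y_hat.shape)  # torch.Size([32, 100, 64]) torch.Size([32, 100, 1])
```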
Thank you @maxjcohen for the quick response. This is helpful 😄
Hi, here I come again. The decoder layer of the original Transformer takes as input both the encoder output and `y` (the target sequence). But in your Transformer, I only find that the decoder layer takes the encoder output as input. Why?