Closed · mathematicsofpaul closed this 3 years ago
Hi Max,
I think I am tackling a very similar problem to Paul's. I was wondering if you could briefly answer the questions above.
Thank you in advance.
Hey guys, half of these questions are answered in the doc:

- We are working with multivariate data for both input and output.
- The embedding layer is replaced by a generic linear layer.
- Three positional encodings are available: None (nothing is added), "original" (same as the original paper), and "regular", which is a sinusoidal function with a day/night period (see code here). This last one is the only one I've developed.

I currently don't have the time to detail either the doc or the readme, feel free to send a PR if you have some ideas!
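For concreteness, a "regular" (fixed-period) sinusoidal positional encoding could look roughly like this. This is a hedged NumPy sketch, not the repository's actual code; the function name, the `period` parameter, and the even/odd sine/cosine split are my assumptions:

```python
import numpy as np

def regular_pe(seq_len, d_model, period=24):
    """Illustrative sinusoidal positional encoding with a fixed period
    (e.g. 24 steps for hourly data with a day/night cycle).
    Sketch only; the repo's implementation may differ."""
    pe = np.zeros((seq_len, d_model))
    pos = np.arange(seq_len)[:, None]             # (seq_len, 1) positions
    pe[:, 0::2] = np.sin(2 * np.pi * pos / period)  # even dims: sine
    pe[:, 1::2] = np.cos(2 * np.pi * pos / period)  # odd dims: cosine
    return pe

pe = regular_pe(48, 8, period=24)
# The encoding repeats every `period` steps:
assert np.allclose(pe[0], pe[24])
```

Unlike the original paper's encoding, whose wavelength varies per dimension, every dimension here shares one period, so the model sees the same phase information at the same time of day.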
For your implementation, what were the dimensions of your input and output? In other words, can I feed a multivariate time series into this model? And if I do, should I expect a multivariate output, i.e. input dimensions [4,10] and output dimensions [3,10]? If the inputs were multivariate, I assume that you used a neural network to embed that "information" into some kind of space analogous to word embeddings?
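The shape question above can be sketched with a plain linear "embedding" in place of word embeddings. This is a hedged NumPy illustration, not the repo's code; the (time, features) layout and `d_model = 16` are assumptions I made for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_in, d_out, d_model = 10, 4, 3, 16

x = rng.normal(size=(seq_len, d_in))        # multivariate input, e.g. 10 steps x 4 features
W_in = rng.normal(size=(d_in, d_model))     # linear layer replacing the word-embedding lookup
W_out = rng.normal(size=(d_model, d_out))   # linear output head

h = x @ W_in    # (10, 16): each time step embedded into model space
y = h @ W_out   # (10, 3): multivariate output with a different feature count
assert y.shape == (seq_len, d_out)
```

The point is that nothing ties the input and output feature counts together: the input projection and the output head are independent linear maps around the Transformer body.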
In the documentation, could you say more about what type of positional embedding you used? You mention a "regular" version; could you provide more detail?
"A window is applied on the attention map to limit backward attention, and focus on short term patterns." Since encoders and decoders can have different mask, where specifically did you apply the window?