maxjcohen / transformer

Implementation of Transformer model (originally from Attention is All You Need) applied to Time Series.
https://timeseriestransformer.readthedocs.io/en/latest/
GNU General Public License v3.0

The input dimension??? #57

Closed YangJoy750 closed 2 years ago

YangJoy750 commented 2 years ago

If my input dimension is 3, meaning x.shape = [a, b, c], and my output is [a], what changes do I need to make to fit my data?

maxjcohen commented 2 years ago

Hi, please see the documentation for information regarding the shape of input and output tensors, as well as the examples in the repo.

YangJoy750 commented 2 years ago

Yeah, I have seen it, but when I fit my own data I run into some problems. Could you tell me the input and output dimensions of the original data? (I can't download it, sadly.) Thanks a lot.

maxjcohen commented 2 years ago

As written in the docs, the shape of the input tensor should be (batch_size, K, d_input) (where K is the number of time steps), and the output tensor should be (batch_size, K, d_output).
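As a quick sketch of that shape convention (plain NumPy, shape checks only; the variable names and sizes here are illustrative, not from the repo):

```python
import numpy as np

batch_size, K, d_input, d_output = 32, 20, 5, 1

# Input: a batch of windows, each with K time steps
# and d_input features per step.
x = np.random.rand(batch_size, K, d_input)

# Target: the model predicts d_output values at each
# of the same K time steps (sequence to sequence).
y = np.random.rand(batch_size, K, d_output)

print(x.shape)  # (32, 20, 5)
print(y.shape)  # (32, 20, 1)
```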

YangJoy750 commented 2 years ago

Actually, my input shape is [3000, 20, 5] and my output shape is [1]. I am trying to use the model for time series prediction. [3000, 20, 5] means I have 3000 sequences, each of length 20 with 5 features, and I want to predict tomorrow's value (hence the output shape of 1). What should I change to make it fit?

maxjcohen commented 2 years ago

The Transformer, at least the implementation in this repo, focuses on sequence-to-sequence predictions. In other words, it's probably not a good fit for your data; I would take a look at more traditional architectures such as RNNs, if you haven't already. However, if you really wish to use this Transformer, you could simply add a fully connected layer after the Transformer model, in order to output a single value as in your data.

YangJoy750 commented 2 years ago

Oh, sorry about that: the output dimension is [3000], i.e. each sequence ultimately predicts a single value. Can I regard each output as a sequence of length 1, so that I can use the model?

maxjcohen commented 2 years ago

This Transformer expects both input and output sequences to have the same number of time steps. But again, you can always add a fully connected layer after the Transformer layer in order to predict a single value for each sequence.
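A minimal sketch of that idea in PyTorch. The backbone below is a stand-in (an `nn.GRU`, since the repo's exact Transformer constructor isn't shown in this thread); in practice it would be the Transformer producing `(batch, K, d_model)` outputs, followed by a fully connected head that reduces each sequence to a single value:

```python
import torch
import torch.nn as nn

class SeqToScalar(nn.Module):
    """Wrap a sequence-to-sequence backbone with a fully connected
    head so that each input sequence maps to a single value.

    The GRU here is a placeholder for any backbone that outputs
    (batch, K, d_model); swap in the Transformer from this repo.
    """

    def __init__(self, d_input=5, d_model=16):
        super().__init__()
        self.backbone = nn.GRU(d_input, d_model, batch_first=True)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):
        out, _ = self.backbone(x)            # (batch, K, d_model)
        last = out[:, -1, :]                 # keep the last time step
        return self.head(last).squeeze(-1)   # (batch,)

model = SeqToScalar()
x = torch.randn(3000, 20, 5)   # 3000 sequences, K=20, 5 features
y = model(x)
print(y.shape)                 # torch.Size([3000])
```

Taking only the last time step is one simple pooling choice; averaging over the K steps before the linear layer would work just as well.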

YangJoy750 commented 2 years ago

I see, thanks a lot!

YangJoy750 commented 2 years ago

By the way, if my input shape is [3000, 20, 5] as explained above, then the batch size is 3000, K is 20, and d_input is 5 (the number of features)?

maxjcohen commented 2 years ago

Yes, this is correct.