transformer (forward pass) #10

slavavs82 commented 4 years ago

Parameters: x (Tensor) – torch.Tensor of shape (batch_size, K, d_input).

What is K?

maxjcohen commented 4 years ago

Hi, K is the time dimension. In the case of the data challenge (see #2), we work on full months with hourly data, so K = 24 * 28 = 672. In the latest examples on this repo, I've switched to weekly predictions, so K = 24 * 7 = 168.

slavavs82 commented 4 years ago

Thank you so much for the answer! I'll clarify, as my English prevents me from understanding correctly. K is the window size? d_input is the data size inside the window?

maxjcohen commented 4 years ago

No, K is the time dimension, for example the number of hours in X. d_input is the dimension of the input X for each time step. For instance, if you had 2 inputs (say outdoor temperature and ac schedule) for an entire week (168 hours), X shape would be (1, 168, 2).
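As an illustration of the shape described above, here is a minimal sketch that builds an input of shape (1, 168, 2). The two channels are filled with made-up placeholder values, and NumPy is used for brevity; converting to a torch.Tensor (for example with `torch.from_numpy`) is left out.

```python
import numpy as np

hours = 24 * 7  # K: one week of hourly time steps

# Two hypothetical input channels with placeholder values (not real data).
rng = np.random.default_rng(0)
outdoor_temperature = rng.normal(15.0, 5.0, size=hours)
ac_schedule = (np.arange(hours) % 24 >= 8).astype(float)

# Stack the channels on the last axis (d_input = 2),
# then add a leading batch axis (batch_size = 1).
x = np.stack([outdoor_temperature, ac_schedule], axis=-1)[np.newaxis, ...]
print(x.shape)  # (1, 168, 2)
```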

We refer to the window size as "attention_size", see here.

slavavs82 commented 4 years ago

Fantastic, but I still don't get it :) My data set is very simple (for the test): series = np.sin(np.arange(0, 1000)). I'm breaking this list down into windows of size 30. I want to pass a window of size 30 to the network and predict the next 5 values. Let the batch size be 1. What shape will x have? What would K be?

maxjcohen commented 4 years ago

In your case:

You don't have to break the list into size windows, the transformer will do it for you using the score, see this line of the MHA block.
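A minimal sketch of how a local attention window can be applied through the score matrix, assuming a mask that blanks out pairs of time steps more than `attention_size` apart before the softmax. The helper name `local_attention_mask` and the toy sizes are illustrative and not taken from the repo; NumPy stands in for PyTorch to keep the example short.

```python
import numpy as np

def local_attention_mask(K, attention_size):
    """Boolean (K, K) mask: True where |i - j| > attention_size (to be hidden)."""
    idx = np.arange(K)
    return np.abs(idx[:, None] - idx[None, :]) > attention_size

K, attention_size = 6, 1
scores = np.zeros((K, K))  # placeholder attention scores (all equal)

# Masked positions get -inf so that they receive zero weight after the softmax.
mask = local_attention_mask(K, attention_size)
masked_scores = np.where(mask, -np.inf, scores)

# Row-wise softmax over the last axis.
weights = np.exp(masked_scores - masked_scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

print(weights[0])  # only time steps 0 and 1 get non-zero weight
```

With this kind of mask the full sequence of length K is passed in once, and each time step only attends to its neighbours, so no manual windowing of the input is needed.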

slavavs82 commented 4 years ago

Dude, I really appreciate it. One last thing: how do I feed a window of size 30 to the network and predict the next 5 values? I'm thinking logically. I have a data set. I want to take some data and teach the network to predict the next data that the network can't see.

slavavs82 commented 4 years ago

I think I misled you myself. series = np.sin(np.arange(0, 1000)) is not a sample. It's a dataset of size 1,000. I want to break this dataset down into samples. Sample size = 30. That's what I called a window. Then I can combine these samples into a batch of 4 (for example). So, I have batch_size = 4, K = 30, d_input = 1. Is that right?

maxjcohen commented 4 years ago

Ok, I think I see where you're going. In that case, your values for batch_size, K and d_input should be good. Unfortunately, the Transformer isn't well suited to predicting future states, as it predicts one single output for each input time step. We discuss this in #5.
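The values confirmed above (batch_size = 4, K = 30, d_input = 1) can be sketched as follows. This assumes non-overlapping samples, which is one reading of the question; overlapping (sliding) windows would work as well. NumPy is used for brevity.

```python
import numpy as np

series = np.sin(np.arange(0, 1000))  # the 1,000-value dataset from the question

K = 30          # time steps per sample
batch_size = 4
d_input = 1     # univariate series: one value per time step

# Cut the series into non-overlapping samples of length K.
# 1000 // 30 = 33 full samples; the trailing 10 values are dropped.
n_samples = len(series) // K
samples = series[: n_samples * K].reshape(n_samples, K, d_input)

batch = samples[:batch_size]  # first batch of 4 samples
print(samples.shape)  # (33, 30, 1)
print(batch.shape)    # (4, 30, 1)
```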

slavavs82 commented 4 years ago

In that case, I have a question. I feed a sample of 30 values to the encoder input. If I want to predict the next 10 values, I feed these 10 values to the decoder input. I have seen this scheme in this document; look at figure 1. Encoder input = T1, T2, T3, T4; decoder input = T4, T5. In my case I want to feed encoder input = T1, T2, ..., T30 and decoder input = T30, T31, ..., T40. Can I do that?

pqy000 commented 4 years ago

Of course, you can try it.
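For reference, the split proposed in the question can be sketched with plain slicing. This only shows how the two inputs would be cut from the series; how (or whether) they can be fed to this repo's Transformer is not covered in the thread, and the variable names are illustrative.

```python
import numpy as np

series = np.sin(np.arange(0, 1000))

n_encoder = 30  # encoder sees T1..T30
n_predict = 10  # values to predict: T31..T40

start = 0  # index of T1 in the series
encoder_input = series[start : start + n_encoder]  # T1..T30

# Decoder input starts at the last encoder value (T30) and runs to T40,
# as in the question, so it holds n_predict + 1 values.
decoder_input = series[start + n_encoder - 1 : start + n_encoder + n_predict]

print(encoder_input.shape)  # (30,)
print(decoder_input.shape)  # (11,)
```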

Lisa-FFY commented 3 years ago

Excuse me, could you please tell me whether there is any relevant paper about this code? I want to study it in depth.

shathaa1983 commented 2 years ago

Hi, I am trying to use a univariate time series dataset. I got this error:

I'd appreciate it if you let me know if your code is suitable for the univariate time series. And how to solve this error? I used the code in this link:


maxjcohen commented 2 years ago

Hi, this seems to be an issue with pandas, as you can see in the last frame of the traceback. Did you try to feed a pandas DataFrame directly to the trainer? You most likely need to modify the dataloader class to match your dataset.

In the future, please open a new issue when discussing new or different problems. Thanks!

shathaa1983 commented 2 years ago

Sorry, I am new to GitHub. Do you have any idea how to modify the data loader class?

maxjcohen commented 2 years ago

I'll answer on the new issue :p closing this one as there is no longer any activity.