philipperemy / keras-tcn

Keras Temporal Convolutional Network.
MIT License

What does 'timestep' in the input dimension mean? #227

Closed qq492947833 closed 2 years ago

qq492947833 commented 2 years ago

I'm really sorry, I'm a novice in machine learning. When I use TCN, the input dimension contains 'timestep', and I can't understand the meaning of this dimension. In my experience, the two parameters 'dilations' and 'kernel_size' should control the prediction window. For example, if 'dilations' is [1, 2, 4] and 'kernel_size' is 2, that means the previous 1-7 input variables are used to predict the current variable. So what is the role of 'timestep' in the input dimension? If I want to predict the current variable from the previous 1-7 input variables, should 'timestep' be set to 8? If you can answer, I will be very grateful!


philipperemy commented 2 years ago

@qq492947833 refer to this tutorial: https://www.tensorflow.org/guide/keras/rnn.

Schematically, an RNN layer uses a for loop to iterate over the timesteps of a sequence, while maintaining an internal state that encodes information about the timesteps it has seen so far.

Timesteps is the time dimension.

Any recurrent layer in Keras expects a 3D input of shape (batch_size, timesteps, input_dim): batch_size is the number of sequences per batch, timesteps is the length of each sequence, and input_dim is the number of features observed at each step.

You don't need to care about how the TCN works internally, especially if you are a novice in ML. You can easily switch between TCN, LSTM and GRU.
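To make that shape convention concrete, here is a small NumPy sketch (an assumed univariate setup, not code from this repo) that slices a series into windows of timesteps = 8, matching the "previous 1-7 inputs predict the current one" example from the question:

```python
import numpy as np

# A univariate series of 100 observations.
series = np.arange(100, dtype=np.float32)

# To predict each value from the 8 observations before it, slice the
# series into overlapping windows of length timesteps = 8.
timesteps = 8
windows = np.stack([series[i:i + timesteps]
                    for i in range(len(series) - timesteps)])

# Keras recurrent/TCN layers expect a 3D input: (batch_size, timesteps, input_dim).
# For a univariate series, input_dim is 1.
x = windows[..., np.newaxis]   # shape: (92, 8, 1)
y = series[timesteps:]         # the value right after each window

print(x.shape)  # (92, 8, 1)
```

The same (batch_size, timesteps, input_dim) array can then be fed to a TCN, LSTM or GRU layer interchangeably.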

qq492947833 commented 2 years ago

Thank you very much for your answer. I have one last question: is the receptive field in TCN redundant with the timesteps? Can a TCN with timesteps of 8 and a TCN with a receptive field of 8 get the same result (since there is no concept of a receptive field in LSTM and RNN)?


philipperemy commented 2 years ago

@qq492947833 in LSTM and RNN there is the concept of memory. The bigger the LSTM/RNN, the more memory it has, which means the further back in the sequence it can remember. But most LSTMs/RNNs can't remember more than about 50 steps back in time: by construction, the gradient propagation vanishes a little at every step (look up BPTT, backpropagation through time, the gradient-based technique used to train recurrent neural networks).

With a TCN you can control how far back in time the network can see; this is called the receptive field. And because the TCN uses dilated convolutions, information from far back in the sequence can be recovered very quickly.
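To make the dilation mechanics concrete, here is a toy NumPy re-implementation of stacked causal dilated convolutions (a sketch, not keras-tcn's actual code). Feeding an impulse through the [1, 2, 4] / kernel_size = 2 setup from the question shows exactly how many future outputs one input can influence:

```python
import numpy as np

def causal_dilated_conv(x, kernel, dilation):
    """1D causal conv: output[t] depends only on x[t], x[t-d], x[t-2d], ..."""
    y = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        for j in range(len(kernel)):
            idx = t - j * dilation
            if idx >= 0:
                y[t] += kernel[j] * x[idx]
    return y

# Stack three layers with dilations 1, 2, 4 and kernel_size 2,
# mirroring the example from the question.
x = np.zeros(16)
x[0] = 1.0            # an impulse at t = 0
ones = np.ones(2)     # all-ones kernel, just to trace dependencies
h = x
for d in [1, 2, 4]:
    h = causal_dilated_conv(h, ones, d)

# The impulse influences outputs up to t = (2-1)*(1+2+4) = 7, i.e. a
# receptive field of 8 steps: the current step plus 7 past steps.
print(np.nonzero(h)[0])  # [0 1 2 3 4 5 6 7]
```

Note the causality: no output earlier than t = 0 is affected, and nothing beyond the receptive field is.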

In practice your receptive field should always be bigger than the length of your longest sequence; otherwise the TCN will not see the entire sequence when making its prediction.

The number of timesteps is defined by your data: it is given by your input_shape, and the TCN has no control over it. The receptive field, on the other hand, is determined by the parameters you pass to the TCN's constructor (in the init), and those you can control.
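Putting numbers on that, here is a minimal sketch of the receptive-field arithmetic, assuming one causal convolution of width kernel_size per dilation (keras-tcn's residual blocks actually contain two convolutions each, so the real figure can be larger; check the README's formula for your version):

```python
def receptive_field(kernel_size, dilations, nb_stacks=1):
    # Each dilated conv of width k and dilation d extends the view
    # by (k - 1) * d past steps; stacked layers add up.
    return 1 + nb_stacks * (kernel_size - 1) * sum(dilations)

# The example from the question: dilations [1, 2, 4], kernel_size 2
# -> the current step plus the 7 previous ones, i.e. 8.
print(receptive_field(2, [1, 2, 4]))  # 8
```

So a TCN whose receptive field is 8 and whose input has 8 timesteps sees the whole window, which is the situation the question describes.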

qq492947833 commented 2 years ago

I am very grateful to you! I wish you smooth research and a happy life!


philipperemy commented 2 years ago

@qq492947833 likewise thank you very much!