need clarification about paper and implementation

locuslab / TCN

Sequence modeling benchmarks and temporal convolutional networks

https://github.com/locuslab/TCN

MIT License

4.13k stars 875 forks

need clarification about paper and implementation #53

Closed HilmiiKumdakci closed 4 years ago

HilmiiKumdakci commented 4 years ago

Hi,

According to the paper, the input and output should have the same length, and the output at time t depends on previous values of the input. But in the implementation of the adding problem, the input has size 2*T (where T is 200, 300, 400, etc.), while the output is just a scalar. What is the explanation for this?

jerrybai1995 commented 4 years ago

Hi,

The input is represented as a Tx2 tensor; in the output layer (which has length T), we only take the last time step to produce a scalar.
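To make the shapes concrete, here is a minimal sketch of one adding-problem example in plain Python. It assumes the usual formulation of the task: one channel of random values, one marker channel with exactly two ones, and a target equal to the sum of the two marked values. The function name `make_adding_example` and the way the marked positions are sampled are illustrative assumptions, not the repository's code.

```python
import random


def make_adding_example(T, rng):
    """Return (x, y): x is a list of T [value, marker] pairs, y is a scalar."""
    values = [rng.random() for _ in range(T)]
    markers = [0.0] * T
    # Assumption: any two distinct positions may be marked.
    i, j = rng.sample(range(T), 2)
    markers[i] = markers[j] = 1.0
    x = [[v, m] for v, m in zip(values, markers)]  # T rows, 2 columns
    y = values[i] + values[j]                      # scalar target
    return x, y


x, y = make_adding_example(200, random.Random(0))
print(len(x), len(x[0]))  # 200 2
```

So the input carries 2*T numbers in total, while the target is a single number per sequence.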

HilmiiKumdakci commented 4 years ago

Thanks for your explanation.

What I do not understand is this: the number of classes is 1, so doesn't that mean the output length is equal to 1? If not, where do you take the last time step to produce the scalar value for the adding problem?

I would expect the ground truth to be something like an array of size seq_length filled with zeros, with the sum value appended to it. To match that size, I would append one zero to the end of the input as well.

jerrybai1995 commented 4 years ago

Having one class means the output width (the number of channels) is 1 and its length is 1. However, a TCN outputs a tensor with the same length as the input, so we take only the last time step of this full-length output, using x[:,:,-1].
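As a rough single-channel illustration of this point, the sketch below stacks two causal dilated convolutions in plain Python and keeps only the final time step. The helper names and the fixed weights are hypothetical (a real model learns its weights and works on batched tensors); the sketch only shows that the output keeps the input length and that the last step can see the whole sequence.

```python
def causal_conv1d(seq, weights, dilation=1):
    """Causal dilated convolution: out[t] uses seq[t], seq[t-d], seq[t-2d], ...

    Positions before the start of the sequence count as zeros (left padding),
    so the output has the same length as the input.
    """
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * seq[idx]
        out.append(acc)
    return out


def tiny_tcn_last_step(seq):
    # Hypothetical fixed weights, chosen so the result is easy to check by hand.
    h = causal_conv1d(seq, [1.0, 1.0], dilation=1)
    h = causal_conv1d(h, [1.0, 1.0], dilation=2)
    # h is as long as seq; keep only the final time step,
    # the 1-D analogue of x[:, :, -1].
    return h, h[-1]


seq = [1.0, 2.0, 3.0, 4.0]
full, last = tiny_tcn_last_step(seq)
print(len(full), last)  # 4 10.0
```

With dilations 1 and 2 and kernel size 2, the receptive field is 4, so the last output equals the sum of all four inputs, while earlier outputs only see a prefix of the sequence.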