paul-krug / pytorch-tcn

(Realtime) Temporal Convolutions in PyTorch
MIT License
55 stars 8 forks

Regression or Classification head #12

Closed samshipengs closed 5 months ago

samshipengs commented 5 months ago

If I'd like to have a (non-causal) classification head, say with 10 classes, or a regression head with one or multiple outputs, how should I initialize the TCN class or modify the network?

My input data has shape (batch_size, C, T), and the output has shape (batch_size, encoded channel dimension depending on the num_channels I used, T). I'd like something similar to a regular fully convolutional network, which gradually reduces the spatial and channel dimensions and finally connects to a classification or regression head.

What I tried is simply taking the last time step of the output, but isn't all the computation before the last time step wasted? I.e. the upper-left triangle of the receptive field, if my understanding is right. Can't we down-sample the temporal dimension in the encoding phase?

paul-krug commented 5 months ago

If I understand correctly, you would like to map time-series data to a fixed label vector. Typically, you would do that by pooling the TCN output. The easiest way would be the following:

```python
x = TCN(x)
x = torch.mean(x, axis=time_axis)
x = softmax(x)
```
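A minimal runnable sketch of that pooling recipe. Here a plain `nn.Conv1d` stands in for the TCN encoder (an assumption for self-containedness; the actual `pytorch_tcn.TCN` likewise maps `(batch, channels, time)` to `(batch, channels, time)`), and a linear head with softmax produces the 10-class output from the question:

```python
import torch
import torch.nn as nn

# Stand-in encoder: in practice this would be the TCN from pytorch-tcn;
# any module mapping (batch, C_in, T) -> (batch, C_out, T) works the same way.
encoder = nn.Conv1d(in_channels=8, out_channels=32, kernel_size=3, padding=1)
head = nn.Linear(32, 10)  # classification head, 10 classes

x = torch.randn(4, 8, 100)        # (batch, channels, time)
h = encoder(x)                    # (4, 32, 100)
h = torch.mean(h, dim=-1)         # global average over the time axis -> (4, 32)
logits = head(h)                  # (4, 10)
probs = torch.softmax(logits, dim=-1)
print(probs.shape)                # torch.Size([4, 10])
```

Because the mean is taken over the whole time axis, this works for sequences of any length, and every time step of the encoder output contributes to the prediction.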

A more complex way would be to use several TCN layers and, in between them, conv layers with strided convolutions to downsample the temporal dimension. However, that only works for fixed-size inputs. If you have sequences of arbitrary length, you cannot avoid a global average in the end.
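A hypothetical sketch of that second approach, with ordinary `nn.Conv1d` layers standing in for the TCN blocks (the layer sizes and the fixed input length T=100 are illustrative assumptions, not part of the library):

```python
import torch
import torch.nn as nn

# Strided Conv1d layers between TCN-style blocks halve the temporal dimension;
# with a fixed input length this eventually yields a fixed-size representation
# that can feed a classification or regression head directly.
model = nn.Sequential(
    nn.Conv1d(8, 16, kernel_size=3, padding=1),             # block stand-in, keeps T
    nn.ReLU(),
    nn.Conv1d(16, 16, kernel_size=4, stride=2, padding=1),  # downsample T -> T/2
    nn.Conv1d(16, 32, kernel_size=3, padding=1),            # block stand-in, keeps T/2
    nn.ReLU(),
    nn.Conv1d(32, 32, kernel_size=4, stride=2, padding=1),  # downsample T/2 -> T/4
)

x = torch.randn(4, 8, 100)   # fixed input length T = 100
h = model(x)
print(h.shape)               # torch.Size([4, 32, 25])
```

Since the output length depends on the input length, a variable-length batch would still need padding or a final global pooling step, which is why the global average is hard to avoid in the general case.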