Closed: ollmer closed this issue 6 years ago
If my understanding is correct, a 1D convolution with kernel size 1 is equivalent to a matrix multiplication between the input (say of shape n_timesteps x d) and a weight matrix of shape d x n_filters.
The codebase uses matmul when the receptive field is 1. I originally thought conv1d would do this automatically under the hood, but that does not appear to be the case.
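The claimed equivalence can be checked directly. Below is a minimal NumPy sketch (shapes and variable names are illustrative, not from the codebase): a kernel-size-1 "convolution" applies the same d x n_filters weight matrix independently at every timestep, which is exactly what one matrix multiplication over the whole sequence computes.

```python
import numpy as np

# Illustrative shapes: input is n_timesteps x d, weights are d x n_filters.
rng = np.random.default_rng(0)
n_timesteps, d, n_filters = 5, 3, 4
x = rng.standard_normal((n_timesteps, d))
w = rng.standard_normal((d, n_filters))

# Kernel-size-1 convolution: filter each timestep independently.
conv_out = np.stack([x[t] @ w for t in range(n_timesteps)])

# Single matrix multiplication over the whole sequence.
matmul_out = x @ w

# The two results are identical (up to floating-point noise).
assert np.allclose(conv_out, matmul_out)
```

So whether a framework lowers a kernel-size-1 conv1d to a plain matmul under the hood is an implementation detail; mathematically the two operations produce the same output.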
Hi! I've noticed that the training code uses a 1D convolution with kernel size 1 in all invocations. Do we need a convolution here at all? Why not replace it with a fully_connected layer?