keunwoochoi / keras_STFT_layer

Do STFT in Keras
MIT License

transpose() in 'stft.py' #4

Closed BigeyeDestroyer closed 7 years ago

BigeyeDestroyer commented 7 years ago

Hi, I'm wondering whether the code on lines 56 and 57 of 'stft.py' is correct:

dft_real_kernels = dft_real_kernels[:nb_filter].transpose()
dft_imag_kernels = dft_imag_kernels[:nb_filter].transpose()

Since you want to output filters with shape (nb_filter, 1, 1, n_win), I think there is no need for the transpose here.

keunwoochoi commented 7 years ago

That makes sense, but the code is working correctly. Probably the docstring is not correct? As you pointed out, the shape of the returned values is (n_win, 1, 1, nb_filter).

BigeyeDestroyer commented 7 years ago

Yeah, the code actually outputs the shape (n_win, 1, 1, nb_filter). But if I have not misunderstood, you are trying to convert the matrix multiplication into a convolution, so the convolutional filters should have the form (num_filters, input_channel, filter_height, filter_width). Since nb_filter corresponds to num_filters here, it ought to be the first dimension of the shape. Maybe I am just not familiar enough with Keras; I will study the code more carefully.
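To make the "matrix multiplication as convolution" idea concrete, here is a minimal NumPy sketch. It is not the code from 'stft.py': the helper name dft_kernels and the sizes are hypothetical, and it only checks a single frame with a plain dot product (a strided convolution would apply the same kernels to every frame; backend details such as kernel flipping are not covered). It builds the kernels in the (n_win, 1, 1, nb_filter) layout discussed above and compares the result against np.fft.rfft.

```python
import numpy as np

def dft_kernels(n_win, nb_filter):
    """Hypothetical helper: real/imag DFT kernels shaped (n_win, 1, 1, nb_filter)."""
    n = np.arange(n_win)                      # time index within one window
    k = np.arange(n_win)[:, np.newaxis]       # frequency-bin index, one row per bin
    angles = 2.0 * np.pi * k * n / n_win      # shape (n_win, n_win), indexed [bin, time]
    # Keep the first nb_filter bins, then transpose to [time, bin]
    real = np.cos(angles)[:nb_filter].transpose()    # (n_win, nb_filter)
    imag = -np.sin(angles)[:nb_filter].transpose()   # (n_win, nb_filter)
    # Insert two singleton axes to get (n_win, 1, 1, nb_filter)
    return real[:, np.newaxis, np.newaxis, :], imag[:, np.newaxis, np.newaxis, :]

n_win, nb_filter = 16, 9                      # 9 = n_win // 2 + 1 non-redundant bins
real_k, imag_k = dft_kernels(n_win, nb_filter)

# One frame of signal; applying the kernels to a frame is a dot product
frame = np.random.RandomState(0).randn(n_win)
spec = frame @ real_k[:, 0, 0, :] + 1j * (frame @ imag_k[:, 0, 0, :])

reference = np.fft.rfft(frame)
print(real_k.shape, np.allclose(spec, reference))
```

With the transpose, the window axis comes first and the filter axis last, which is the layout the later comments in this thread report for the Keras kernel.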

keunwoochoi commented 7 years ago

I checked it again, with the Theano backend.

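import keras  # Keras 1.x API (nb_filter, filter_length), Theano backend

model = keras.models.Sequential()
model.add(keras.layers.Convolution1D(nb_filter=10, filter_length=128, input_shape=(44100, 1)))
# Evaluate the symbolic shape of the first layer's kernel
model.layers[0].weights[0].shape.eval()
# array([128,   1,   1,  10])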

The shape seems to be (filter_length, 1, 1, nb_filter).

BigeyeDestroyer commented 7 years ago

Thank you so much for checking the code. I also checked the Keras source code: line 115 of convolutional.py is self.W_shape = (self.filter_length, 1, input_dim, self.nb_filter), which has the same format as your code. The dimension order of the conv weights is different from that of Lasagne, which I use most of the time.
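For anyone moving weights between the two conventions, a small NumPy sketch of the axis reordering might help. It assumes the Keras layout quoted above, (filter_length, 1, input_dim, nb_filter), and the filters-first layout mentioned earlier in this thread, (num_filters, input_channel, filter_height, filter_width), with the filter length on the width axis. The function name keras_to_filters_first is made up for illustration and is not part of either library.

```python
import numpy as np

def keras_to_filters_first(w):
    """Hypothetical helper: (filter_length, 1, input_dim, nb_filter)
    -> (nb_filter, input_dim, 1, filter_length)."""
    # Reverse the axis order: filters first, filter length last
    return np.transpose(w, (3, 2, 1, 0))

# Same shape as the Convolution1D kernel reported above
w_keras = np.arange(128 * 1 * 1 * 10, dtype=float).reshape(128, 1, 1, 10)
w_ff = keras_to_filters_first(w_keras)

print(w_keras.shape)  # (128, 1, 1, 10)
print(w_ff.shape)     # (10, 1, 1, 128)
```

Only the axis order changes here; each filter's values are the same, so w_ff[f, 0, 0, t] equals w_keras[t, 0, 0, f].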

keunwoochoi commented 7 years ago

Thanks!

BigeyeDestroyer commented 7 years ago

You're welcome~