qideng7 opened this issue 4 years ago
Hi @qideng7, it doesn't look like that is the same architecture I was using. Can you try with the NN architecture in https://github.com/haidark/WaveletDeconv/blob/master/testWD.py?
Thanks for reaching out.
Hi @haidark, thanks for the quick reply. Here is my implementation of the architecture you mentioned. Also, I think there might be two typos in https://github.com/haidark/WaveletDeconv/blob/master/testWD.py? Correct me if I'm making a mistake here:
So when I reran it, I set the scales to be the same in the training and validation data generation: (0.5x, 1x, 5x), which means the true scales should be:
print('data_scales = {:.2f}, {:.2f}, {:.2f}'.format(2.*np.pi/0.5, 2.*np.pi/1., 2.*np.pi/5.))
data_scales = 12.57, 6.28, 1.26
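For completeness, here is roughly how I generate the artificial data (my own placeholder sketch, not the exact testWD.py generator): each example is a noisy sum of sinusoids at frequencies 0.5, 1 and 5, with the binary label toggling the highest-frequency component. make_data and all of its parameters are my assumptions.

import numpy as np

def make_data(n_samples=200, seq_len=1000, freqs=(0.5, 1.0, 5.0), noise=0.1, seed=0):
    # Placeholder generator: noisy sinusoid mixtures; class 1 also contains the
    # highest-frequency component so the two classes are separable.
    rng = np.random.RandomState(seed)
    t = np.linspace(0., 2. * np.pi, seq_len)
    data = np.zeros((n_samples, 1, seq_len), dtype='float32')  # (batch, channels=1, length)
    labels = rng.randint(0, 2, size=n_samples).astype('float32')
    for i in range(n_samples):
        x = noise * rng.randn(seq_len)
        for f in freqs[:-1]:
            x += np.sin(f * t + rng.uniform(0., 2. * np.pi))
        if labels[i] == 1:  # class 1 gets the extra high-frequency component
            x += np.sin(freqs[-1] * t + rng.uniform(0., 2. * np.pi))
        data[i, 0, :] = x
    return data, labels

data, labels = make_data(seed=0)
val_data, val_labels = make_data(seed=1)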
Also, I changed the number of filters in the network to 3. The network I implemented (filter widths initialized to (1., 4., 10.)):
inp_shape = data.shape[1:]
model = Sequential()
model.add(WaveletDeconvolution(3, kernel_length=500, input_shape=inp_shape, padding='SAME', data_format='channels_first'))
model.add(Activation('tanh'))
model.add(Convolution2D(3, (3, 3), padding='same'))
model.add(Activation('relu'))
#end convolutional layers
model.add(Flatten())
model.add(Dense(25))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(optimizer='sgd', loss='binary_crossentropy')
Results:
learned scales:
print(model.layers[0].W.numpy())
[ 1.033309 4.0004635 10.006781 ]
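As a side note, instead of calling fit() once per epoch I now track the widths with a small Keras callback. This only assumes the WaveletDeconvolution layer exposes its widths as the variable W used in the print above; WidthTracker is my own helper, not part of the repo.

import numpy as np
import tensorflow as tf

class WidthTracker(tf.keras.callbacks.Callback):
    """Record the WD layer's learned filter widths after every epoch."""
    def __init__(self, wd_layer):
        super().__init__()
        self.wd_layer = wd_layer
        self.widths = []

    def on_epoch_end(self, epoch, logs=None):
        # assumes the layer stores its widths in the variable `W`, as printed above
        self.widths.append(self.wd_layer.W.numpy().copy())

tracker = WidthTracker(model.layers[0])
model.fit(data, labels, epochs=25, batch_size=4,
          validation_data=(val_data, val_labels), verbose=0,
          callbacks=[tracker])
widths_per_epoch = np.stack(tracker.widths)  # shape (num_epochs, n_filters)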
Hi,
This is wonderful work! I was exploring the test on artificial data, as in Section 5.1 of your original paper, but I couldn't reproduce the result shown in Figure 3, especially the last plot. My naive guess is that the gradients on the learnable filter widths in the first layer are vanishing. May I have your suggestions on training with this test data?

Based on the architecture description: "We train two networks on examples from each class and compare their performance. The baseline network is a 4 layer CNN with Max-pooling [21] ending with a single unit for classification. The other network replaces the first layer with a WD layer while maintaining the same number of parameters. Both networks are optimized with Adam [20] using a fixed learning rate of 0.001 and a batch size of 4.", I implemented this network:
inp_shape = data.shape[1:]
model = Sequential()
model.add(WaveletDeconvolution(3, kernel_length=500, input_shape=inp_shape, padding='SAME', data_format='channels_first'))
model.add(Activation('tanh'))  # (batch, 1, len=1000, 5)
model.add(MaxPool2D((1, 2)))
model.add(Convolution2D(3, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPool2D((1, 2)))
model.add(Convolution2D(3, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPool2D((1, 2)))
model.add(Convolution2D(3, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPool2D((1, 2)))
# end convolutional layers
model.add(Flatten())
model.add(Dense(25, kernel_initializer=VarianceScaling(mode='fan_avg', distribution='uniform')))
model.add(Activation('relu'))
model.add(Dense(1, kernel_initializer=VarianceScaling(mode='fan_avg', distribution='uniform')))
model.add(Activation('sigmoid'))
optimizer_0 = tf.keras.optimizers.Adam(learning_rate=10.**-3)
model.compile(optimizer=optimizer_0, loss='binary_crossentropy')
num_epochs = 25
plt.figure(figsize=(6, 6))
Widths = np.zeros((num_epochs, 3)).astype('float32')
for i in range(num_epochs):
    hWD = model.fit(data, labels, epochs=1, batch_size=4, validation_data=(val_data, val_labels), verbose=0)
    Widths[i] = model.layers[0].W.numpy()  # record the learned filter widths each epoch (otherwise Widths stays all zeros)
plt.figure(figsize=(6, 6))
for i in range(Widths.shape[1]):
    plt.plot(range(num_epochs), Widths[:, i])
plt.show()
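To check my vanishing-gradient guess on the filter widths, this is the kind of one-batch probe I would run (my own diagnostic sketch, not part of testWD.py; it assumes the WD widths show up in model.trainable_variables like any other weight):

import tensorflow as tf

# One-batch probe: compare the gradient magnitude on the WD widths (first layer)
# with the gradients of the deeper layers.
x_batch = tf.convert_to_tensor(data[:4], dtype=tf.float32)
y_batch = tf.convert_to_tensor(labels[:4].reshape(-1, 1), dtype=tf.float32)
loss_fn = tf.keras.losses.BinaryCrossentropy()

with tf.GradientTape() as tape:
    preds = model(x_batch, training=True)
    loss = loss_fn(y_batch, preds)

grads = tape.gradient(loss, model.trainable_variables)
for var, grad in zip(model.trainable_variables, grads):
    norm = float('nan') if grad is None else tf.norm(grad).numpy()
    print('{:50s} grad norm = {:.3e}'.format(var.name, norm))

If the norm on the widths is orders of magnitude smaller than on the convolutional kernels, that would support the vanishing-gradient explanation for why the learned scales barely move from their initialization.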