Jamesswiz opened this issue 6 years ago
Hi! Thanks for your interest 😄 The filter visualization code is based on this blog.
Actually, I'm not the first author (Jongpil Lee) of the paper (SampleCNN), but he kindly gave me the code and says it's okay to share. I tested the code before, and it works like a charm.
However, you need a few modifications to this repository's implementation. The input size of the CNN is currently fixed; you should change it to None so that you can run it on variable-length signals.
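For reference, here is a minimal sketch of what that change looks like with the Keras functional API; the layer stack below is only illustrative, not the exact SampleCNN architecture:

from keras.layers import Input, Conv1D, MaxPooling1D
from keras.models import Model

# fixed-length input, as in this repository:
# x_in = Input(shape=(59049, 1))
# variable-length input: set the time dimension to None
x_in = Input(shape=(None, 1))

# illustrative 1-D conv stack (not the exact SampleCNN layers)
h = Conv1D(128, 3, strides=3, padding='valid', activation='relu')(x_in)
h = Conv1D(128, 3, padding='same', activation='relu')(h)
h = MaxPooling1D(pool_size=3)(h)

model = Model(inputs=x_in, outputs=h)
model.summary()  # the time dimension now shows up as None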
This is the code I received. Good luck!
import os
import time

import numpy as np
import matplotlib.pyplot as plt
import librosa
from keras import backend as K

# NOTE: 'model' is assumed to be the already-built SampleCNN Keras model from
# this repository, with its input length changed to None as described above.

# length of the generated signal for each filter
sample_length = 729  # 512
# step size for gradient ascent
step = 1.  # 3., 1.
step_num = 18  # 1000, 18
conv_dense = 'conv'  # 'conv' or 'dense'
norm_param_list = [1e-9]
layer_list = ['activation_1', 'activation_2', 'activation_3',
              'activation_4', 'activation_5', 'activation_6']
nb_filters_list = [128, 128, 128, 256, 256, 256]
fftsize = 729

# load model weights (fill in the paths for your setup)
weight_path = ''      # path to the trained model weights
save_path = ''        # directory for the per-layer summary figures
save_path_wav = ''    # directory for the per-filter waveform figures
model.load_weights(weight_path)
model.summary()
print('model loaded!!!')

# get the symbolic outputs of each "key" layer (we gave them unique names)
layer_dict = dict([(layer.name, layer) for layer in model.layers[1:]])
# this is the placeholder for the input signal (None, 59049, 1)
input_img = model.input


def normalize(x, norm_param):
    # utility function to normalize a tensor by its L2 norm
    return x / (K.sqrt(K.mean(K.square(x))) + norm_param)


plt.figure()
# loop over normalization parameters
for norm_param in norm_param_list:
    # loop over all layers
    for iter, layer_name in enumerate(layer_list):
        print(iter, layer_name)
        # save name
        save_name = '%s_norm%s_filters.png' % (layer_name, str(norm_param))
        print(save_name)
        if os.path.isfile(save_path + save_name):
            print('already calculated:', save_name)
            continue
        nb_filters = nb_filters_list[iter]
        repetition = int((fftsize // 2 + 1) / nb_filters)
        print('repetition:' + str(repetition))
        fftzed = np.zeros((nb_filters, fftsize // 2 + 1))
        # loop over all filters in this layer
        for filter_index in range(nb_filters):
            print('Processing filter %d' % filter_index)
            start_time = time.time()
            # we build a loss function that maximizes the activation
            # of the nth filter of the layer considered
            layer_output = layer_dict[layer_name].output
            if conv_dense == 'conv':
                loss = K.mean(layer_output[:, :, filter_index])
            elif conv_dense == 'dense':
                loss = K.mean(layer_output[:, filter_index])
            # we compute the gradient of the input signal w.r.t. this loss
            grads = K.gradients(loss, input_img)[0]
            # normalization trick: we normalize the gradient
            grads = normalize(grads, norm_param)
            # this function returns the loss and grads given the input signal
            iterate = K.function([input_img, K.learning_phase()], [loss, grads])
            # we start from low-amplitude random noise
            input_img_data = np.random.random((1, sample_length, 1))
            input_img_data = (input_img_data - 0.5) * 0.03  # 1.8
            # we run gradient ascent for step_num steps
            for i in range(step_num):
                loss_value, grads_value = iterate([input_img_data, 1])  # learning phase: 1 = train, 0 = test
                input_img_data += grads_value * step
                print('Current loss value:', loss_value)
                if loss_value <= 0.:
                    # some filters get stuck at 0, we can skip them
                    break
            end_time = time.time()
            print('Filter %d processed in %ds' % (filter_index, end_time - start_time))
            sample = np.squeeze(input_img_data[0])
            print(sample.shape)
            # remove the DC offset
            sample = sample - np.mean(sample)
            # save the waveform figure
            save_name_wav = '%s_filter%d_norm%s.png' % (layer_name, filter_index, str(norm_param))
            if not os.path.exists(os.path.dirname(save_path_wav + save_name_wav)):
                os.makedirs(os.path.dirname(save_path_wav + save_name_wav))
            plt.clf()
            plt.plot(sample)
            plt.axis('off')
            plt.savefig(save_path_wav + save_name_wav)
            # compute the log-scaled squared magnitude spectrum
            S = librosa.core.stft(sample, n_fft=fftsize, hop_length=fftsize, win_length=fftsize)
            X = np.square(np.absolute(S))
            log_S = np.log10(1 + 10 * X)
            log_S = np.squeeze(log_S.astype(np.float32))
            print(log_S.shape)
            # log_S = np.mean(log_S, axis=1)
            fftzed[filter_index] = log_S
        print(fftzed.shape, repetition)
        # sort the filters by their peak frequency bin and stretch for display
        argmaxed = np.argmax(fftzed, axis=1)
        sort_idx = np.argsort(argmaxed)
        sorted_fft = fftzed[sort_idx, :]
        sorted_fft = np.repeat(sorted_fft, repetition, axis=0)
        print(sorted_fft.shape)
        if not os.path.exists(os.path.dirname(save_path + save_name)):
            os.makedirs(os.path.dirname(save_path + save_name))
        # save the per-layer figure
        plt.clf()
        plt.imshow(sorted_fft.T)
        plt.gca().invert_yaxis()
        plt.axis('off')
        plt.savefig(save_path + save_name)
        print('save done!!!')
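As a side note for anyone reading this later: K.gradients and K.function above assume the older Keras/TF1-style backend. If you are on tf.keras with eager execution, the core gradient-ascent step can be rewritten with GradientTape roughly like this; it is only a sketch under that assumption, not the authors' code, and it also assumes the model's input length has been set to None as mentioned above:

import numpy as np
import tensorflow as tf

def visualize_filter(model, layer_name, filter_index,
                     sample_length=729, steps=18, step_size=1.0, eps=1e-9):
    """Gradient ascent on the input signal to maximize one filter's mean activation."""
    # sub-model mapping the input waveform to the chosen layer's activations
    feature_extractor = tf.keras.Model(inputs=model.input,
                                       outputs=model.get_layer(layer_name).output)
    # start from low-amplitude random noise
    signal = tf.Variable((np.random.random((1, sample_length, 1)) - 0.5) * 0.03,
                         dtype=tf.float32)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            activation = feature_extractor(signal, training=False)
            loss = tf.reduce_mean(activation[:, :, filter_index])
        grads = tape.gradient(loss, signal)
        # normalization trick: scale the gradient by its RMS
        grads = grads / (tf.sqrt(tf.reduce_mean(tf.square(grads))) + eps)
        signal.assign_add(step_size * grads)
        if loss <= 0.:
            break  # some filters never activate on this input; skip them
    return np.squeeze(signal.numpy())

Calling this per filter and per layer should reproduce the loops above; the waveform and spectrum plotting stays the same.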
Also, here is the first author's implementation: https://github.com/jongpillee/sampleCNN
And we are working on an extension of this filter visualization work 😄
Hey! Thanks again for the quick response. I will try the implementation you shared. I am also working on CNN filter visualization with raw speech as input, so it will be good to start with gradient ascent first.
Maybe it would be better to attach this code as a .txt file.
Hey! Thanks for sharing the implementation. I was wondering if you could also share the gradient-ascent filter visualization method, compatible with your Keras implementation, or maybe point me to any links.
Looking forward to a response!