fchollet / deep-learning-with-python-notebooks

Jupyter notebooks for the code samples of the book "Deep Learning with Python"

maximising vs minimising activation for visualisation of 1D filters #98

Open · chrisclarkson opened this issue 5 years ago

chrisclarkson commented 5 years ago

Apologies in advance, as I am not experienced with deep learning.

I have a set of 1D sequences (each 3000 elements long, each carrying one of seven labels, 1-7). I am using a multi-class classifier to identify the distinct pattern associated with each label.

Judging by precision and recall, the classifier does reasonably well at identifying the different labels (roughly 60-80% accuracy per label; I sketch how I compute this after the model code below).

    # Assumed imports for this snippet.
    from keras.models import Sequential
    from keras.layers import Conv1D, MaxPooling1D, Flatten, Dropout, Dense

    model = Sequential()
    # 75 filters, each spanning 2000 timesteps of the 3000-element input.
    model.add(Conv1D(75, 2000, strides=1, padding='same',
                     input_shape=X.shape[1:], activation='relu'))
    model.add(MaxPooling1D(2000))
    model.add(Flatten())
    model.add(Dropout(0.2))
    # One softmax output per label.
    model.add(Dense(len(categories) + 1, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['categorical_accuracy'])
    model.fit(train_X, train_y, epochs=3, batch_size=100)
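
For reference, the precision/recall figures come from something like this (a minimal sketch; `test_X` and `test_y` are my held-out test arrays, and the names are illustrative):

    import numpy as np
    from sklearn.metrics import classification_report

    # Predicted class = argmax over the softmax outputs;
    # true class = argmax over the one-hot encoded targets.
    pred_classes = np.argmax(model.predict(test_X), axis=1)
    true_classes = np.argmax(test_y, axis=1)

    # Per-label precision, recall and F1.
    print(classification_report(true_classes, pred_classes))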

I want to visualise what is being recognised by each of the 75 filters in the convolution layer.

To do this I have implemented the analysis demonstrated in F. Chollet's book "Deep Learning with Python" (page 169). There, a filter is visualised by running gradient ascent on an input so as to maximise the filter's activation, which produces the kind of input the filter responds to/recognises.

    import numpy as np
    from keras import backend as K

    def generate_pattern(layer_name, filter_index, size=3000):  # size matches the length of my vectors
        # Loss = mean activation of the chosen filter over all timesteps.
        layer_output = model.get_layer(layer_name).output
        loss = K.mean(layer_output[:, :, filter_index])
        # Gradient of the loss w.r.t. the input, L2-normalised for stable steps.
        grads = K.gradients(loss, model.input)[0]
        grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
        iterate = K.function([model.input], [loss, grads])
        # Start from a noisy 1D sequence and run gradient ascent to maximise the activation.
        input_img_data = np.random.random((1, size, 1)) * 20 + 75
        step = 1
        for i in range(10000):  # the number of iterations does not affect the output
            loss_value, grads_value = iterate([input_img_data])
            input_img_data += grads_value * step
        return input_img_data[0]
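
This is roughly how I plot the generated patterns (`'conv1d_1'` is just the name Keras happened to assign my convolution layer, as reported by `model.summary()`, so treat it as illustrative):

    import matplotlib.pyplot as plt

    # Plot the patterns generated for a handful of filters.
    for filter_index in range(5):
        pattern = generate_pattern('conv1d_1', filter_index)
        plt.plot(pattern[:, 0], label='filter %d' % filter_index)
    plt.legend()
    plt.show()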

In the book the example is performed on 2D images, whereas my data are 1D, so I have tried to adapt the code above to do the same thing for a 1D tensor.

I have visualised all 75 filters, and every one of them produces a noisy, uninterpretable line. Below I have randomly selected and plotted five different filters to show what the output typically looks like:

individual_neurons.1.pdf

I would have expected, or at least hoped for, much less noisy lines. The only other thing I have not tried is to minimise the activation instead. Can anyone recommend how to do this? Or am I completely misapplying this technique?
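
My untested guess at minimising would be to flip the sign of the update, i.e. run gradient descent on the activation rather than ascent, with everything else identical to `generate_pattern` above:

    import numpy as np
    from keras import backend as K

    def generate_pattern_min(layer_name, filter_index, size=3000):
        # Same setup as generate_pattern above.
        layer_output = model.get_layer(layer_name).output
        loss = K.mean(layer_output[:, :, filter_index])
        grads = K.gradients(loss, model.input)[0]
        grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
        iterate = K.function([model.input], [loss, grads])
        input_img_data = np.random.random((1, size, 1)) * 20 + 75
        step = 1
        for i in range(10000):
            loss_value, grads_value = iterate([input_img_data])
            input_img_data -= grads_value * step  # minus sign: minimise instead of maximise
        return input_img_data[0]

Would that be the right change?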