Closed mervess closed 3 years ago
Hi, mervess. Can you show me the error you get when applying SpKeras to convert DepthwiseConv2D? I do not think converting other convolutional layers would be a big problem, since SpKeras calculates the maximum activation by searching for layers with an activation attribute.
This was the error I was getting:
Building new model...
Traceback (most recent call last):
File "simple_example.py", line 37, in <module>
snn_model = cnn_to_snn(signed_bit=0)(cnn_model,x_train)
File "spkeras/spkeras/models.py", line 43, in __call__
timesteps=self.timesteps)
File "spkeras/spkeras/models.py", line 168, in convert
snn_model.layers[-1].set_weights(weights)
File "lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1783, in set_weights
'shape %s' % (ref_shape, weight.shape))
ValueError: Layer weight shape (3, 3, 32, 1) not compatible with provided weight shape (3, 3, 32, 32)
The layer in the for loop at that point was the DepthwiseConv2D. So, to my reasoning, the problem was mainly mismatching weight shapes between layers caused by the depthwise operation. Note that because I made some changes in the code, the line numbers might not be exactly the same for you.
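To illustrate where the (3, 3, 32, 32) shape comes from: NumPy/TensorFlow broadcasting aligns trailing axes, so multiplying BN parameters of shape (32,) into a depthwise kernel of shape (3, 3, 32, 1) silently blows up the last axis. A minimal standalone sketch (shapes taken from the traceback above):

```python
import numpy as np

# Depthwise kernel: (kh, kw, in_channels, depth_multiplier) = (3, 3, 32, 1)
dw_kernel = np.ones((3, 3, 32, 1))
gamma = np.ones(32)  # BatchNormalization parameter, shape (32,)

# Broadcasting aligns trailing axes: (32,) meets the size-1 multiplier axis,
# so the result grows a spurious last dimension of 32.
bad = gamma * dw_kernel
print(bad.shape)   # (3, 3, 32, 32) -- the shape reported in the ValueError

# Expanding gamma to (32, 1) lines it up with the channel axis instead.
good = np.expand_dims(gamma, -1) * dw_kernel
print(good.shape)  # (3, 3, 32, 1)
```

This is exactly the mismatch the edit below works around.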
I reckon I have solved it via minor changes in the code.
I tracked the issue to the batchnormalization function, where the shapes get affected, and updated its code to maintain the weight shape between layers.
Below is the new version, and it seems to work for me.
axis = 1 if K.image_data_format() == 'channels_first' else -1

def batchnormalization(weights, layer, amp_factor=100, dw=False):
    gamma, beta, mean, variance = layer.get_weights()
    if dw:  # for depthwise conv layers
        gamma = np.expand_dims(gamma, axis)
        beta = np.expand_dims(beta, axis)
        mean = np.expand_dims(mean, axis)
        variance = np.expand_dims(variance, axis)
    weights[0] = amp_factor*gamma/np.sqrt(variance+epsilon)*weights[0]
    weights[1] = amp_factor*(gamma/np.sqrt(variance+epsilon)
                             *(weights[1]-mean)+beta)
    return weights
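A quick sanity check of the patch on dummy arrays (standalone sketch: FakeBN, the epsilon value, and the dummy bias shape are stand-ins, not SpKeras code; the model discussed here has no bias, so the bias entry is only there to satisfy the function signature):

```python
import numpy as np

epsilon = 1e-3  # Keras BatchNormalization default

class FakeBN:
    """Stand-in for a BatchNormalization layer over 32 channels."""
    def get_weights(self):
        c = 32
        # gamma, beta, moving mean, moving variance
        return [np.ones(c), np.zeros(c), np.zeros(c), np.ones(c)]

def batchnormalization(weights, layer, amp_factor=100, dw=False):
    gamma, beta, mean, variance = layer.get_weights()
    if dw:  # depthwise kernels are (kh, kw, channels, 1): add a trailing axis
        gamma, beta, mean, variance = (np.expand_dims(p, -1)
                                       for p in (gamma, beta, mean, variance))
    weights[0] = amp_factor * gamma / np.sqrt(variance + epsilon) * weights[0]
    weights[1] = amp_factor * (gamma / np.sqrt(variance + epsilon)
                               * (weights[1] - mean) + beta)
    return weights

w = [np.ones((3, 3, 32, 1)), np.zeros((32, 1))]  # depthwise kernel + dummy bias
out = batchnormalization(w, FakeBN(), dw=True)
print(out[0].shape)  # (3, 3, 32, 1) -- channel layout preserved
```

With dw=True the kernel keeps its (3, 3, 32, 1) shape, so set_weights no longer complains.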
How does it seem to you?
I see. There is a mismatch between the activation channel and the weight channel. The activation channel determines the parameter shape in the BatchNormalization layer, which causes the shape mismatch.
In my opinion, the single-channel weight in DepthwiseConv2D should be broadcast to fit the channel size in BatchNormalization. That means removing the BatchNormalization by storing full-channel weights, rather than a single channel, in the convolutional layer. It is possible to modify SpKeras to do this, but it is a bit expensive. You can try replacing DepthwiseConv2D with a Conv2D of customized channel size.
I hope I understood the question correctly.
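For what it's worth, the Conv2D-replacement idea can be sketched as follows: a depthwise kernel of shape (3, 3, 32, 1) is equivalent to a full (3, 3, 32, 32) Conv2D kernel that is zero everywhere except on the channel "diagonal", where output channel j only sees input channel j. The function name below is illustrative, not SpKeras API:

```python
import numpy as np

def depthwise_to_conv2d_kernel(dw_kernel):
    """Expand a (kh, kw, c, 1) depthwise kernel into an equivalent
    (kh, kw, c, c) Conv2D kernel (zero off the channel diagonal)."""
    kh, kw, c, m = dw_kernel.shape
    assert m == 1, "sketch assumes depth_multiplier == 1"
    full = np.zeros((kh, kw, c, c), dtype=dw_kernel.dtype)
    for j in range(c):
        full[:, :, j, j] = dw_kernel[:, :, j, 0]
    return full

k = np.random.rand(3, 3, 32, 1)
print(depthwise_to_conv2d_kernel(k).shape)  # (3, 3, 32, 32)
```

This is the "store more weights" trade-off mentioned above: the kernel grows by a factor of the channel count, but its shape then matches the BN parameters directly.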
These are the layers concerned and their weight shapes, in order:
<tensorflow.python.keras.layers.advanced_activations.ReLU object at 0x7f21ed924dd0> - [] (no weights here)
<tensorflow.python.keras.layers.convolutional.DepthwiseConv2D object at 0x7f21ed88f450> - TensorShape([3, 3, 32, 1])
<tensorflow.python.keras.layers.normalization_v2.BatchNormalization object at 0x7f21ed88fe10> - TensorShape([32])
The conversion got stuck at the DepthwiseConv2D when the error occurred.
(3, 3, 32, 32)
was the resulting shape from the batchnormalization function before my update.
The ReLU above is a layer in itself, and the overall model does not use bias. I also included the activations that come as layers by modifying the findlambda function, as below.
if ... or layer_type == 'ReLU':
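For context, that check keys on the layer's class name. A minimal hypothetical sketch of that kind of layer-type test, with stand-in classes instead of the real Keras ones (findlambda's actual structure is not reproduced here):

```python
def has_activation(layer):
    """Hypothetical check: a layer contributes an activation either via an
    `activation` attribute or as a standalone activation layer like ReLU."""
    layer_type = type(layer).__name__
    return hasattr(layer, 'activation') or layer_type == 'ReLU'

class ReLU:          # stand-in for the standalone Keras ReLU layer
    pass

class Conv2D:        # stand-in for a layer with an activation attribute
    activation = 'relu'

class Flatten:       # stand-in for a layer with no activation at all
    pass

print(has_activation(ReLU()))     # True (matched by class name)
print(has_activation(Conv2D()))   # True (matched by attribute)
print(has_activation(Flatten()))  # False
```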
I work with many types of CNNs; some do not use bias, some have no BatchNormalization, etc. I intend to keep DepthwiseConv2D in that regard.
So, my short-fix above didn't do the job, did I get it right?
> So, my short-fix above didn't do the job, did I get it right?
In my opinion, you did not get the correct weights.
> some do not use bias, no BatchNormalization
Removing bias and BatchNormalization (BN) is the easiest way to solve the problem. SpKeras can recognize layers without bias. Besides, you will not suffer from the mismatch between the weights in DepthwiseConv2D and the parameters in BN.
> if ... or layer_type == 'ReLU':
If I include stand-alone activation layers this way in findlambda, would that be the right approach?
I've tried the no-bias, no-BN version of the model and it works. However, the accuracy remains the same as my pre-edit (wrongly weighted) + BN version of the same model. What, then, is the difference between the two versions?
> If I include stand-alone activation layers this way in findlambda, would that be the right approach?
I think there is no need to do this; SpKeras does find the correct maximum value. The problem is the mismatch between the parameter shapes in DepthwiseConv2D and BN.
> What, then, is the difference between the two versions?
I am not sure of the exact reason for the same accuracy, since I did not check the details (e.g. the SNN weights after conversion) for the pre-edit version. It might be figured out by checking the weights before and after passing through the batchnormalization function, e.g. the shapes and the parameters from BN, and whether they are similar even across different channels.
Alright, I see. I've been updating the code to fix some bugs (the original SpKeras was crashing outright on some models) and will be publishing the changes soon. I'll send you a pull request to let you know, in case you are interested in making it more robust. Thank you for the answers, Dengyu.
That would be very helpful. Good luck with your publication.
I am also working on SpKeras 2.0, which supports more architectures. It is coming soon. :)
Can you please let me know whether SpKeras works on all or only some of the conv layer types in the link, @Dengyu-Wu? I'm having a bit of a problem with a DepthwiseConv2D layer, hence the question.