tensorflow / model-optimization

A toolkit to optimize ML models for deployment with Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0

Support for Multiply layer #733

Open FSet89 opened 3 years ago

FSet89 commented 3 years ago

System information

Motivation

The implementation of some models (e.g. SENet) requires the use of the Multiply layer.

Describe the feature

I tried to quantize a model that includes a Squeeze-Excitation block, but I got an error: `Layer multiply:<class 'tensorflow.python.keras.layers.merge.Multiply'> is not supported. You can quantize this layer by passing a tfmot.quantization.keras.QuantizeConfig instance to the quantize_annotate_layer API.` It would be very useful to have this layer supported.

Describe how the feature helps achieve the use case

It would be possible to fully quantize models that include this layer.
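For reference, a minimal sketch that reproduces the error (the `se_block` helper and the layer sizes are arbitrary choices for illustration, not from the original report):

```python
import tensorflow_model_optimization as tfmot
from tensorflow.keras import layers, models

def se_block(x, ratio=4):
    """A minimal Squeeze-Excitation block ending in a Multiply layer."""
    channels = x.shape[-1]
    # Squeeze: global average pooling over the spatial dimensions
    s = layers.GlobalAveragePooling2D()(x)
    # Excitation: bottleneck MLP producing per-channel scales in [0, 1]
    s = layers.Dense(channels // ratio, activation='relu')(s)
    s = layers.Dense(channels, activation='sigmoid')(s)
    s = layers.Reshape((1, 1, channels))(s)
    # Rescale the feature map -- this Multiply is the unsupported layer
    return layers.Multiply()([x, s])

inputs = layers.Input(shape=(32, 32, 16))
outputs = se_block(layers.Conv2D(16, 3, padding='same')(inputs))
model = models.Model(inputs, outputs)

# Raises: "Layer multiply:<class '...Multiply'> is not supported."
quantized = tfmot.quantization.keras.quantize_model(model)
```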

james77777778 commented 1 year ago

It is common to use tf.keras.layers.Multiply, but it lacks support in tfmot: https://github.com/tensorflow/model-optimization/blob/master/tensorflow_model_optimization/python/core/quantization/keras/default_8bit/default_8bit_quantize_registry.py#L162

Is there any potential risk in adding support for Multiply?

I can work around the error and run QAT with the code below:

```python
import tensorflow_model_optimization as tfmot
from keras import layers, models

# NoOpQuantizeConfig leaves the layer unquantized: no weight, activation,
# or output quantizers are attached. Adapted from
# https://github.com/tensorflow/model-optimization/blob/master/tensorflow_model_optimization/python/core/quantization/keras/default_8bit/default_8bit_quantize_configs.py
class NoOpQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    def get_weights_and_quantizers(self, layer):
        return []

    def get_activations_and_quantizers(self, layer):
        return []

    def set_quantize_weights(self, layer, quantize_weights):
        pass

    def set_quantize_activations(self, layer, quantize_activations):
        pass

    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}

def apply_quant_config(layer: layers.Layer):
    # Annotate Multiply layers with the no-op config so quantize_apply
    # skips them instead of raising "layer is not supported"
    if 'multiply' in layer.name:
        return tfmot.quantization.keras.quantize_annotate_layer(
            layer, quantize_config=NoOpQuantizeConfig())
    return layer

model = get_keras_model(...)  # user defined
annotate_model = models.clone_model(model, clone_function=apply_quant_config)
annotate_model = tfmot.quantization.keras.quantize_annotate_model(annotate_model)
with tfmot.quantization.keras.quantize_scope({'NoOpQuantizeConfig': NoOpQuantizeConfig}):
    annotate_model: models.Model = tfmot.quantization.keras.quantize_apply(annotate_model)
```
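For completeness, after `quantize_apply` the model is fine-tuned as usual and can then be converted; a minimal sketch of the standard TFLite conversion step (training code omitted):

```python
import tensorflow as tf

# Convert the fine-tuned QAT model with the standard TFLite converter
converter = tf.lite.TFLiteConverter.from_keras_model(annotate_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
with open('model_qat.tflite', 'wb') as f:
    f.write(tflite_model)
```

Note that `apply_quant_config` matches layers by substring on `layer.name`; checking `isinstance(layer, layers.Multiply)` would be more robust if other layer names happen to contain "multiply".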

In my project I get the following results (the task is image prediction, evaluated with RMSE; lower is better):

| Method | RMSE |
| --- | --- |
| FP32 | 70 |
| PTQ | 105 |
| QAT (non-optimized) | 75 |

I think the result is not bad, but it took some time to figure out how to make QAT work, and I'm curious whether I'm missing something...

Some links for the issue:

Thanks!