tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0
1.49k stars, 323 forks

Custom layer with Concat afterwards causes an error during QAT modeling #1124

Open mhyeonsoo opened 7 months ago

mhyeonsoo commented 7 months ago

Describe the bug

I am building a model that returns a feature embedding as its output. I used MobileNetV3Large as the base model with the include_top=False option.

After the base model, I add a few layers and concatenate their outputs at the end of the model.

When I try to apply QAT to the model, it fails with the following error:

TypeError: 'str' object is not callable

System information

TensorFlow version (installed from source or binary): 2.15.0

TensorFlow Model Optimization version (installed from source or binary): 0.7.5

Python version: 3.11.7

Describe the expected behavior

The QAT model is built without errors.

Describe the current behavior

TypeError: 'str' object is not callable
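
The report gives only the final exception line. A full traceback would show which call receives a string where a callable is expected. Below is a minimal sketch for capturing one using only the standard library. `run_and_capture` is a hypothetical helper, and `failing_step` is a stand-in that merely reproduces the same class of error; it is not the TFMOT code path and says nothing about where the failure actually occurs.

```python
import traceback


def run_and_capture(fn):
    """Call fn() and return the formatted traceback if it raises, else None."""
    try:
        fn()
    except Exception:
        return traceback.format_exc()
    return None


def failing_step():
    # Stand-in for the failing call: a string is used where a callable is expected.
    activation = "swish"
    return activation(1.0)


report = run_and_capture(failing_step)
print(report)
```

In the reported setup, the function passed in would wrap the `quantize_apply` call, so that the printed stack shows the frame inside the library where the string is called.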

Code to reproduce the issue

import tensorflow as tf
import tensorflow_model_optimization as tfmot

def build_model(args, include_preprocessing=False):
    IMG_SHAPE = (args.input_dim, args.input_dim, 3)
    # Transfer learning model with MobileNetV3
    base_model = tf.keras.applications.MobileNetV3Large(
        input_shape=IMG_SHAPE,
        include_top=False,
        weights='imagenet',
        minimalistic=True,
        include_preprocessing=include_preprocessing
    )
    # Freeze the pre-trained model weights
    base_model.trainable = False
    cl1 = CustomLayer()(base_model.output)
    cl1 = tf.keras.layers.Dropout(0.2, name="dropout_cl1")(cl1)

    cl2 = CustomLayer()(base_model.output)
    cl2 = tf.keras.layers.Dropout(0.2, name="dropout_cl2")(cl2)

    cl3 = CustomLayer()(base_model.output)
    cl3 = tf.keras.layers.Dropout(0.2, name="dropout_cl3")(cl3)

    concat_cls = tf.keras.layers.Concatenate()([cl1, cl2, cl3])

    x = tf.keras.layers.Dense(512, activation='swish')(concat_cls)  # Final dense layer (swish activation) producing the embedding
    model = tf.keras.Model(base_model.input, x)

    return model

# Initial training before QAT: initial_training (not shown) returns the model after a few epochs of training
model = build_model(args)
model = initial_training(args, model)

def apply_quantization_to_dense(layer):
    if isinstance(layer, tf.keras.layers.Dense):
        return tfmot.quantization.keras.quantize_annotate_layer(layer)
    return layer

annotated_model = tf.keras.models.clone_model(
    model,
    clone_function=apply_quantization_to_dense,
)

quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)

tucan9389 commented 7 months ago

@mhyeonsoo

Thanks for reporting this.

Could you share the complete script, including the CustomLayer implementation, if available?

(A reproducible, runnable Colab link would also help.)

mhyeonsoo commented 7 months ago

@tucan9389

Thanks for the response. Due to policy, I am not able to share the full code or a reproducible Colab, but I can share the CustomLayer implementation. Below is the one that I am using.

class CustomLayer(tf.keras.layers.Layer):
    def __init__(self, p=1, **kwargs):
        super(CustomLayer, self).__init__(**kwargs)
        self.p = p

    def call(self, inputs):
        assert len(inputs.shape) == 4, 'the input tensor of CustomLayer must be the shape of [B, H, W, C]'
        if self.p == 1:
            return tf.reduce_mean(inputs, axis=[1, 2])
        elif self.p == float('inf'):
            return tf.reshape(tf.reduce_max(inputs, axis=[1, 2]), shape=[-1, inputs.shape[-1]])
        else:
            sum_value = tf.reduce_mean(tf.pow(inputs, self.p), axis=[1, 2])
            return tf.sign(sum_value) * tf.pow(tf.abs(sum_value), 1.0 / self.p)

    def get_config(self):
        config = super(CustomLayer, self).get_config()
        config.update({"p": self.p})
        return config
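
For readers following the three branches of `call`, the pooling it applies per channel can be sketched in plain Python. This is an illustrative re-statement, not the layer itself: `pool_channel` is a hypothetical name, and it operates on a flat list holding one channel's H*W activations instead of a [B, H, W, C] tensor.

```python
import math


def pool_channel(values, p=1):
    """Pool one channel's spatial activations the way CustomLayer.call does.

    Assumes an integer p or non-negative values, so that v ** p stays real.
    """
    if p == 1:
        # Global average pooling.
        return sum(values) / len(values)
    if p == float("inf"):
        # Global max pooling.
        return max(values)
    # Generalized mean: sign-preserving p-th root of the mean of p-th powers.
    mean_pow = sum(v ** p for v in values) / len(values)
    if mean_pow == 0:
        return 0.0
    return math.copysign(abs(mean_pow) ** (1.0 / p), mean_pow)


activations = [1.0, 2.0, 3.0, 4.0]
print(pool_channel(activations, p=1))
print(pool_channel(activations, p=float("inf")))
print(pool_channel(activations, p=2))
```

With the default `p=1`, which is what `CustomLayer()` uses in `build_model`, only the average-pooling branch is exercised.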

Please let me know if anything else is needed!

Thanks,

mhyeonsoo commented 7 months ago

@tucan9389

Hi, is there any update on this?

Thanks,
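
The thread does not identify a cause. One thing that may be worth ruling out is whether `quantize_apply` can see the custom class when it rebuilds the model. TFMOT provides `tfmot.quantization.keras.quantize_scope` for registering custom objects during that step. The sketch below wraps the call accordingly. `apply_qat_with_custom_objects` is a hypothetical helper name, and it is an assumption, not a confirmed fix, that this has any bearing on the reported TypeError.

```python
def apply_qat_with_custom_objects(annotated_model, custom_objects):
    """Run quantize_apply with custom classes registered for deserialization.

    custom_objects: dict mapping class name to class,
    for example {"CustomLayer": CustomLayer}.
    """
    # Imported lazily so this helper can be defined without TFMOT installed.
    import tensorflow_model_optimization as tfmot

    # quantize_scope makes the given classes visible while the model is rebuilt.
    with tfmot.quantization.keras.quantize_scope(custom_objects):
        return tfmot.quantization.keras.quantize_apply(annotated_model)
```

In the reported setup it would be called as `apply_qat_with_custom_objects(annotated_model, {"CustomLayer": CustomLayer})`. If the same error persists, the full traceback is still the most useful thing to attach to the issue.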