tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0

batch norm layer quantization error #1089

Closed mhyeonsoo closed 10 months ago

mhyeonsoo commented 10 months ago


System information

TensorFlow version (installed from source or binary): either 2.13.0 or 2.15.0-nightly

TensorFlow Model Optimization version (installed from source or binary): 0.7.5

Python version: 3.10.6

Describe the expected behavior & current behavior
I tried to implement quantization-aware training on top of transfer learning with a Keras Applications pretrained model. I used a MobileNetV3 base model and added a few layers for fine-tuning. Following the official documentation (https://www.tensorflow.org/model_optimization/guide/quantization/training_example), I went through the steps and hit the following error:

RuntimeError: Layer batch_normalization:<class 'keras.src.layers.normalization.batch_normalization.BatchNormalization'> is not supported. You can quantize this layer by passing a `tfmot.quantization.keras.QuantizeConfig` instance to the `quantize_annotate_layer` API.

It seems the Keras BatchNormalization layer is not supported for quantization out of the box.
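The error message points at supplying a `tfmot.quantization.keras.QuantizeConfig` through `quantize_annotate_layer`. For reference, here is a minimal sketch of what that would look like, modeled on the custom-QuantizeConfig pattern in the tfmot comprehensive guide. The class name `PassthroughBNQuantizeConfig` and the toy input shape are made up for illustration, and it only attaches an 8-bit quantizer to the BN output while leaving gamma/beta untouched; whether quantizing a standalone BN this way is useful at all is a separate question (see the maintainer's reply below).

import tensorflow as tf
import tensorflow_model_optimization as tfmot

quant = tfmot.quantization.keras

class PassthroughBNQuantizeConfig(quant.QuantizeConfig):
    # Hypothetical config: leave BN weights unquantized, quantize only its output.
    def get_weights_and_quantizers(self, layer):
        return []
    def get_activations_and_quantizers(self, layer):
        return []
    def set_quantize_weights(self, layer, quantize_weights):
        pass
    def set_quantize_activations(self, layer, quantize_activations):
        pass
    def get_output_quantizers(self, layer):
        return [quant.quantizers.MovingAverageQuantizer(
            num_bits=8, per_axis=False, symmetric=False, narrow_range=False)]
    def get_config(self):
        return {}

inputs = tf.keras.Input(shape=(1280,))
x = quant.quantize_annotate_layer(
    tf.keras.layers.BatchNormalization(),
    quantize_config=PassthroughBNQuantizeConfig())(inputs)
annotated = quant.quantize_annotate_model(tf.keras.Model(inputs, x))

# Custom configs must be registered in a quantize_scope before applying.
with quant.quantize_scope(
        {'PassthroughBNQuantizeConfig': PassthroughBNQuantizeConfig}):
    q_model = quant.quantize_apply(annotated)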

Code to reproduce the issue

import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize_model = tfmot.quantization.keras.quantize_model

base_model = tf.keras.applications.MobileNetV3Large(
        input_shape=IMG_SHAPE,
        include_top=False,
        weights='imagenet',
        minimalistic=True,
        include_preprocessing=False
    )
# Freeze the pre-trained model weights
base_model.trainable = False

x = tf.keras.layers.GlobalMaxPooling2D()(base_model.output)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.Dropout(0.2, name="top_dropout")(x)
x = tf.keras.layers.Dense(args.num_classes, activation="softmax")(x)

model = tf.keras.Model(base_model.input, x)

# Compile the model
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_ds,
          epochs=1,
          validation_data=val_ds,
          steps_per_epoch=n_sample // args.batch_size,
          validation_steps=val_ds.n // args.batch_size,
          verbose=1)

# q_aware stands for quantization aware.
# The RuntimeError is raised here, when quantize_model reaches the
# standalone BatchNormalization layer in the head.
q_aware_model = quantize_model(model)

# `quantize_model` requires a recompile.
q_aware_model.compile(optimizer='adam',
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])
q_aware_model.summary()


Xhark commented 10 months ago

Currently, BatchNorm is only supported when its input comes from a Conv or Dense layer (e.g. Conv-BN, Dense-BN), on the assumption that the BN will be fused into the preceding Conv or Dense during TFLite conversion. The standalone BN case is not handled.
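For reference, a minimal sketch of the supported pattern described above (toy shapes and layer sizes are made up): when the BatchNormalization directly follows a Conv2D, quantize_model should accept it, since the BN can be folded into the convolution's weights at TFLite conversion.

import tensorflow as tf
import tensorflow_model_optimization as tfmot

inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(16, 3)(inputs)      # Conv-BN: fusable pattern
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.ReLU()(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10, activation='softmax')(x)

# Passes, in contrast to the standalone-BN head in the reproduction above.
q_model = tfmot.quantization.keras.quantize_model(
    tf.keras.Model(inputs, outputs))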

In the MobileNetV3 structure, the backbone output (without the top) is already normalized right before the output, so I don't think you need an additional BN. Is there a specific reason you need it?

mhyeonsoo commented 10 months ago

@Xhark

Yeah, you are right. It already performs normalization before the output. I hadn't considered that; I removed the BatchNormalization layer and now it works well. Thanks!
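In other words, the fix is just to drop the standalone BatchNormalization from the classification head. A sketch of the working version, reusing the names from the reproduction code above:

# Classification head without the standalone BatchNormalization
x = tf.keras.layers.GlobalMaxPooling2D()(base_model.output)
x = tf.keras.layers.Dropout(0.2, name="top_dropout")(x)
x = tf.keras.layers.Dense(args.num_classes, activation="softmax")(x)

model = tf.keras.Model(base_model.input, x)
q_aware_model = quantize_model(model)  # no longer raises RuntimeError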