tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0

Quant aware training in tensorflow model optimization #1100

Closed. ardeal closed this issue 6 months ago.

ardeal commented 11 months ago

Hi,

I did see post_training_quant in model optimization; however, I didn't see a quant-aware training tutorial or examples.

Is there quant-aware training in model optimization? Is there quant-aware training in TensorFlow? Could you please point me to a link about quant-aware training?

tucan9389 commented 11 months ago

Hi @ardeal, here are some links that might be helpful to start:

Please take a look :)
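
In case it helps, here is a minimal sketch of the basic QAT flow those guides walk through. It assumes you already have a built float Keras model called model, and train_images / train_labels are placeholder names for your training data:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Wrap the whole float model so it trains with simulated (fake) quantization.
q_aware_model = tfmot.quantization.keras.quantize_model(model)

# Recompile after wrapping, then fine-tune for a few epochs.
q_aware_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
q_aware_model.fit(train_images, train_labels, epochs=1, validation_split=0.1)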

ardeal commented 11 months ago

Hi @tucan9389, many thanks for your reply!

I have already found the links you mentioned. My further questions are:

1) If I apply q_aware_model = quantize_model(model) to the model, will quantization be applied to BatchNormalization layers?
2) Where and how can I check and set which layer(s) should be quantized or not? Are there any examples of how to configure the quantization?
3) I am using the following code to do QAT. Did I use the GradientTape correctly?

import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize_model = tfmot.quantization.keras.quantize_model
q_aware_model = quantize_model(model)

# img, gt_score, gt_geo, ignored_map come from the data pipeline
with tf.device('/GPU:0'):
    with tf.GradientTape() as tape:
        # forward pass through the quantization-aware model
        pred_score, pred_geo = q_aware_model(img, training=True)
        classify_loss, angle_loss, iou_loss, loss = loss_tf(gt_score, pred_score, gt_geo, pred_geo, ignored_map)

    # compute and apply gradients outside the tape context
    gradients = tape.gradient(loss, q_aware_model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, q_aware_model.trainable_variables))

tucan9389 commented 11 months ago

@ardeal

Thanks for asking :-)

Q1. Typically yes, but I recommend checking yourself whether the FakeQuant nodes exist or not. As far as I know, if there is a ReLU after BatchNormalization, quantization won't be applied to the BatchNormalization itself. You can check the allowlisted layers here.
Q2-1. I believe you can check the FakeQuant nodes via TensorBoard's graph visualization.
Q2-2. Please check this link for the example and detailed guide: https://www.tensorflow.org/model_optimization/guide/quantization/training_comprehensive_guide
Q3. Once you can get the quantized model (typically a quantized tflite model) and it reaches the expected accuracy, your usage is correct.
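
For Q2-2, a rough sketch of selectively quantizing layers (the toy model and layer sizes below are just placeholders): annotate only the layers you want quantized with quantize_annotate_layer, then call quantize_apply.

import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer

# Annotate only the layers you want quantized; unannotated layers stay float.
annotated_model = tf.keras.Sequential([
    quantize_annotate_layer(tf.keras.layers.Dense(128, activation='relu', input_shape=(20,))),
    tf.keras.layers.Dense(64, activation='relu'),  # not annotated, stays float
    quantize_annotate_layer(tf.keras.layers.Dense(10)),
])

# quantize_apply inserts the FakeQuant ops only for the annotated layers.
q_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)
q_aware_model.summary()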

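For Q3, a minimal sketch of converting the fine-tuned QAT model to an actually-quantized TFLite model so you can verify its accuracy on your own evaluation data (the evaluation loop itself is up to you):

import tensorflow as tf

# Convert the quant-aware Keras model to a quantized TFLite model.
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()

with open('q_aware_model.tflite', 'wb') as f:
    f.write(quantized_tflite_model)

# Run the quantized model with the TFLite interpreter and compare its
# accuracy against the float baseline on your evaluation set.
interpreter = tf.lite.Interpreter(model_content=quantized_tflite_model)
interpreter.allocate_tensors()
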
Please let me know if you encounter additional questions.

tucan9389 commented 6 months ago

@ardeal

I'll close this issue. Please let me know if you have additional questions :)