tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0

Post-training quantization of the FastSpeech2 TTS model fails #1010

Open zhuyu14 opened 1 year ago

zhuyu14 commented 1 year ago

Hi Everyone,

I tried to post-training quantize the FastSpeech2 model from TensorFlowTTS, but got the error message below. I was wondering if anyone knows the details of this error and how to fix it? Thanks!

Python version: 3.8.10, TensorFlow version: 2.8.0

test code:

    import tensorflow as tf
    import yaml
    import numpy as np
    from tensorflow_tts.configs import FastSpeech2Config
    from tensorflow_tts.models import TFFastSpeech2

    def representative_data_gen():
        # A single tokenized utterance used as the calibration sample.
        data = np.array([38, 51, 41, 11, 51, 52, 57, 11, 57, 2, 11, 45, 46, 56, 11, 55, 38, 40, 42, 7, 148])
        data = data.reshape(1, -1)
        input_ids = tf.convert_to_tensor(data, dtype=tf.int32)

        # Inputs: input_ids, speaker_ids, speed_ratios, f0_ratios, energy_ratios.
        yield [input_ids,
               tf.convert_to_tensor([0], tf.int32),
               tf.convert_to_tensor([1.0], dtype=tf.float32),
               tf.convert_to_tensor([1.0], dtype=tf.float32),
               tf.convert_to_tensor([1.0], dtype=tf.float32)]

    with open('config.yml') as f:
        config = yaml.load(f, Loader=yaml.Loader)

    config = FastSpeech2Config(**config["fastspeech2_params"])
    fastspeech2 = TFFastSpeech2(config=config, enable_tflite_convertible=True, name="fastspeech2")
    fastspeech2._build()
    fastspeech2.load_weights("model.h5")

    # Full integer (int8) post-training quantization.
    fastspeech2_concrete_function = fastspeech2.inference_tflite.get_concrete_function()
    converter = tf.lite.TFLiteConverter.from_concrete_functions([fastspeech2_concrete_function])
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_data_gen
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    tflite_model_quant = converter.convert()

error log:

    2022-08-23 09:18:33.257335: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      function_optimizer: function_optimizer did nothing. time = 0.011ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.

    /usr/local/lib/python3.8/dist-packages/tensorflow/lite/python/convert.py:746: UserWarning: Statistics for quantized inputs were expected, but not specified; continuing anyway.
      warnings.warn("Statistics for quantized inputs were expected, but not "
    2022-08-23 09:18:35.605349: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:357] Ignored output_format.
    2022-08-23 09:18:35.605393: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:360] Ignored drop_control_dependency.
    fully_quantize: 0, inference_type: 6, input_inference_type: 9, output_inference_type: 9
    error: illegal scale: INF
    Segmentation fault (core dumped)

The model definition and weights are from here: https://huggingface.co/tensorspeech/tts-fastspeech2-ljspeech-en/tree/main https://github.com/TensorSpeech/TensorFlowTTS

thaink commented 1 year ago

"error: illegal scale: INF" probably means there is an INF value in some tensor in the model.
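As a first check (a minimal sketch, assuming the fastspeech2 Keras model from the script above is still loaded), you could scan the model's weights for non-finite values before converting:

    import numpy as np

    # Scan every weight of the loaded Keras model for inf/NaN values.
    for w in fastspeech2.weights:
        values = w.numpy()
        if values.dtype.kind == "f" and not np.isfinite(values).all():
            print("non-finite values in weight:", w.name)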

zhuyu14 commented 1 year ago

Hi @thaink, is there any way to know which tensor gets the illegal value when quantizing the model?

thaink commented 1 year ago

It doesn't seem straightforward right now; the model is converted to TFLite before quantization, so we might have to add logging somewhere in the calibration part.
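As a rough workaround (a minimal sketch, not an official debugging API; the filename below is hypothetical), you could load the already-converted float TFLite model with the Python Interpreter and scan its constant tensors for non-finite values:

    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="fastspeech2_float.tflite")  # hypothetical path
    interpreter.allocate_tensors()

    for detail in interpreter.get_tensor_details():
        try:
            values = interpreter.get_tensor(detail["index"])
        except ValueError:
            # Intermediate tensors have no data before inference runs; skip them.
            continue
        if np.issubdtype(values.dtype, np.floating) and not np.isfinite(values).all():
            print("non-finite values in tensor", detail["index"], detail["name"])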

By the way, there seems to be a TFLite example for TensorFlowTTS (https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/cpptflite). Isn't that suitable for your case?

zhuyu14 commented 1 year ago

Hi @thaink, yes, the model is from the site you pointed to, but the model there is a float TFLite model. In my case, I need an int8 TFLite model to accelerate inference speed.

thaink commented 1 year ago

I see. @sngyhan, the TTS model will require dynamic-range quantization, right? Can you guide @zhuyu14 on how to do that?
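For reference, a minimal sketch of dynamic-range quantization, reusing fastspeech2_concrete_function from the original script (the output filename is hypothetical):

    import tensorflow as tf

    # Dynamic-range quantization: weights are stored as int8 offline,
    # activations stay float at runtime. No representative dataset and
    # no int8 input/output types are set.
    converter = tf.lite.TFLiteConverter.from_concrete_functions([fastspeech2_concrete_function])
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model_dynamic = converter.convert()

    with open("fastspeech2_dynamic.tflite", "wb") as f:
        f.write(tflite_model_dynamic)

Note that with dynamic-range quantization the model still takes float inputs and produces float outputs, so the int8 input/output settings from the original script do not apply.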