TensorSpeech / TensorFlowTTS

TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German; easy to adapt to other languages)
https://tensorspeech.github.io/TensorFlowTTS/
Apache License 2.0

Full-integer quantization tflite models #802

Open · aidansmyth95 opened this issue 5 months ago

aidansmyth95 commented 5 months ago

Has anyone converted the FastSpeech2 or Tacotron models to full-integer quantized tflite models?

My representative dataset generator for FastSpeech2 is raising a floating-point exception during conversion. Any ideas about what I might be doing wrong? This seems close to enabling the int8x8 tflite model. I need to run on an Arm Ethos-U55 NPU, where floating-point support is limited. I don't care so much about quantization error for now; I mainly want to profile the model on the U55 once I have a tflite file. We can use tricks like QAT to reduce the quantization error later if needed.

def representative_dataset():
    # Provide input samples representative of the data the model will see
    # during inference. Note: np.random.randint(0, 1, ...) only ever yields
    # zeros, which gives degenerate calibration data; sample over the phoneme
    # vocabulary instead (the upper bound of 100 here is a placeholder for the
    # actual vocab size -- better still, yield real tokenized sentences).
    for _ in range(1):  # Adjust the number of samples as needed
        input_ids = tf.convert_to_tensor(np.random.randint(0, 100, size=(1, 50), dtype=np.int32), dtype=tf.int32)
        speaker_ids = tf.convert_to_tensor(np.array([1], dtype=np.int32), dtype=tf.int32)
        speed_ratios = tf.convert_to_tensor(np.array([1.0], dtype=np.float32), dtype=tf.float32)
        f0_ratios = tf.convert_to_tensor(np.array([1.0], dtype=np.float32), dtype=tf.float32)
        energy_ratios = tf.convert_to_tensor(np.array([1.0], dtype=np.float32), dtype=tf.float32)
        yield [input_ids, speaker_ids, speed_ratios, f0_ratios, energy_ratios]

converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.representative_dataset = representative_dataset
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
# Possibly unnecessary: supported_types is mainly used for float16
# quantization, not the full-integer int8 path.
converter.target_spec.supported_types = [tf.int8]