TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German; easy to adapt to other languages)
Has anyone converted the FastSpeech2 or Tacotron models to full-integer quantized TFLite models?
My representative dataset generator for FastSpeech2 (below) triggers a floating point exception during conversion. Any ideas about what I might be doing wrong? This seems close to producing the int8x8 TFLite model. I need to run on an Arm Ethos-U55 NPU, where floating-point support is limited. For now I care less about quantization error than about profiling the model on the U55 once I have a TFLite model. If the error needs to come down later, we can use techniques like QAT (quantization-aware training).
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Provide a set of input samples that are representative of the data
    # the model will be dealing with during inference.
    for _ in range(1):  # Adjust the number of samples as needed
        # NOTE: the upper bound of np.random.randint is exclusive, so
        # randint(0, 1) produces only zeros here.
        input_ids = tf.convert_to_tensor(np.random.randint(0, 1, size=(1, 50), dtype=np.int32), dtype=tf.int32)  # Example input shape
        speaker_ids = tf.convert_to_tensor(np.array([1], dtype=np.int32), dtype=tf.int32)
        speed_ratios = tf.convert_to_tensor(np.array([1.0], dtype=np.float32), dtype=tf.float32)
        f0_ratios = tf.convert_to_tensor(np.array([1.0], dtype=np.float32), dtype=tf.float32)
        energy_ratios = tf.convert_to_tensor(np.array([1.0], dtype=np.float32), dtype=tf.float32)
        print(input_ids.shape)
        yield [input_ids, speaker_ids, speed_ratios, f0_ratios, energy_ratios]
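One thing that may be worth ruling out: `np.random.randint(0, 1, ...)` excludes its upper bound, so the `input_ids` in the generator above are all zeros, and a single constant sample gives the calibrator very little to work with. Whether that is related to the floating point exception is not established here. Below is a minimal sketch of a generator that draws varied IDs and sanity-checks the samples before conversion. `VOCAB_SIZE`, `NUM_SAMPLES`, `make_sample` and `check_samples` are hypothetical names and values for illustration, not part of TensorFlowTTS, and plain NumPy arrays are used in place of tensors.

```python
import numpy as np

# Assumed values for illustration only; take the real ones from the
# processor/config that the model was trained with.
VOCAB_SIZE = 100   # hypothetical number of input symbols
SEQ_LEN = 50       # same example length as the generator above
NUM_SAMPLES = 8    # hypothetical calibration set size


def make_sample(rng):
    """Build one calibration sample, in the same input order as above."""
    return [
        # The upper bound is exclusive, so IDs fall in [1, VOCAB_SIZE - 1].
        rng.integers(1, VOCAB_SIZE, size=(1, SEQ_LEN), dtype=np.int32),  # input_ids
        np.array([1], dtype=np.int32),      # speaker_ids
        np.array([1.0], dtype=np.float32),  # speed_ratios
        np.array([1.0], dtype=np.float32),  # f0_ratios
        np.array([1.0], dtype=np.float32),  # energy_ratios
    ]


def representative_dataset(num_samples=NUM_SAMPLES, seed=0):
    """Yield num_samples calibration samples from a seeded generator."""
    rng = np.random.default_rng(seed)
    for _ in range(num_samples):
        yield make_sample(rng)


def check_samples(samples):
    """Return a list of problems found in the calibration samples."""
    problems = []
    expected = [np.int32, np.int32, np.float32, np.float32, np.float32]
    for i, sample in enumerate(samples):
        for j, (arr, dtype) in enumerate(zip(sample, expected)):
            if arr.dtype != dtype:
                problems.append(f"sample {i}, input {j}: unexpected dtype {arr.dtype}")
            if not np.all(np.isfinite(arr)):
                problems.append(f"sample {i}, input {j}: non-finite values")
    # Constant token IDs across the whole set make for degenerate calibration data.
    all_ids = np.concatenate([s[0].ravel() for s in samples])
    if np.unique(all_ids).size < 2:
        problems.append("input_ids are constant across all samples")
    return problems


samples = list(representative_dataset())
print(check_samples(samples))
```

If the check comes back clean, the generator would then be assigned to `converter.representative_dataset`. For full-integer output the converter typically also needs `converter.optimizations = [tf.lite.Optimize.DEFAULT]` and `converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]`.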