mozilla / DeepSpeech

DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.

How to convert pre-trained deepspeech-0.9.3-models.tflite from float32 to int8 or int16? #3807

Open · fzhou-1206 opened this issue 2 months ago

fzhou-1206 commented 2 months ago

I want to run deepspeech.tflite on an NPU, so I need deepspeech-0.9.3-models.tflite in int8 or int16 format. How can I convert the existing float32 deepspeech-0.9.3-models.tflite to an integer type?

Mari-selvam commented 1 week ago

import tensorflow as tf

# Load the float32 TFLite model
# (note: this flatbuffer is read here but cannot be fed back into the
# converter; conversion must start from the original SavedModel, see below)
with open("deepspeech-0.9.3-models.tflite", "rb") as f:
    tflite_model = f.read()

# Initialize the TFLite converter from the original model directory
# ("path_to_saved_model" must be a directory containing a saved_model.pb)
converter = tf.lite.TFLiteConverter.from_saved_model("path_to_saved_model")

# Apply integer quantization
# For int8 (int16 activations require the experimental 16x8 ops set,
# tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8,
# instead of TFLITE_BUILTINS_INT8):
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

# Representative dataset generator used to calibrate the quantization ranges
def representative_data_gen():
    # This function should yield data with the same shape and dtype as the model input
    for _ in range(100):  # adjust according to your dataset
        yield [your_input_data]  # replace `your_input_data` with actual samples

converter.representative_dataset = representative_data_gen

# Convert the model
quantized_tflite_model = converter.convert()

# Save the quantized model
with open("deepspeech-quantized.tflite", "wb") as f:
    f.write(quantized_tflite_model)
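A quick way to verify the result (a minimal sketch, assuming the script above produced deepspeech-quantized.tflite) is to load it with the TFLite interpreter and check the tensor dtypes:

import tensorflow as tf

# Print the input/output dtypes of the quantized model
interpreter = tf.lite.Interpreter(model_path="deepspeech-quantized.tflite")
interpreter.allocate_tensors()

for detail in interpreter.get_input_details():
    print("input:", detail["name"], detail["dtype"])   # expect int8 after full-integer quantization
for detail in interpreter.get_output_details():
    print("output:", detail["name"], detail["dtype"])
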
fzhou-1206 commented 1 week ago

Hi All,

Thanks so much for your reply! But I ran into an issue with the following script; I'm not sure if you have any suggestions.

For the line converter = tf.lite.TFLiteConverter.from_saved_model("path_to_saved_model"), what is "path_to_saved_model"?

  1. I tried converter = tf.lite.TFLiteConverter.from_saved_model("deepspeech-0.9.3-models.tflite") and it reports the error "OSError: SavedModel file does not exist at: deepspeech-0.9.3-models.tflite{saved_model.pbtxt|saved_model.pb}".
  2. I tried converter = tf.lite.TFLiteConverter.from_saved_model(tflite_model) and it reports the error "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xac in position 36: invalid start byte".

Thanks, Freda
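Both errors stem from the same root cause: tf.lite.TFLiteConverter.from_saved_model expects a directory containing a saved_model.pb, and deepspeech-0.9.3-models.tflite is an already-converted flatbuffer that cannot be fed back into the converter, either as a path or as raw bytes. Quantization has to start from the original model, e.g. the published deepspeech-0.9.3-checkpoint re-exported with the DeepSpeech training code (DeepSpeech.py with --checkpoint_dir and --export_dir produces a frozen output_graph.pb). Below is a minimal sketch of quantizing such a frozen graph; the graph path, tensor names, and input shape are all assumed placeholders, so read the real ones from the export logs or a graph viewer such as Netron:

import numpy as np
import tensorflow as tf

# All paths, tensor names, and shapes below are assumed placeholders.
# TF1-style frozen graphs are handled by the compat.v1 converter.
converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="output_graph.pb",   # frozen graph exported from the checkpoint
    input_arrays=["input_node"],        # assumed input tensor name(s)
    output_arrays=["logits"],           # assumed output tensor name(s)
    input_shapes={"input_node": [1, 16, 19, 26]},  # assumed shape; check the export
)

# Calibration data: random windows stand in here; use real feature
# windows from your audio pipeline for meaningful quantization ranges
def representative_data_gen():
    for _ in range(100):
        yield [np.random.rand(1, 16, 19, 26).astype(np.float32)]

converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.representative_dataset = representative_data_gen

with open("deepspeech-quantized.tflite", "wb") as f:
    f.write(converter.convert())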
