Open MrRace opened 1 year ago
Does the post-training full integer quantization described at https://www.tensorflow.org/lite/performance/post_training_integer_quant#convert_using_float_fallback_quantization support BERT? I converted my pb model to TF Lite as follows:
    import os

    import numpy as np
    import tensorflow as tf

    dataset = create_dataset()

    def representative_dataset():
        # Yield one calibration sample per call, keyed by model input name
        for data in dataset:
            yield {
                "token_type_ids": np.array(data.segment_ids),
                "attention_mask": np.array(data.input_mask),
                "input_ids": np.array(data.input_ids),
            }

    converter = tf.lite.TFLiteConverter.from_saved_model(pb_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    tflite_quant_model = converter.convert()

    tflite_path = res_tf_lite_file
    with open(tflite_path, "wb") as f:
        f.write(tflite_quant_model)
    assert os.path.exists(tflite_path)
    print("tflite model={} converted successfully.".format(tflite_path))

    interpreter = tf.lite.Interpreter(model_path=tflite_path)
    # Get input and output tensors
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    print(f'tflite input {input_details}')
    print(f'tflite output {output_details}')
I used the float fallback quantization from https://www.tensorflow.org/lite/performance/post_training_integer_quant. However, the output of the quantized model is completely different from the output of the non-quantized model. Can anyone help? Thanks a lot!
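To make "completely different" measurable, one option is to run the same input through both models and compare the flattened outputs numerically. The sketch below is only an illustration: `compare_outputs` is a hypothetical helper (not part of TensorFlow), and the two lists are made-up values standing in for outputs that could be obtained with something like `interpreter.get_tensor(output_details[0]["index"]).flatten().tolist()`.

```python
import math


def compare_outputs(float_out, quant_out):
    """Return (max_abs_diff, cosine_similarity) for two equal-length sequences of floats."""
    if len(float_out) != len(quant_out):
        raise ValueError("outputs must have the same length")
    # Largest element-wise deviation between the two outputs
    max_abs_diff = max(
        (abs(a - b) for a, b in zip(float_out, quant_out)), default=0.0
    )
    # Cosine similarity: near 1.0 means the outputs point in the same direction
    dot = sum(a * b for a, b in zip(float_out, quant_out))
    norm_f = math.sqrt(sum(a * a for a in float_out))
    norm_q = math.sqrt(sum(b * b for b in quant_out))
    cosine = dot / (norm_f * norm_q) if norm_f and norm_q else float("nan")
    return max_abs_diff, cosine


# Made-up example values (assumption): replace with the real model outputs
float_logits = [2.0, -1.0, 0.5]
quant_logits = [1.9, -1.1, 0.6]

max_diff, cos = compare_outputs(float_logits, quant_logits)
print(f"max abs diff = {max_diff:.4f}, cosine similarity = {cos:.4f}")
```

A small maximum difference with a cosine similarity near 1.0 would suggest ordinary quantization noise, while a low similarity would point to a problem in conversion or calibration, for example representative samples whose shape or dtype does not match the model inputs.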
@yyoon Could you help to solve it? Thanks a lot!