Closed manojpreveen closed 3 years ago
I tried converting the joint-bert Keras (.h5) model in the output directory to TFLite:
```python
converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file(
    'saved_models/joint_bert_model/joint_bert_model.h5',
    custom_objects={'KerasLayer': hub.KerasLayer})
tflite_model = converter.convert()
with tf.io.gfile.GFile(os.path.join("./", 'joint_bert.tflite'), 'wb') as f:
    f.write(tflite_model)
```
This converted the Keras (.h5) joint-bert model to a TFLite model successfully, but when I tried to run inference on it I got a dimension-mismatch ValueError.
Code for printing the input and output details of this TFLite model:
```python
with tf.io.gfile.GFile("joint_bert.tflite", 'rb') as f:
    model_content = f.read()
interpreter = tf.lite.Interpreter(model_content=model_content)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details)
print(output_details)
```
Output:
```
[{'name': 'input_word_ids', 'index': 0, 'shape': array([1, 1], dtype=int32), 'shape_signature': array([-1, -1], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'input_mask', 'index': 1, 'shape': array([1, 1], dtype=int32), 'shape_signature': array([-1, -1], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'input_type_ids', 'index': 2, 'shape': array([1, 1], dtype=int32), 'shape_signature': array([-1, -1], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'valid_positions', 'index': 3, 'shape': array([ 1, 1, 73], dtype=int32), 'shape_signature': array([-1, -1, 73], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
[{'name': 'Identity', 'index': 1612, 'shape': array([1, 7], dtype=int32), 'shape_signature': array([-1, 7], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'Identity_1', 'index': 1613, 'shape': array([ 1, 1, 73], dtype=int32), 'shape_signature': array([-1, -1, 73], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
```
These input and output details for the TFLite model look wrong: note `'shape': array([1, 1], dtype=int32)` for input indices 0, 1, and 2, even though the corresponding shape signatures are dynamic (`[-1, -1]`).
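A minimal sketch of what is likely going on and how to work around it, assuming TF 2.x (the model below is a toy stand-in for joint-bert, not the real one): the TFLite interpreter defaults unknown (`None`) dimensions to 1 in `shape`, while the true signature is kept in `shape_signature`, and the interpreter can be resized to the real sequence length before allocating tensors.

```python
import numpy as np
import tensorflow as tf

# Toy stand-in: a single dynamic-length int32 input, like input_word_ids.
inp = tf.keras.Input(shape=(None,), dtype=tf.int32, name="input_word_ids")
out = tf.keras.layers.Embedding(100, 8)(inp)
model = tf.keras.Model(inp, out)

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

interpreter = tf.lite.Interpreter(model_content=tflite_model)
detail = interpreter.get_input_details()[0]
print(detail["shape"])            # dynamic dims default to 1
print(detail["shape_signature"])  # the real signature keeps -1

# Resize to the actual sequence length (here 73) before allocation.
interpreter.resize_tensor_input(detail["index"], [1, 73])
interpreter.allocate_tensors()
interpreter.set_tensor(detail["index"], np.zeros((1, 73), dtype=np.int32))
interpreter.invoke()
out_detail = interpreter.get_output_details()[0]
print(interpreter.get_tensor(out_detail["index"]).shape)  # (1, 73, 8)
```

For the real joint-bert model, the same `resize_tensor_input` call on each of the four inputs (with the tokenized sequence length) should avoid the dimension-mismatch error.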
Hi @tromedlov22. I haven't tried to convert the joint_bert and joint_albert models (the TensorFlow Hub based ones), but I have converted transformers-based models in my internal projects (not this repo).
Here are a few tips that I learned the hard way:
1- transformers models need a feature (the flex delegate) that is available in Python only starting from TF 2.4.0; since that is not released yet, you can use tf-nightly.
2- `None` in the model's input shapes is converted to 1, which is why you get shape [1, 1]:
```
[{'name': 'input_word_ids', 'index': 0, 'shape': array([1, 1], dtype=int32) ...
```
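The first tip above can be sketched as follows, assuming TF >= 2.4 (or tf-nightly) so that the flex (Select-TF-ops) kernels are available from Python; `convert_with_flex` is a hypothetical helper name, not part of this repo:

```python
import tensorflow as tf

def convert_with_flex(model):
    """Convert a Keras model to TFLite, allowing TF (flex) ops as a fallback
    for operations that have no builtin TFLite kernel."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,  # prefer builtin TFLite kernels
        tf.lite.OpsSet.SELECT_TF_OPS,    # fall back to TF (flex) kernels
    ]
    return converter.convert()

# With the model from this issue it would presumably be used as:
#   model = tf.keras.models.load_model(
#       'saved_models/joint_bert_model/joint_bert_model.h5',
#       custom_objects={'KerasLayer': hub.KerasLayer})
#   tflite_model = convert_with_flex(model)
```

Note that a model converted with `SELECT_TF_OPS` also needs the flex delegate linked into the runtime that executes it.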
I'm planning to support TFLite conversion and a serving pool for high throughput. Please keep an eye on #24 and #25. Meanwhile, if you are interested in a compressed model, I recommend the layer-pruning feature, as it is very efficient and implemented in this repo.
Please let me know if this helped.
I tried a few ways to convert the joint-albert Keras (.h5) model in the output directory to TFLite.
Error Log :
The above conversion worked and I got the TFLite model, but when I tried inference I noticed that the conversion is broken and the TFLite model's input_details and output_details are wrong.
Code for running inference on the TFLite model obtained above:
Output:
These input and output details for the TFLite model are wrong as well: again `'shape': array([1, 1], dtype=int32)` appears for input indices 0 and 1, and so on.
Is there a way to convert the joint-albert model to tflite and run inference on it? Or is this model not supported yet?
Please help.