faustomorales / keras-ocr

A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model.
https://keras-ocr.readthedocs.io/
MIT License

Very inaccurate results with keras-ocr tflite model #230

Open yashcfg opened 1 year ago

yashcfg commented 1 year ago

I need to detect text in my Android app.

I followed the fine-tuning the recognizer guide and trained my recognizer on the Born-Digital dataset.

Then I converted the prediction_model from the previous step to TFLite:

import tensorflow as tf

tflite_name = 'test1.tflite'

converter = tf.lite.TFLiteConverter.from_keras_model(recognizer.prediction_model)
# Allow falling back to TF ops (Flex) for ops without native TFLite kernels.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
converter._experimental_lower_tensor_list_ops = False
converter.target_spec.supported_types = [tf.float32]
tflite_model = converter.convert()
with open(tflite_name, "wb") as f:
    f.write(tflite_model)
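
As a quick sanity check on the converted file, the interpreter's reported input/output details can be printed (a minimal sketch using the standard tf.lite.Interpreter API, reading the test1.tflite file written above):

# Sketch: inspect what the converted model actually expects and returns.
interpreter = tf.lite.Interpreter(model_path=tflite_name)
interpreter.allocate_tensors()
for detail in interpreter.get_input_details():
    print('input :', detail['shape'], detail['dtype'])
for detail in interpreter.get_output_details():
    print('output:', detail['shape'], detail['dtype'])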

import cv2
import numpy as np

def run_tflite_model(image_path, quantization):  # note: `quantization` is currently unused
    # Load as grayscale and resize to the recognizer's fixed input size
    # (width=200, height=31), giving an array of shape (31, 200).
    input_data = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    input_data = cv2.resize(input_data, (200, 31))
    # Add batch and channel dimensions -> (1, 31, 200, 1), scaled to [0, 1].
    input_data = input_data[np.newaxis, ..., np.newaxis].astype('float32') / 255

    # Uses the global tflite_name from the conversion step above.
    interpreter = tf.lite.Interpreter(model_path=tflite_name)
    interpreter.allocate_tensors()

    # Get input and output tensors.
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()

    output = interpreter.get_tensor(output_details[0]['index'])
    return output

alphabets = ["0","1","2","3","4","5","6","7","8","9","a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]
blank_index = 36
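
(For context: this hard-coded alphabet matches what I believe is the keras-ocr default, digits followed by lowercase letters, with the CTC blank label coming right after the 36 symbols. The equivalent programmatic form, assuming the recognizer was built with that default alphabet:)

import string

# Assumes the default keras-ocr alphabet: digits, then lowercase ascii letters.
alphabets = list(string.digits + string.ascii_lowercase)  # 36 symbols
blank_index = len(alphabets)  # the CTC blank label follows the alphabet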

for i in range(1, 20):
    image_path = 'borndigital/test/word_' + str(i) + '.png'
    tflite_output = run_tflite_model(image_path, 'dr')
    final_output = "".join(alphabets[index] for index in tflite_output[0] if index not in [blank_index, -1])
    print("lite model - " + final_output)
    predicted = recognizer.recognize(image_path)
    print("non lite model - " + predicted)

I expected some difference in the results, but did not expect it to be this large (from the logs):

lite model - loeaxolea
non lite model - bada

lite model - deveioper
non lite model - developer

lite model - ldedoscaly
non lite model - day

lite model - hhonors
non lite model - hhonors

lite model - nomusnon
non lite model - mluron

lite model - waniewoe
non lite model - wonlwoe

lite model - ihiarsnls
non lite model - thank

lite model - vicodiilino
non lite model - you

lite model - lirayuvrel
non lite model - travel

lite model - insurance
non lite model - insurance

lite model - ineclilil
non lite model - will

lite model - rirgtecte
non lite model - protect

lite model - myefscrdsilirin
non lite model - you

lite model - emdncnl
non lite model - and

lite model - yicriens
non lite model - your

lite model - inelfllegads
non lite model - trip

lite model - lfifosirm
non lite model - from

lite model - lmcexkse
non lite model - unex

lite model - ercecieegc
non lite model - pectd

As you can see, in almost every case the TFLite model predicted incorrectly, whereas the original recognizer's predictions are fine.

Can you suggest what changes I can make to improve the accuracy?

If you need more info, I can provide the complete code and logs. All training and prediction were done on macOS (Apple M1) with the following versions:

tensorflow-datasets       4.8.3
tensorflow-deps           2.10.0
tensorflow-estimator      2.9.0
tensorflow-macos          2.9.0
tensorflow-metadata       1.12.0 
tensorflow-metal          0.5.0 

Thanks

yashcfg commented 1 year ago

Not sure if this helps, but when converting from h5 to TFLite I noticed this warning:

2023-03-29 12:44:25.359575: W tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1901] TFLite interpreter needs to link Flex delegate in order to run the model since it contains the following Select TFop(s):
Flex ops: FlexCTCGreedyDecoder, FlexMatMul, FlexTensorListFromTensor, FlexTensorListGetItem, FlexTensorListReserve, FlexTensorListSetItem, FlexTensorListStack
Details:
    tf.CTCGreedyDecoder(tensor<?x?x37xf32>, tensor<?xi32>) -> (tensor<?x2xi64>, tensor<?xi64>, tensor<2xi64>, tensor<?x1xf32>) : {T = f32, blank_index = -1 : i64, device = "", merge_repeated = true}
    tf.MatMul(tensor<?x1xi32>, tensor<350x1xi32>) -> (tensor<?x350xi32>) : {transpose_a = false, transpose_b = true}
    tf.TensorListFromTensor(tensor<?x?x128xf32>, tensor<2xi32>) -> (tensor<!tf_type.variant<tensor<?x128xf32>>>) : {device = ""}
    tf.TensorListGetItem(tensor<!tf_type.variant<tensor<?x128xf32>>>, tensor<i32>, tensor<2xi32>) -> (tensor<?x128xf32>) : {device = ""}
    tf.TensorListReserve(tensor<2xi32>, tensor<i32>) -> (tensor<!tf_type.variant<tensor<?x128xf32>>>) : {device = ""}
    tf.TensorListSetItem(tensor<!tf_type.variant<tensor<?x128xf32>>>, tensor<i32>, tensor<?x128xf32>) -> (tensor<!tf_type.variant<tensor<?x128xf32>>>) : {device = ""}
    tf.TensorListStack(tensor<!tf_type.variant<tensor<?x128xf32>>>, tensor<2xi32>) -> (tensor<?x?x128xf32>) : {device = "", num_elements = -1 : i64}
See instructions: https://www.tensorflow.org/lite/guide/ops_select
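
Per the linked guide, this warning is expected when using SELECT_TF_OPS: it means the Android app must also link the Select TF ops runtime (the tensorflow-lite-select-tf-ops dependency), or these Flex ops will fail to load on-device. To double-check which ops actually ended up in the flatbuffer, the TFLite model analyzer can be used (a sketch; tf.lite.experimental.Analyzer is available in recent TF releases, 2.9 included if I'm not mistaken):

import tensorflow as tf

# Lists every op baked into the flatbuffer; the Flex* entries are the ones
# that need the Select TF ops (Flex delegate) runtime at inference time.
tf.lite.experimental.Analyzer.analyze(model_path='test1.tflite')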