boncey / Flickr4Java


How to recognize my image using model trained by IAM dataset #658

Closed ahmedalkaddo closed 2 years ago

ahmedalkaddo commented 2 years ago

After training the model on the IAM dataset, I tried running inference on several images, but the prediction is always wrong, even when the input is one of the IAM sample images.

Below is my preprocessing and prediction code:

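import numpy as np
import cv2
import tensorflow as tf
from tensorflow import keras

# Load the image as single-channel grayscale
img = cv2.imread('from.png', cv2.IMREAD_GRAYSCALE)

# Smooth noise with a 5x5 Gaussian kernel
img = cv2.GaussianBlur(img, (5, 5), 0)

# Stretch the contrast to the full 0-255 range (the result is float64)
pxmin = np.min(img)
pxmax = np.max(img)
imgContrast = (img - pxmin) / (pxmax - pxmin) * 255

# Apply one pass of 3x3 erosion
kernel = np.ones((3, 3), np.uint8)
imgMorph = cv2.erode(imgContrast, kernel, iterations=1)

# Resize to width 128, height 32; the resulting array shape is (32, 128)
final_image = cv2.resize(imgMorph, (128, 32))

print(final_image.shape)
showMe(final_image)

# Reshape to (batch, 128, 32, 1) for the model
final_image = np.reshape(final_image, (-1, 128, 32, 1))
print(final_image.shape)

preds = prediction_model.predict(final_image)
pred_texts = decode_batch_predictions(preds)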

print(pred_texts)

The decoding function:

def decode_batch_predictions(pred):
    input_len = np.ones(pred.shape[0]) * pred.shape[1]

    results = keras.backend.ctc_decode(pred, input_length=input_len, greedy=True)[0][0][
        :, :max_len
    ]
    output_text = []
    for res in results:
        res = tf.gather(res, tf.where(tf.math.not_equal(res, -1)))
        res = tf.strings.reduce_join(num_to_char(res)).numpy().decode("utf-8")
        output_text.append(res)
    return output_text
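
For readers unfamiliar with what greedy CTC decoding does inside `decode_batch_predictions`, here is a minimal NumPy sketch of the idea: take the most likely class at each time step, collapse consecutive repeats, then drop the blank label. The function name `greedy_ctc_decode`, the toy vocabulary, and the choice of the last class as the blank are assumptions for illustration; this is not the Keras implementation.

```python
import numpy as np

def greedy_ctc_decode(probs, blank_index):
    """Greedy CTC decode for one sample.

    probs: array of shape (time_steps, num_classes).
    Returns the list of label indices after collapsing repeats and removing blanks.
    """
    best_path = np.argmax(probs, axis=1)  # most likely class per time step
    decoded = []
    previous = None
    for label in best_path:
        # Keep a label only if it differs from the previous step and is not the blank
        if label != previous and label != blank_index:
            decoded.append(int(label))
        previous = label
    return decoded

# Hypothetical vocabulary; the blank is assumed to be the last class (index 3)
vocab = ["a", "b", "c"]
blank = len(vocab)

# One-hot "probabilities" for the path a a _ a b b _
path = [0, 0, 3, 0, 1, 1, 3]
toy_probs = np.eye(len(vocab) + 1)[path]

labels = greedy_ctc_decode(toy_probs, blank)
text = "".join(vocab[i] for i in labels)
print(labels, text)  # [0, 0, 1] aab
```

The blank between the two runs of `a` is what lets a doubled letter survive the collapse step. Because the decoder only reads per-time-step scores, a wrong string such as `['ilbi']` usually points at the model input rather than at the decoding code.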

The output:

(32, 128)
(1, 128, 32, 1)
['ilbi']

I also tried rotating the image, but the prediction was still wrong.
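
One thing worth checking in the code above is the `np.reshape` call. Reshaping a `(32, 128)` array to `(128, 32)` does not rotate or transpose the image; it reinterprets the same row-major buffer, so the pixels end up scrambled even though the printed shape looks right. The sketch below shows the difference using NumPy only. The helper `to_model_input` is hypothetical, and it assumes the model was trained on images transposed to (width, height) and scaled to [0, 1]; whether that matches the actual training pipeline needs to be confirmed against the training code.

```python
import numpy as np

# Stand-in for the (32, 128) grayscale array returned by cv2.resize(img, (128, 32))
height, width = 32, 128
image = (np.arange(height * width) % 256).astype(np.float32).reshape(height, width)

# What the snippet does: reinterpret the buffer as 128 rows of 32 values
reshaped = np.reshape(image, (-1, width, height, 1))

# Swapping the axes instead keeps every pixel next to its original neighbours
transposed = image.T[np.newaxis, :, :, np.newaxis]

print(reshaped.shape, transposed.shape)      # (1, 128, 32, 1) (1, 128, 32, 1)
print(np.array_equal(reshaped, transposed))  # False

# Original pixel at row 0, column 1 has value 1.0
print(transposed[0, 1, 0, 0])  # 1.0  (same pixel, axes swapped)
print(reshaped[0, 1, 0, 0])    # 32.0 (actually the pixel at row 0, column 32)

def to_model_input(gray):
    """Hypothetical helper: scale to [0, 1] and lay out as (1, width, height, 1).

    Assumes training used transposed, [0, 1]-scaled images.
    """
    scaled = gray.astype(np.float32) / 255.0
    return scaled.T[np.newaxis, :, :, np.newaxis]

model_input = to_model_input(image)
print(model_input.shape, float(model_input.max()) <= 1.0)  # (1, 128, 32, 1) True
```

If training also applied other steps (padding to preserve the aspect ratio, flipping, or no blur and erosion), inference needs the same steps in the same order; reusing the training preprocessing function directly is the safest way to keep the two consistent.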