Closed 3timesv closed 5 years ago
I ignore words and lines whose status is 'err'. See the function get_paths_and_texts() for details: https://github.com/tuandoan998/Handwriting-OCR/blob/master/Utils.py#L30
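To make the 'err' filtering concrete, here is a minimal sketch and not the repository's actual code. It assumes the field layout described in the header of IAM's words.txt (id, segmentation status, gray level, four bounding-box values, grammatical tag, transcription) and the usual words/<form prefix>/<form>/<id>.png directory layout. The function name, the `root` parameter, and the two sample lines are hypothetical and only there for illustration.

```python
def get_paths_and_texts_sketch(lines, root="words"):
    """Return (image_path, text) pairs, skipping entries marked 'err'.

    Hypothetical helper: the field positions and the directory layout are
    assumptions based on the IAM words.txt header, not taken from Utils.py.
    """
    samples = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        fields = line.split(" ")
        if len(fields) < 9:
            continue  # malformed entry
        if fields[1] == "err":
            continue  # segmentation flagged as erroneous, so ignore it
        word_id = fields[0]                      # e.g. a01-000u-00-00
        parts = word_id.split("-")
        form_dir = parts[0] + "-" + parts[1]     # e.g. a01-000u
        path = "/".join([root, parts[0], form_dir, word_id + ".png"])
        text = " ".join(fields[8:])              # transcription is the last field
        samples.append((path, text))
    return samples


# Illustrative lines in the assumed format (not real annotations).
example_lines = [
    "# comment header",
    "a01-000u-00-00 ok 154 408 768 27 51 AT A",
    "a01-000u-00-01 err 154 507 766 213 48 NN MOVE",
]
pairs = get_paths_and_texts_sketch(example_lines)
```

Only the entry marked ok survives; the 'err' entry never reaches the training list.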
Nice work. I still can't figure out how you are feeding the data, and the function get_paths_and_texts() has no inline comments either. I would like to know how you prepared the data (since the images have variable dimensions) and how you feed it to the model. Is the data fed dynamically during training?
The get_paths_and_texts() function simply collects the path and the text label (ground truth) of each image in the IAM dataset. All images are resized to the same size (w x h = 128x64 or 800x64) before being fed into the CRNN model.
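As a rough illustration of how variable-sized images can end up in one fixed-size batch, here is a small sketch. It is an assumption-laden stand-in: the repository most likely uses an image library for resizing, while this version swaps in plain nearest-neighbour sampling with NumPy to stay dependency-light. The function names and the scaling to [0, 1] are hypothetical choices, and only the 128x64 target size comes from the comment above.

```python
import numpy as np


def resize_nearest(img, width=128, height=64):
    """Resize a 2-D grayscale image to (height, width) by nearest-neighbour sampling."""
    src_h, src_w = img.shape[:2]
    # For each target row/column, pick the source row/column it maps to.
    rows = np.arange(height) * src_h // height
    cols = np.arange(width) * src_w // width
    return img[rows[:, None], cols]


def make_batch(images, width=128, height=64):
    """Stack variable-sized grayscale images into one (N, height, width, 1) float array."""
    resized = [resize_nearest(im, width, height) for im in images]
    batch = np.stack(resized).astype(np.float32) / 255.0  # assumed scaling to [0, 1]
    return batch[..., None]                               # add a channel axis


# Two images with different shapes become one uniform batch.
images = [
    np.random.randint(0, 256, size=(30, 100), dtype=np.uint8),
    np.random.randint(0, 256, size=(50, 200), dtype=np.uint8),
]
batch = make_batch(images)
```

Because every sample has the same shape after this step, batches can be assembled on the fly during training, for example inside a generator.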
In CRNN_Model.py, on this line:
https://github.com/tuandoan998/HTR-for-IAM/blob/ffa2696a744e7c2256282a8eb7712290ad9f4f5e/CRNN_Model.py#L91
why is inputs not just equal to input_data, but all of those? What is the reason? I am asking because all I have seen so far is images being passed as the inputs, with the target labels as the outputs.
Or can I do:
import tensorflow as tf
inputs =
https://github.com/tuandoan998/HTR-for-IAM/blob/ffa2696a744e7c2256282a8eb7712290ad9f4f5e/CRNN_Model.py#L26
.
.## all the remaining layers here.
.
outputs =
https://github.com/tuandoan998/HTR-for-IAM/blob/ffa2696a744e7c2256282a8eb7712290ad9f4f5e/CRNN_Model.py#L80
and then,
model = tf.keras.Model(inputs=inputs, outputs=outputs)
then do:
model.compile(....)
and then fit, as suggested in the documentation like this:
model.fit()
Is this at least close to what you have done?
@naveen-kumar-123 There are two types of model here: training (model) and inference (model_predict). Each model needs different inputs and outputs:
Model(inputs=[input_data, labels, input_length, label_length], outputs=loss_out). This model needs input_data, label_length, and the other inputs to compute the CTC loss.
Model(inputs=input_data, outputs=y_pred). This model just needs input_data to predict y_pred.
Ok. So, since the CTC loss needs all of them to compute the loss, they are all being sent. And the output of the CTC loss will be used for back-propagation, right?
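To show why the training model takes four inputs while the inference model takes one, here is a small NumPy sketch of the two computations. It is not the repository's Keras code: `ctc_batch_loss` is a hypothetical, plain-probability implementation of the standard CTC forward algorithm, with the blank assumed to be the last class index, and `greedy_decode` is a hypothetical best-path decoder. Working in plain probabilities is numerically fragile for long sequences, so treat this as an illustration only.

```python
import numpy as np


def ctc_loss_single(probs, label, blank):
    """-log p(label | probs) for one sample; probs has shape (T, C)."""
    # Interleave blanks: [blank, l1, blank, l2, ..., blank]
    ext = [blank]
    for ch in label:
        ext += [int(ch), blank]
    T, S = probs.shape[0], len(ext)
    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, ext[0]]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            total = alpha[t - 1, s]                  # stay on the same symbol
            if s >= 1:
                total += alpha[t - 1, s - 1]         # advance by one
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[t - 1, s - 2]         # skip the blank in between
            alpha[t, s] = total * probs[t, ext[s]]
    p = alpha[T - 1, S - 1] + (alpha[T - 1, S - 2] if S > 1 else 0.0)
    return -np.log(p)


def ctc_batch_loss(y_pred, labels, input_length, label_length):
    """Training side: needs predictions, padded labels, and both length vectors."""
    blank = y_pred.shape[-1] - 1                     # assumption: blank is the last class
    return np.array([
        ctc_loss_single(y_pred[i, :input_length[i]],
                        labels[i, :label_length[i]], blank)
        for i in range(len(y_pred))
    ])


def greedy_decode(y_pred_single):
    """Inference side: needs only the predictions for one sample, shape (T, C)."""
    blank = y_pred_single.shape[-1] - 1
    out, prev = [], -1
    for k in y_pred_single.argmax(axis=-1):
        if k != prev and k != blank:                 # collapse repeats, drop blanks
            out.append(int(k))
        prev = k
    return out


# Uniform predictions over 3 classes (2 characters + blank), 2 time steps.
y_pred = np.full((1, 2, 3), 1.0 / 3.0)
labels = np.array([[0, 1]])                          # second entry is padding
loss = ctc_batch_loss(y_pred, labels, np.array([2]), np.array([1]))
decoded = greedy_decode(np.eye(3)[[0, 0, 2, 1, 1, 2, 1]])
```

The length vectors tell the loss which part of each padded row is real data, which is why they travel alongside the images and labels during training. At prediction time there are no labels, so the network output alone is decoded.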
some questions:
@tuandoan998, thank you so much for taking the time to reply. I will read about it.
Could you give details of the dataset used? (Did you use the entire IAM dataset?)