weinman / cnn_lstm_ctc_ocr

Tensorflow-based CNN+LSTM trained with CTC-loss for OCR
GNU General Public License v3.0

Not getting good results #57

Closed: dreamflywhere closed this issue 3 years ago

dreamflywhere commented 5 years ago

I learned a lot from your project. Thank you very much for sharing it. I have two problems.

1. Predictions look like this: {'labels': array([30], dtype=int64), 'score': array([0.68720126], dtype=float32)} e 0.68720126

The correct text is "Terminates", but 'e' or 'E' is always predicted.

2. The test run prints: {'loss': 797.9445, 'mean_label_error': 0.9330578512396694, 'mean_loss': 28.625813, 'mean_sequence_error': 1.0, 'total_loss': 25534.225, 'total_num_label_errors': 6774, 'total_num_labels': 7260, 'total_num_sequence_errs': 892, 'total_num_sequences': 892, 'global_step': 304060}

My attempt gives bad results.

I don't know why. I looked over the code but found nothing that explains this. My Python version is 3.0 or later. Could that be the cause?
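To make the numbers above easier to read, here is a small sketch that recomputes the mean values from the totals in the printed dictionary. The formulas are an assumption inferred from the figures themselves (they reproduce the printed means), not taken from the repository's evaluation code.

```python
# Totals copied from the test output printed above.
metrics = {
    "total_loss": 25534.225,
    "total_num_label_errors": 6774,
    "total_num_labels": 7260,
    "total_num_sequence_errs": 892,
    "total_num_sequences": 892,
}

# Assumed relationships: each mean is a total divided by its count.
label_error = metrics["total_num_label_errors"] / metrics["total_num_labels"]
sequence_error = metrics["total_num_sequence_errs"] / metrics["total_num_sequences"]
mean_loss = metrics["total_loss"] / metrics["total_num_sequences"]

print(f"label error:    {label_error:.4f}")
print(f"sequence error: {sequence_error:.4f}")
print(f"mean loss:      {mean_loss:.4f}")
```

Read this way, roughly 93% of the individual characters are wrong and every one of the 892 test sequences contains at least one error.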

dreamflywhere commented 5 years ago

Another question: how can I get the value of a tensor produced at an intermediate step? For example, when the tf.estimator.Estimator API is used, how can I print the logits and labels values inside the train_fn function?

Without Estimator, I can print them with a session. But with Estimator, I cannot return the logits tensor that way. I don't know how Estimator works under the hood.
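For the Estimator question, one common approach in TensorFlow 1.x is to attach a `tf.train.LoggingTensorHook` to the `EstimatorSpec` that the model function returns. The sketch below is only an illustration under stated assumptions: it assumes TensorFlow 1.x, and `model_fn`, the `"x"` feature key, and the single dense layer are hypothetical stand-ins, not this repository's code.

```python
def model_fn(features, labels, mode):
    """Hypothetical model_fn; covers TRAIN mode only (labels must be given)."""
    # Assumes TensorFlow 1.x; imported here so the sketch loads without it.
    import tensorflow as tf

    # A single dense layer stands in for the real CNN+LSTM network.
    logits = tf.layers.dense(features["x"], units=3, name="logits")
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(
        loss, global_step=tf.train.get_global_step())

    # The hook evaluates the listed tensors and logs them every 100 steps.
    hook = tf.train.LoggingTensorHook(
        tensors={"logits": logits, "labels": labels}, every_n_iter=100)

    return tf.estimator.EstimatorSpec(
        mode=mode, loss=loss, train_op=train_op, training_hooks=[hook])
```

The logged values only show up if the log level is INFO, for example after calling `tf.logging.set_verbosity(tf.logging.INFO)` before training.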

dreamflywhere commented 5 years ago

labels:
[[ 8 38 41 30 26 28 33 34 39 32 0 0 0]
 [31 37 26 38 30 41 43 40 40 31 30 29 0]
 [44 46 41 43 30 38 26 28 50 0 0 0 0]
 [43 30 38 40 29 30 37 30 43 44 0 0 0]
 [15 30 43 28 34 41 34 30 39 28 30 0 0] ...
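To check what these rows contain, here is a minimal decoding sketch. The character set is an assumption: it places A-Z at indices 0 to 25 and a-z at 26 to 51, which is consistent with label 30 being reported as 'e' in the first comment but is not confirmed anywhere in this thread. The `decode` helper is hypothetical.

```python
import string

# Assumed charset (not confirmed by the thread): A-Z at 0..25, a-z at 26..51.
CHARSET = string.ascii_uppercase + string.ascii_lowercase

def decode(row, length=None):
    """Map label indices to characters; `length` trims padding when known."""
    if length is not None:
        row = row[:length]
    return "".join(CHARSET[i] for i in row)

first_row = [8, 38, 41, 30, 26, 28, 33, 34, 39, 32, 0, 0, 0]
print(decode(first_row, length=10))  # Impeaching
print(decode(first_row))             # ImpeachingAAA
```

Under that assumption the trailing zeros are not neutral: index 0 is itself a character, so untrimmed padding reads as extra 'A' labels.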

I asked someone else for help, and he gave me a suggestion:

The label padding is not right. The CTC blank should come after the labels. Try not setting boundaries and specify the padding value explicitly.

Can somebody help me?
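One possible reading of this suggestion is to pad with a value that cannot collide with a real label, and to drop the padding before the labels reach the CTC loss. The sketch below shows that idea in plain Python. The sentinel `PAD = -1` and the helpers `pad_labels` and `to_sparse` are hypothetical names, and this is not necessarily how the repository builds its labels.

```python
PAD = -1  # hypothetical sentinel; real label indices are never negative

def pad_labels(seqs, pad=PAD):
    """Right-pad variable-length label lists to a common width."""
    width = max(len(s) for s in seqs)
    return [list(s) + [pad] * (width - len(s)) for s in seqs]

def to_sparse(padded, pad=PAD):
    """Return (indices, values, dense_shape) with padding entries dropped."""
    indices, values = [], []
    for r, row in enumerate(padded):
        for c, v in enumerate(row):
            if v != pad:
                indices.append((r, c))
                values.append(v)
    return indices, values, (len(padded), len(padded[0]))

padded = pad_labels([[8, 38], [0, 5, 0]])
indices, values, shape = to_sparse(padded)
print(padded)   # [[8, 38, -1], [0, 5, 0]]
print(values)   # [8, 38, 0, 5, 0]
```

The three returned pieces have the same layout as the arguments of `tf.SparseTensor`, which is the label format that `tf.nn.ctc_loss` takes in TensorFlow 1.x. Genuine zeros in the second sequence survive, while the padding entry is removed.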