emedvedev / attention-ocr

A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.
MIT License

Predictions broken? #134

Closed louis030195 closed 5 years ago

louis030195 commented 5 years ago

I've trained aocr on the synth90k dataset (default hyperparameters). It reached 82% accuracy on the test set, then I tried to predict some images from the synth90k dataset:

!aocr predict --model-dir /content/exported-model

[attached screenshot of the prediction output]

Is it supposed to work? I also tried with another dataset, with nothing better...
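For context, a minimal sketch of how a synth90k annotations file for aocr dataset can be put together, assuming the usual synth90k filename convention where the ground-truth word sits between underscores; the paths and helper names here are illustrative:

```python
# Minimal sketch: turn synth90k image names into an aocr annotations file.
# Assumption: synth90k filenames embed the ground-truth word between
# underscores, e.g. "2697/6/466_MONIKER_49537.jpg" -> "MONIKER".
# Paths and helper names are illustrative.
import os

def synth90k_label(image_path):
    # "466_MONIKER_49537.jpg" -> "MONIKER"
    return os.path.basename(image_path).split("_")[1]

def write_annotations(image_paths, out_path="annotations-training.txt"):
    # aocr dataset expects one "<image path>\t<label>" pair per line.
    with open(out_path, "w") as f:
        for path in image_paths:
            f.write("{}\t{}\n".format(path, synth90k_label(path)))

# Example:
# write_annotations(["/data/synth90k/2697/6/466_MONIKER_49537.jpg"])
# followed by: aocr dataset annotations-training.txt training.tfrecords
```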

emedvedev commented 5 years ago

It's supposed to work, and 82% accuracy on synth90k is actually pretty low. Could you also include the training/testing logs and the exact commands you use for training and testing? I may not have time to take a look right away, but that would help in any case.

louis030195 commented 5 years ago

Training: aocr train --initial-learning-rate=1 training.tfrecords

Sample log

2019-06-07 23:19:19,182 root INFO Step 25292: 0.212s, loss: 0.316488, perplexity: 1.372300.
2019-06-07 23:19:19,409 root INFO Step 25293: 0.214s, loss: 0.368399, perplexity: 1.445419.
2019-06-07 23:19:19,636 root INFO Step 25294: 0.211s, loss: 0.303288, perplexity: 1.354305.
2019-06-07 23:19:19,861 root INFO Step 25295: 0.213s, loss: 0.395718, perplexity: 1.485451.
2019-06-07 23:19:20,080 root INFO Step 25296: 0.207s, loss: 0.345288, perplexity: 1.412397.
2019-06-07 23:19:20,299 root INFO Step 25297: 0.208s, loss: 0.374361, perplexity: 1.454062.
2019-06-07 23:19:20,520 root INFO Step 25298: 0.208s, loss: 0.251078, perplexity: 1.285411.
2019-06-07 23:19:20,741 root INFO Step 25299: 0.210s, loss: 0.279711, perplexity: 1.322748.
2019-06-07 23:19:20,959 root INFO Step 25300: 0.207s, loss: 0.297337, perplexity: 1.346269.
2019-06-07 23:19:20,960 root INFO Global step 536251. Time: 0.209, loss: 0.313579, perplexity: 1.37.

Testing: aocr test --visualize ./testing.tfrecords

Sample log

2019-06-08 08:16:28,573 root INFO Step 78 (0.039s). Accuracy: 86.15%, loss: 0.054394, perplexity: 1.05590, probability: 72.15% 100% (24920)
2019-06-08 08:16:28,944 root INFO Step 79 (0.040s). Accuracy: 86.33%, loss: 0.044648, perplexity: 1.04566, probability: 76.50% 100% (64030)
2019-06-08 08:16:29,102 root INFO Step 80 (0.039s). Accuracy: 86.25%, loss: 0.649906, perplexity: 1.91536, probability: 2.39% 80% (82154 vs 82144)
2019-06-08 08:16:29,580 root INFO Step 81 (0.041s). Accuracy: 86.17%, loss: 0.315767, perplexity: 1.37131, probability: 40.29% 80% (66340 vs 66341)
2019-06-08 08:16:29,740 root INFO Step 82 (0.040s). Accuracy: 86.34%, loss: 0.029022, perplexity: 1.02945, probability: 84.02% 100% (84109)
2019-06-08 08:16:30,003 root INFO Step 83 (0.040s). Accuracy: 86.27%, loss: 0.200633, perplexity: 1.22218, probability: 31.06% 80% (34449 vs 34448)
2019-06-08 08:16:30,473 root INFO Step 84 (0.041s). Accuracy: 86.43%, loss: 0.007167, perplexity: 1.00719, probability: 96.48% 100% (1071)
2019-06-08 08:16:30,737 root INFO Step 85 (0.040s). Accuracy: 86.35%, loss: 0.130487, perplexity: 1.13938, probability: 52.44% 80% (84340 vs 84349)
2019-06-08 08:16:31,001 root INFO Step 86 (0.042s). Accuracy: 86.51%, loss: 0.063279, perplexity: 1.06532, probability: 68.41% 100% (87986)
2019-06-08 08:16:31,267 root INFO Step 87 (0.040s). Accuracy: 86.67%, loss: 0.034380, perplexity: 1.03498, probability: 81.36% 100% (70012)

I've been training for a week now on a GTX 1080; the global step is ~530k, with 86% accuracy on the test set.

emedvedev commented 5 years ago

Is the result different when you run aocr test on the dataset containing only the image you're running predict on? If yes, are you running the latest master (https://github.com/emedvedev/attention-ocr/pull/132)?
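A minimal sketch of that single-image check, assuming aocr is installed and the trained checkpoint sits in the default checkpoint directory; the image path and label below are illustrative:

```python
# Minimal sketch of the check above: pack the single image being predicted
# into its own tfrecords file and run "aocr test" on it, so test and predict
# see exactly the same input. Image path and label are illustrative.
import subprocess

IMAGE_PATH = "/data/synth90k/2697/6/466_MONIKER_49537.jpg"  # illustrative path
LABEL = "MONIKER"                                           # its ground-truth word

# 1. One-line annotations file in the "<image path>\t<label>" format.
with open("single.txt", "w") as f:
    f.write("{}\t{}\n".format(IMAGE_PATH, LABEL))

# 2. Build a tfrecords file containing only that record.
subprocess.run(["aocr", "dataset", "single.txt", "single.tfrecords"], check=True)

# 3. Evaluate the trained checkpoint on it (assumes the default checkpoint dir).
subprocess.run(["aocr", "test", "single.tfrecords"], check=True)
```

If the test accuracy on that one record is high while predict still returns garbage on the same image, the two code paths are diverging somewhere (e.g. preprocessing or the exported model), which is what the question above is trying to isolate.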