https://pypi.org/project/calamari-ocr/ is now at 1.0.1
I'm going to have to re-train my GT4HistOCR model for 1.0.x for this update to be useful.
I'm currently training a new model for 1.0.x, so this is coming.
This would also avoid wrestling with the damn "tensorflow vs tensorflow-gpu" problem.
Unfortunately, it's not as simple as just switching to Calamari 1.x. With this change:
```diff
diff --git a/requirements.txt b/requirements.txt
index 0a426e0..53a18b0 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,6 +1,6 @@
 numpy
-tensorflow-gpu == 1.15.*
-calamari-ocr == 0.3.5
+tensorflow-gpu == 2.2.*
+calamari-ocr == 1.0.*
 setuptools >= 41.0.0 # tensorboard depends on this, but why do we get an error at runtime?
 click
 ocrd >= 2.2.1
```
I get hundreds of these messages:

```
18:07:46.116 WARNING tensorflow - 11 out of the last 11 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7f1f304e3c10> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
```
However, it seems to produce good results using my model for 1.0 (left: GT text, right: updated ocrd_calamari OCR text):
@kba @maxnth
Waiting for https://github.com/Calamari-OCR/calamari/issues/180 to review the API usage here.
AFAIK the warning message about retracing in newer TensorFlow 2 versions doesn't indicate an error per se, but only hints that the prediction could be implemented more efficiently in TensorFlow 2. So the results themselves shouldn't be influenced by it. But I'll look into it.
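For illustration, here is a toy sketch (not ocrd_calamari's actual code) of what triggers the retracing: `tf.function` compiles a new concrete function for every new static input shape, and each such compilation counts as a retrace.

```python
import numpy as np
import tensorflow as tf

@tf.function
def predict(x):
    return tf.reduce_sum(x)

# Text lines of varying width: every new static shape triggers a fresh
# trace, which is exactly what the retracing warning complains about.
for width in (100, 120, 140, 160, 180):
    predict(tf.constant(np.zeros((1, 48, width), dtype=np.float32)))

# A relaxed input signature lets a single trace handle all widths:
@tf.function(input_signature=[tf.TensorSpec((None, 48, None), tf.float32)])
def predict_relaxed(x):
    return tf.reduce_sum(x)
```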
`18:07:46.116 WARNING tensorflow`

It's unfortunate that the logger is not namespaced further, so we cannot selectively disable these log messages, should they indeed turn out to be ignorable log spam.
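A possible stopgap (my sketch, not something ocrd_calamari does): since Python's logging filters can match on message content, we could drop just the retracing warnings from the flat "tensorflow" logger.

```python
import logging

class RetracingFilter(logging.Filter):
    def filter(self, record):
        # Keep every record except the known retracing noise.
        return "triggered tf.function retracing" not in record.getMessage()

logging.getLogger("tensorflow").addFilter(RetracingFilter())
```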
Using Calamari 1.0/TF 2.2, my tests are around 5 times slower, and I suspect that the retracing is the issue. I'll have a look at whether fixing #20 solves the warning problem too, as we're doing the most inefficient prediction - line by line - anyway.
Alright, I did some testing using https://github.com/OCR-D/ocrd_calamari/commit/93190fae3b3d8b5b9a68b37f604c43c34979e5d4, where I put all lines of a region into Calamari's predict_raw. It's still 2x slower than using Calamari 0.3.5.
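For reference, a minimal sketch of that batched call, assuming Calamari 1.0's Predictor API; the checkpoint path and line images are placeholders, not the plugin's actual code.

```python
import numpy as np
from calamari_ocr.ocr import Predictor

# Placeholder checkpoint path; in ocrd_calamari this comes from the
# checkpoint parameter.
predictor = Predictor(checkpoint="path/to/model.ckpt.json")

# One grayscale image per text line of the region (dummy data here),
# passed as a List[np.array] in a single call instead of once per line.
line_images = [np.zeros((48, 200), dtype=np.uint8) for _ in range(10)]

# Attribute names as I read the Calamari 1.0 predictor code.
for result in predictor.predict_raw(line_images, progress_bar=False):
    print(result.sentence)
```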
Using Python 3.7 (on 3.8 it's not possible to install TF 1.15):
```
% pip list | egrep '^calamari|^tensorflow'
calamari-ocr         0.3.5
tensorflow-estimator 1.15.1
tensorflow-gpu       1.15.3
% rm -rf gt4histocr-calamari; for i in `seq 3`; do make test | tail -1; done
[...]
=================== 3 passed, 1 warning in 113.26s (0:01:53) ===================
=================== 3 passed, 1 warning in 116.47s (0:01:56) ===================
=================== 3 passed, 1 warning in 131.08s (0:02:11) ===================
```
```
% pip list | egrep '^calamari|^tensorflow'
calamari-ocr         1.0.5
tensorflow           2.2.0
tensorflow-estimator 2.2.0
tensorflow-gpu       2.2.0
======================== 3 passed in 562.10s (0:09:22) =========================
======================== 3 passed in 564.67s (0:09:24) =========================
======================== 3 passed in 556.99s (0:09:16) =========================
```
And with the batched prediction from the commit mentioned above:

```
======================== 3 passed in 244.04s (0:04:04) =========================
======================== 3 passed in 230.55s (0:03:50) =========================
======================== 3 passed in 222.86s (0:03:42) =========================
```
It's still slower, and I still get retracing warnings (though a lot fewer) using a test document:
```
15:24:43.365 WARNING tensorflow - 5 out of the last 5 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7fe88c3c8ef0> triggered t
15:24:43.992 WARNING tensorflow - 5 out of the last 5 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7fe88c371320> triggered t
15:24:44.634 WARNING tensorflow - 5 out of the last 5 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7fe88c153ef0> triggered t
15:24:45.260 WARNING tensorflow - 5 out of the last 5 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7fe874eddef0> triggered t
15:24:45.835 WARNING tensorflow - 5 out of the last 5 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7fe87433f830> triggered t
```
Judging from https://github.com/Calamari-OCR/calamari/blob/master/calamari_ocr/ocr/backends/model_interface.py#L61, we seem to be using the API correctly, i.e. passing a List[np.array]. I'll see if I can trigger these warnings using Calamari's own tools.
@maxnth any thoughts?
Hi, I had the same errors recently and I believe I got it fixed for my setup (tensorflow 2.1 and tensorflow 2.3) with https://github.com/Calamari-OCR/calamari/compare/fix-162. The problem has to be somewhere in tensorflow_model.py. The lines around 224 might also work as:
```python
dataset = tf.data.Dataset.from_generator(gen, output_signature=(
    tf.TensorSpec((None, line_height, self.input_channels), tf.float32),
    tf.TensorSpec((None,), tf.int32),
    tf.TensorSpec((1,), tf.int32),
    tf.TensorSpec((1,), tf.int32),
    tf.TensorSpec((1,), tf.string)))
```
but I'm not sure about that; it has been a while. I'm also not sure what the change is going to do in connection with other tensorflow versions.
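If I remember correctly, output_signature only arrived in TF 2.4, so for the older versions discussed here the equivalent would use output_types/output_shapes (again a sketch, mirroring the snippet above):

```python
dataset = tf.data.Dataset.from_generator(
    gen,
    output_types=(tf.float32, tf.int32, tf.int32, tf.int32, tf.string),
    output_shapes=(
        tf.TensorShape([None, line_height, self.input_channels]),
        tf.TensorShape([None]),
        tf.TensorShape([1]),
        tf.TensorShape([1]),
        tf.TensorShape([1]),
    ),
)
```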
Unfortunately, I don't have a testing setup at the moment and had no success quickly installing the current tensorflow without running into CUDA/NVIDIA problems again... We really need some proper testing and fixed versions for tensorflow, otherwise this is just going to produce problems again and again. Edit: Maybe the main problem is the TF version: it could be that only 2.2 is broken?
@andbue I did some tests using calamari-predict alone, and using TensorFlow 2.3rc2 reduces the number of warnings a lot.
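Roughly like this (placeholder paths; calamari-predict is Calamari's standalone CLI):

```
% calamari-predict --checkpoint path/to/model.ckpt.json --files test/lines/*.png
```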
I updated the requirements for https://github.com/OCR-D/ocrd_calamari/tree/feat/update-calamari1 and will test the performance again on Monday (tested it on a different PC).
With Calamari 1.0.x, TF 2.3rc2 and not doing the recognition line by line, I get comparable or better performance than with Calamari 0.3.5 line by line:
```
% rm -rf gt4histocr-calamari; for i in `seq 3`; do make test | tail -1; done
[ model download output ]
========================= 3 passed in 97.67s (0:01:37) =========================
========================= 3 passed in 97.16s (0:01:37) =========================
======================== 3 passed in 102.22s (0:01:42) =========================
```
There are still a lot of warning messages, so this is not 100% resolved yet.
I am going to hold off on releasing until TF 2.3.0 is out or a new Calamari release that fixes this issue is available.
TF 2.3.0 is out
Still have some issues with this that I have to investigate.
Hi Mike, if you run into any problems with calamari and TF 2.3, make sure to try the current master! I made some changes in https://github.com/Calamari-OCR/calamari/pull/184. The package on pypi is outdated at the moment.
I've measured CERs using the Calamari 1 branch and they're on par with the Calamari 0.3 branch (= master).
| Macro Median CER | OCR-CALA-gt4histocr | OCR-CALA1-gt4histocr | OCR-TESS-fraktur |
|---|---|---|---|
| BINPAGE-sauvola-SEG-LINE-sbb | 0.041 | 0.042 | 0.050 |
| BINPAGE-sauvola-SEG-LINE-tess | 0.099 | 0.100 | 0.095 |
| BINPAGE-sauvola-SEG-LINE-cisocro | 0.043 | 0.044 | 0.056 |
| OCR-D-GT-PAGE-BINPAGE-sauvola | 0.028 | 0.030 | 0.034 |
| Macro Mean CER | OCR-CALA-gt4histocr | OCR-CALA1-gt4histocr | OCR-TESS-fraktur |
|---|---|---|---|
| BINPAGE-sauvola-SEG-LINE-sbb | 0.066 | 0.067 | 0.075 |
| BINPAGE-sauvola-SEG-LINE-tess | 0.186 | 0.185 | 0.152 |
| BINPAGE-sauvola-SEG-LINE-cisocro | 0.236 | 0.224 | 0.344 |
| OCR-D-GT-PAGE-BINPAGE-sauvola | 0.042 | 0.044 | 0.090 |
Dataset:
Models:
→ I'll be merging https://github.com/OCR-D/ocrd_calamari/tree/feat/update-calamari1 and will open another issue, as I'm still seeing some increase in runtime.
PyPI is still at version 0.3.5 – I'll wait until it has 1.0.0.