tesseract-ocr / tesseract

Tesseract Open Source OCR Engine (main repository)
https://tesseract-ocr.github.io/
Apache License 2.0

TensorFlow code in Tesseract #4349

Closed (amitdo closed 2 hours ago)

amitdo commented 4 hours ago

All the files under src/lstm that start with tf.

This code is useless. There is no way to train a model in TF that Tesseract can use for inference.

Can we finally remove the TF code from Tesseract?

stweil commented 3 hours ago

Its only remaining use for me is that the code shows how to add another OCR engine, for example to use Kraken models with optional GPU support (on my wish list).

amitdo commented 3 hours ago

IIRC, the code in Tesseract uses some internal TF 1.x functionality instead of a stable API.

stweil commented 2 hours ago

I think I can look in old releases to see how an OCR engine was added, so we can indeed remove the TensorFlow code now.