ropensci / tesseract

Bindings to Tesseract OCR engine for R
https://docs.ropensci.org/tesseract

small PR to download contributed models #68

Closed pachadotdev closed 2 months ago

pachadotdev commented 3 months ago

At some point we will provide a model for Romanian text, so here is a starting point for downloading the contributed languages that are already available.

jeroen commented 3 months ago

tessdata_contrib only has training data for polytonic and Akkadian? Does anyone really use this?

pachadotdev commented 3 months ago

Akkadian

Yes. I briefly spoke to the librarian at UofT Libraries before starting to use Tesseract; they are digitizing a lot of ancient and non-ancient material that goes into a Postgres database.

I organized this because I am creating my own data for Romanian, and I plan to upload it in September.