Arithmeticus closed 6 years ago
I began my quest from that site, but the @nickjwhite git repos lack the requisite Tesseract files that are in all the other langdata subdirectories. Even if the files are somehow at ancientgreekocr.org, the tesseract langdata repo should have a /grc subdirectory populated with the build files.
See Pull Request by Nick White - https://github.com/tesseract-ocr/langdata/pull/19
and
https://groups.google.com/forum/#!topic/tesseract-dev/Iqsa7y2g3sk
Thanks for answering this @Shreeshrii.
@Arithmeticus, note that the files in the langdata repository are designed to be used as input to tesstrain.sh from tesseract's training/ directory, which is why some of the files you may be expecting such as .inttemp aren't present. That is the same with all of the langdata directories.
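To illustrate how a langdata directory is passed to tesstrain.sh, here is a minimal sketch that only assembles the command line. The directory names (`langdata`, `tessdata`, `grc_out`) and the helper name `tesstrain_command` are hypothetical placeholders, and the flags shown are the ones I believe tesstrain.sh accepts; check the usage text of the script in your own checkout before relying on them.

```python
import shlex


def tesstrain_command(lang, langdata_dir, tessdata_dir, output_dir,
                      script="training/tesstrain.sh"):
    """Build (but do not run) a tesstrain.sh invocation.

    All paths are assumptions for illustration; point them at your own
    checkout of langdata and your tessdata directory.
    """
    return [
        script,
        "--lang", lang,                  # language code, e.g. "grc"
        "--langdata_dir", langdata_dir,  # root of the langdata checkout
        "--tessdata_dir", tessdata_dir,  # directory holding existing traineddata
        "--output_dir", output_dir,      # where the new traineddata is written
    ]


if __name__ == "__main__":
    cmd = tesstrain_command("grc", "langdata", "tessdata", "grc_out")
    # Print a shell-safe version of the command for copy and paste.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The script generates the intermediate training files itself from the langdata inputs, which is why files such as .inttemp are not stored in the repository.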
There will be grc source files in the next release of langdata. It will be missing desired_characters and forbidden_characters unless you would like to contribute some...
https://github.com/tesseract-ocr/langdata/tree/master/grc
PR by @nickjwhite has been merged.
@zdenop This can be closed.
There is a grc.traineddata file, but no corresponding /grc subdirectory with the build files. Could that be supplied? Or is there a safe way to split a .traineddata file into its constituent parts?
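On the splitting question: Tesseract ships a combine_tessdata tool whose -u option unpacks a .traineddata file into its components. A minimal sketch follows, assuming combine_tessdata is installed and on the PATH; the helper names `unpack_command` and `unpack` and the `grc.` output prefix are hypothetical choices for illustration.

```python
import shutil
import subprocess


def unpack_command(traineddata_path, output_prefix):
    """Build the combine_tessdata call that unpacks all components.

    output_prefix is prepended to each extracted file name, so "grc."
    would give files such as grc.unicharset.
    """
    return ["combine_tessdata", "-u", traineddata_path, output_prefix]


def unpack(traineddata_path, output_prefix):
    """Run the unpack step, failing clearly if the tool is missing."""
    if shutil.which("combine_tessdata") is None:
        raise FileNotFoundError("combine_tessdata not found on PATH")
    subprocess.run(unpack_command(traineddata_path, output_prefix), check=True)


if __name__ == "__main__":
    # Assumes grc.traineddata sits in the current directory.
    unpack("grc.traineddata", "grc.")
```

Note that this recovers the compiled components, not the langdata source files used as input to training, so it is not a substitute for a /grc source directory.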