tesseract-ocr / tesstrain

Train Tesseract LSTM with make
Apache License 2.0

unicharset_extractor stuck #384

Closed sinall closed 3 months ago

sinall commented 3 months ago

I tried to use tesstrain following the documentation. My environment was Git Bash on Windows. The make training command hung, even after I changed the line endings to LF as the workaround suggests.

$ make training MODEL_NAME=foo --debug
GNU Make 4.4.1
Built for Windows32
Copyright (C) 1988-2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later https://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Reading makefiles...
You are using make version: 4.4.1
Updating makefiles....
Updating goal targets....
File 'training' does not exist.
File 'data/foo.traineddata' does not exist.
File 'data/foo/checkpoints/foo_checkpoint' does not exist.
File 'unicharset' does not exist.
File 'data/foo/unicharset' does not exist.
Must remake target 'data/foo/unicharset'.
unicharset_extractor --output_unicharset "data/foo/unicharset" --norm_mode 2 "data/foo/all-gt"

Is there a way to pass --debug or a similar argument to unicharset_extractor so I can see what's happening?

sinall commented 3 months ago

It turned out that I had both Tesseract 3.05 and 5.3.3 installed. After I removed 3.05 from PATH, it worked.
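
For anyone who hits the same hang, one way to check for a leftover install is to list every copy of the training tools that PATH can resolve. The sketch below is only an illustration: find_all_on_path is a hypothetical helper, not part of tesstrain or Tesseract, and it assumes a Python interpreter that sees the same PATH as the shell running make. It only checks that a matching file exists in each directory; it does not verify the version or that the file is executable.

```python
import os
from pathlib import Path


def find_all_on_path(name, path=None, pathext=None):
    """Return every file named `name` found on PATH, in lookup order.

    Hypothetical helper for illustration; not part of tesstrain.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    if pathext is None:
        # On Windows, executables usually carry an extension such as .exe,
        # so also try each suffix listed in PATHEXT.
        pathext = os.environ.get("PATHEXT", "") if os.name == "nt" else ""
    suffixes = [""] + [ext.lower() for ext in pathext.split(os.pathsep) if ext]

    matches = []
    seen = set()
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for suffix in suffixes:
            candidate = Path(directory) / (name + suffix)
            if candidate.is_file() and candidate not in seen:
                seen.add(candidate)
                matches.append(candidate)
    return matches


if __name__ == "__main__":
    # More than one match per tool suggests that two installs are mixed.
    for tool in ("tesseract", "unicharset_extractor"):
        hits = find_all_on_path(tool)
        print(f"{tool}: {len(hits)} match(es)")
        for hit in hits:
            print(f"  {hit}")
```

If a tool shows up in more than one directory, the first entry is the one the shell will normally run. Removing the older directory from PATH, as described above, leaves a single install.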