Open 9tontruck opened 8 years ago
Anyone?
Sorry, I'm not sure. I haven't really used user_patterns_suffix myself. Maybe try Stack Overflow.
9tontruck, did you end up having any success with this? Working on getting this working as well.
I had no luck :( I am still looking for a solution. Did you make any progress?
Having the same problem. I monitored file access with Process Monitor while running Tesseract. The engine never accesses eng.user-patterns or eng.user-words.
Then I decided to debug the method: dict/dict.cpp::void Dict::Load(DawgCache *dawg_cache)
The variables user_patterns_suffix, user_patterns_file, user_words_suffix and user_words_file have no value when Load() starts. I am sure I set them with TessBaseAPI::SetVariable before Load() was called.
It is certainly a bug in libtesseract. Sorry, no time to investigate further.
UPD: I got it working by passing the variables directly to the Init() method. I think Init() resets the variables when it starts, so calling TessBaseAPI::SetVariable before Init() has no effect. The engine loads user patterns inside Init(), so calling TessBaseAPI::SetVariable after Init() does not work either.
9tontruck, I don't know which .NET wrapper you use. Look through your wrapper's API for the Init method and pass the variables there. If it's absent, look for another wrapper or use C++.
UPD2: I failed to make it work. User words and patterns do not affect the recognition results. Tesseract loads these files, but they make no difference. :(
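For anyone repeating the Process Monitor check above, here is a minimal Python sketch that lists the files the engine would be expected to open for a given set of Init() variables. It assumes each suffix is appended to `<tessdata>/<lang>.`, which is inferred from the file names mentioned in this thread (eng.user-words, eng.user-patterns) and not taken from the Tesseract source. The helper name `expected_user_files` and the suffix values are hypothetical.

```python
from pathlib import Path


def expected_user_files(tessdata_dir, lang, init_vars):
    """Map each *_suffix variable to the file it is assumed to point at.

    Assumption: the file name is "<lang>.<suffix>" inside tessdata_dir.
    """
    tessdata = Path(tessdata_dir)
    return {
        name: tessdata / f"{lang}.{suffix}"
        for name, suffix in init_vars.items()
        if name.endswith("_suffix")
    }


# Variables meant to be handed to Init(), not to SetVariable afterwards.
# The suffix values are assumptions based on the file names in this thread.
init_vars = {
    "user_words_suffix": "user-words",
    "user_patterns_suffix": "user-patterns",
}

# Report whether each expected file is actually present on disk.
for name, path in expected_user_files("tessdata", "eng", init_vars).items():
    status = "found" if path.is_file() else "MISSING"
    print(f"{name}: {path} ({status})")
```

If a file shows up as MISSING here, the engine cannot load it no matter how the variables are passed. If both are found and recognition still does not change, the problem lies elsewhere, as UPD2 suggests.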
:( I'd love to be able to use the bazaar / bazzar config, too :( here's my SO question: http://stackoverflow.com/questions/40127994/how-to-give-tesseract-a-word-list-net-wrapper
I have the same problem.
Hi, I am trying to pass a string pattern to a TesseractEngine object when it is initialized. I am using "A .Net wrapper for tesseract-ocr" 3.0.1.0 in C#.
Here is my code:
C# code
tessdata/configs/bazzar
tessdata/eng.user-patterns
tessdata/eng.user-words
TestImage.jpg
Output from tesseract:
I have successfully loaded user-words and user-patterns into the Tesseract object, but Tesseract doesn't seem to consult my user-words list, because it keeps returning HAR instead of MAR. How can I force it to read \w\w\w from the user-words list?