thomasgruebl / rusty-tesseract

A Rust wrapper for Google Tesseract
MIT License
127 stars 16 forks

Use `-c` flag before each of configuration variables passed to tesseract #14

Closed vladmovchan closed 1 year ago

vladmovchan commented 1 year ago

When more than one configuration variable is specified, as in the example from https://github.com/thomasgruebl/rusty-tesseract/issues/13#issuecomment-1684948526, the `-c` flag is passed just once, before both configuration variables. Like this:

tesseract x.png stdout -l eng --dpi 150 --psm 3 --oem 3 -c tessedit_create_hocr=1 tessedit_char_whitelist=abc

or like this

tesseract x.png stdout -l eng --dpi 150 --psm 3 --oem 3 -c tessedit_char_whitelist=abc tessedit_create_hocr=1 

In that case tesseract applies only the first key-value pair and seems to ignore the rest (it treats key1=value1 key2=value2 as a single key1=longer_value pair).

According to the tesseract CLI man page, the `-c` flag has to be specified before each configvar=value pair. Something like this:

tesseract x.png stdout -l eng --dpi 150 --psm 3 --oem 3 -c tessedit_char_whitelist=abc -c tessedit_create_hocr=1 
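To illustrate the idea, here is a minimal Rust sketch of how the argument list could be built so that every pair gets its own `-c`. The function name `config_args` and its signature are hypothetical. This is not the code from rusty-tesseract or from the proposed fix, just one way to express the rule:

```rust
/// Hypothetical helper (not part of rusty-tesseract's API): turns a list of
/// (configvar, value) pairs into command-line arguments, emitting a separate
/// `-c` flag in front of every `configvar=value` pair.
fn config_args(vars: &[(&str, &str)]) -> Vec<String> {
    let mut args = Vec::with_capacity(vars.len() * 2);
    for (key, value) in vars {
        // One `-c` per pair, so tesseract parses each pair on its own.
        args.push("-c".to_string());
        args.push(format!("{}={}", key, value));
    }
    args
}
```

Calling `config_args(&[("tessedit_char_whitelist", "abc"), ("tessedit_create_hocr", "1")])` returns the four arguments `-c`, `tessedit_char_whitelist=abc`, `-c`, `tessedit_create_hocr=1`, which can then be appended to the rest of the command, for example via `std::process::Command::args`.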

Please take a look at the proposed fix.

thomasgruebl commented 1 year ago

Thanks! Looks great!