antimatter15 / tesseract-rs

Rust bindings for Tesseract
MIT License

[Feature] Support setting segmentation mode #33

Closed olalonde closed 1 year ago

olalonde commented 1 year ago

E.g. api->SetPageSegMode(tesseract::PSM_AUTO_OSD); as shown in https://tesseract-ocr.github.io/tessdoc/Examples_C++.html

olalonde commented 1 year ago

There's a tessedit_pageseg_mode which might do the same thing? For now using .set_variable("tessedit_pageseg_mode", "10") and hoping it is the same.

ccouzens commented 1 year ago

Hi, I might be able to get some time at the weekend to look into this.

Notes for myself:

https://docs.rs/tesseract-sys/0.5.14/tesseract_sys/fn.TessBaseAPISetPageSegMode.html
https://tesseract-ocr.github.io/tessapi/5.x/a02438.html#a15a7a9c1afbba3078a55b4566de891ab

TessPageSegMode_PSM_AUTO
TessPageSegMode_PSM_AUTO_ONLY
TessPageSegMode_PSM_AUTO_OSD
TessPageSegMode_PSM_CIRCLE_WORD
TessPageSegMode_PSM_COUNT
TessPageSegMode_PSM_OSD_ONLY
TessPageSegMode_PSM_RAW_LINE
TessPageSegMode_PSM_SINGLE_BLOCK
TessPageSegMode_PSM_SINGLE_BLOCK_VERT_TEXT
TessPageSegMode_PSM_SINGLE_CHAR
TessPageSegMode_PSM_SINGLE_COLUMN
TessPageSegMode_PSM_SINGLE_LINE
TessPageSegMode_PSM_SINGLE_WORD
TessPageSegMode_PSM_SPARSE_TEXT
TessPageSegMode_PSM_SPARSE_TEXT_OSD

ccouzens commented 1 year ago

> There's a tessedit_pageseg_mode which might do the same thing? For now using .set_variable("tessedit_pageseg_mode", "10") and hoping it is the same.

I think you're right about this. The documentation says:

> The mode is stored as an IntParam so it can also be modified by ReadConfigFile or SetVariable("tessedit_pageseg_mode", mode as string).
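
For anyone landing here later, here is a minimal sketch of how the numeric strings line up with the named modes, assuming the numbering of Tesseract's PageSegMode enum (PSM_OSD_ONLY = 0 through PSM_RAW_LINE = 13). The PageSegMode type and as_variable_value method below are hypothetical helpers made up for illustration; they are not part of tesseract-rs.

```rust
/// Page segmentation modes, numbered to match Tesseract's `PageSegMode` enum.
/// NOTE: hypothetical helper for illustration only, not the tesseract-rs API.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u32)]
pub enum PageSegMode {
    OsdOnly = 0,
    AutoOsd = 1,
    AutoOnly = 2,
    Auto = 3,
    SingleColumn = 4,
    SingleBlockVertText = 5,
    SingleBlock = 6,
    SingleLine = 7,
    SingleWord = 8,
    CircleWord = 9,
    SingleChar = 10,
    SparseText = 11,
    SparseTextOsd = 12,
    RawLine = 13,
}

impl PageSegMode {
    /// Returns the mode as the decimal string that the
    /// `tessedit_pageseg_mode` variable expects.
    pub fn as_variable_value(self) -> String {
        (self as u32).to_string()
    }
}
```

With this, the workaround above would read something like .set_variable("tessedit_pageseg_mode", &PageSegMode::SingleChar.as_variable_value()). Note that "10" corresponds to PSM_SINGLE_CHAR; the PSM_AUTO_OSD mode from the original request would be "1".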

olalonde commented 1 year ago

Cool, so I guess this API is not really necessary then.

ccouzens commented 1 year ago

Nonetheless, I've made a PR:

https://github.com/antimatter15/tesseract-rs/pull/36/files