Closed Gearme closed 3 years ago
Thank you for your contribution Gearme.
Looks good to me, and I'm happy to merge in principle.
I'm looking into the CI failure. I suspect the version of tesseract used in CI is too old to have the 4 functions in the C API (e.g. TessBaseAPIGetAltoText).
@Gearme, I hope to get https://github.com/houqp/leptess/pull/30 merged to fix your CI issue. Once that's merged, I'll look at getting this merged.
Hi @Gearme, I've merged #30. When you have a moment, could you please either merge with the master branch or rebase against it?
If you need help, I may be able to do it for you.
I apologize for my tardiness - the changes have been merged now.
One side note: locally, I've bumped the leptonica-sys and tesseract-sys dependencies to their latest versions and they work fine. I didn't include them in this merge request, though, since they're not required.
All merged and packaged, @Gearme.
0.11.0 should be good to go for you https://crates.io/crates/leptess/0.11.0
Thanks for the PR :)
Thanks @Gearme for the new methods and tests indeed!
I've implemented get_alto_text, get_tsv_text, get_lstm_box_text and get_word_str_box_text. They work pretty much like get_hocr_text. I also implemented tests for them, adding regex to dev-dependencies to check the output formats.
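For readers curious what an output-format check could look like, here is a minimal sketch using only the standard library. It is not the test code from this PR (which uses the regex crate). The helper names `is_tsv_data_row` and `looks_like_tsv` are hypothetical, and the sketch assumes the string returned by get_tsv_text contains only data rows in Tesseract's 12-column TSV layout, with no header line.

```rust
/// Number of tab-separated columns in one row of Tesseract's TSV output:
/// level, page_num, block_num, par_num, line_num, word_num,
/// left, top, width, height, conf, text.
const TSV_COLUMNS: usize = 12;

/// Hypothetical helper (not part of leptess): returns true if `line` has
/// 12 columns, the first ten parse as integers, and `conf` parses as a number.
/// The final `text` column may be empty and is not inspected.
fn is_tsv_data_row(line: &str) -> bool {
    let fields: Vec<&str> = line.split('\t').collect();
    if fields.len() != TSV_COLUMNS {
        return false;
    }
    let ints_ok = fields[..10].iter().all(|f| f.parse::<i64>().is_ok());
    // Parsed as f64 so that both integer and fractional confidences pass.
    let conf_ok = fields[10].parse::<f64>().is_ok();
    ints_ok && conf_ok
}

/// Hypothetical helper: true if `output` has at least one non-empty line
/// and every non-empty line is a well-formed data row.
/// Assumption: no header row is present; a header line would fail this check.
fn looks_like_tsv(output: &str) -> bool {
    let mut rows = output.lines().filter(|l| !l.is_empty()).peekable();
    rows.peek().is_some() && rows.all(is_tsv_data_row)
}
```

In a test, the string passed to `looks_like_tsv` would come from calling get_tsv_text on an image that has already been loaded. A regex-based check, as in the PR, can be stricter about the contents of each field.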