-
Hello,
I need to preprocess scanned images in order to improve the quality of the subsequent OCR. I am already able to remove horizontal and vertical lines with the Leptonica (Java lept4j) library.
I just wonde…
-
dhSegment is AWESOME and EXACTLY what my wife and I need for our post-cancer #PayItForward Bonus Round activity doing grassroots #CitizenScience #digitalhumanities research in support of eResearch and…
-
1. In the data for the individual entries ("Bibliographic Resource"), the "contributors" are not bold, unlike the other categories.
2. For data re-use, the contributor name should be standardized. S…
-
gImageReader seems to have problems with hyphens. While OCRFeeder recognizes hyphenated words and puts them together, GIR keeps them separated, keeping the hyphen. Is this a configuration issue? If yes…
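
In case no such setting is available, the joining could also be done as a post-processing step on the exported text. The following is only a minimal sketch, not something gImageReader or OCRFeeder provides: the function name `join_hyphenated` is hypothetical, and it assumes the OCR output is plain text in which a split word ends one line with a hyphen and continues on the next line.

```python
import re

# A word fragment, a hyphen at the end of a line, then the continuation
# at the start of the next line (optional spaces or tabs around the break).
_LINE_END_HYPHEN = re.compile(r"(\w+)-[ \t]*\r?\n[ \t]*(\w+)")


def join_hyphenated(text: str) -> str:
    """Join words that were split by a hyphen at a line break.

    Hypothetical helper: hyphens inside a line are left untouched.
    """
    return _LINE_END_HYPHEN.sub(r"\1\2", text)
```

Note that this also removes the hyphen from a genuine compound word that happens to break at the end of a line, so the result may need a dictionary check.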
-
Ideally this work should match the interface of PF.
``f(html1: string, html2: string) -> dict`` where the output dict has the same keys as the PF result (or a superset of those keys)
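
The key requirement could be checked with something like the sketch below. It is only an illustration of the contract described above: `HtmlFunc` and `has_compatible_keys` are hypothetical names, and the PF function is passed in as a parameter because its actual name and result keys are not given here.

```python
from typing import Any, Callable, Dict

# Hypothetical alias for the interface f(html1, html2) -> dict.
HtmlFunc = Callable[[str, str], Dict[str, Any]]


def has_compatible_keys(
    candidate: HtmlFunc, reference: HtmlFunc, html1: str, html2: str
) -> bool:
    """Return True if the candidate's result has every key of the reference's.

    Extra keys in the candidate's result are allowed (superset).
    """
    reference_keys = set(reference(html1, html2))
    candidate_keys = set(candidate(html1, html2))
    return reference_keys <= candidate_keys
```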
-
With LSTM training the dictionary dawg files have become optional. In light of this, I want to suggest an additional traineddata file for Devanagari script, which can cater to all main languages writt…
-
http://www.dhgarrette.com/papers/garrette_ocr_naacl2016.pdf
https://github.com/tberg12/ocular
-
I am getting the following errors, and no resolution is displayed for the SnapScan scanner:
```
paperwork
** (paperwork:2568): WARNING **: Couldn't connect to accessibility bus: Failed to connect to socket /tmp/…
-
This looks really nice. Thank you for making this open.
I am attempting to do OCR.
I can identify all the letters, but then I need to check them against a word list so I can pick up where the OCR…
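
One possible way to do the word-list check is sketched below using Python's standard `difflib` module. This is an assumption about the approach, not part of the project: the function name `correct_word` and the `cutoff` value of 0.8 are hypothetical, and the word list is assumed to fit in memory as a list of strings.

```python
import difflib
from typing import List


def correct_word(word: str, word_list: List[str], cutoff: float = 0.8) -> str:
    """Return the closest word-list entry, or the word unchanged if none is close.

    Hypothetical helper; cutoff is the minimum similarity ratio (0 to 1).
    """
    if word in word_list:
        return word
    # get_close_matches returns candidates sorted by similarity, best first.
    matches = difflib.get_close_matches(word, word_list, n=1, cutoff=cutoff)
    return matches[0] if matches else word
```

A word that comes back unchanged and is not in the list marks a place where the OCR result probably needs a closer look.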