robertknight / tesseract-wasm

JS/WebAssembly build of the Tesseract OCR engine for use in browsers and Node
https://robertknight.github.io/tesseract-wasm/
BSD 2-Clause "Simplified" License

Add option to get output in hOCR format #72

Closed by robertknight 1 year ago

robertknight commented 1 year ago

Add methods to OCRClient and OCREngine that return output in hOCR format, so results can be used with existing OCR tools. Also add a dropdown menu to the demo output for switching between plain text and hOCR formats.
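
A rough sketch of how the demo's format dropdown might pick between the two outputs. The issue does not name the new methods, so `getHOCR`, the `OcrOutputSource` interface and the `getOutput` helper are assumptions made for illustration, not the library's confirmed API.

```typescript
// Output formats the dropdown would offer.
type OutputFormat = "text" | "hocr";

// Assumed minimal shape of an OCR client for this sketch. The method
// names are hypothetical stand-ins for whatever OCRClient exposes.
interface OcrOutputSource {
  getText(): Promise<string>;
  getHOCR(): Promise<string>;
}

// Return the recognized output in the format selected in the dropdown.
async function getOutput(
  source: OcrOutputSource,
  format: OutputFormat
): Promise<string> {
  switch (format) {
    case "text":
      return source.getText();
    case "hocr":
      return source.getHOCR();
  }
}
```

Keeping the format choice in one helper means the demo only has to re-run `getOutput` when the dropdown value changes, without repeating recognition setup.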