dinosauria123 / gcv2hocr

gcv2hocr converts from Google Cloud Vision OCR output to hocr to make a searchable pdf.
99 stars 33 forks source link

Support Document text detection #11

Open dinosauria123 opened 7 years ago

dinosauria123 commented 7 years ago

When posting OCR request, we can choose two type of response.

A TEXT_DETECTION response includes the detected phrase, its bounding box, and individual words and their bounding boxes:

A DOCUMENT_TEXT_DETECTION response includes additional layout information, such as page, block, paragraph, word, and break information.

gcvocr.sh now uses TEXT_DETECTION. But it will change to DOCUMENT_TEXT_DETECTION to improve text quality.

catabre commented 4 years ago

I have raised the Pull request for the same.