dinosauria123 / gcv2hocr

gcv2hocr converts Google Cloud Vision OCR output to hOCR so that a searchable PDF can be built from it.

Build Detailed HOCR file using FulltextAnnotation Block of GCV Response #36

catabre closed this 4 years ago

catabre commented 4 years ago

Modified gcv2hocr.py to parse the fullTextAnnotation block of the GCV response and generate a richer hOCR file containing ocr_page, ocr_carea, ocr_par, ocr_line, and ocrx_word tags.
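
For readers who want a picture of the mapping, here is a minimal sketch of how a fullTextAnnotation page could be walked to produce those tags. It is not the code from this pull request: the function names (`box_of`, `split_lines`, `page_to_hocr`) are hypothetical, and it assumes the usual GCV JSON layout of pages, blocks, paragraphs, words, and symbols. GCV has no line level, so the sketch also assumes that lines can be inferred from each word's final `detectedBreak` type. It emits only the page fragment, without ids or the surrounding HTML document.

```python
from html import escape

# Assumption: these break types on a word's last symbol mark the end of a line.
LINE_BREAKS = {"EOL_SURE_SPACE", "LINE_BREAK", "HYPHEN"}


def box_of(node):
    """Return (x0, y0, x1, y1) for a GCV node; missing coordinates count as 0."""
    vertices = node.get("boundingBox", {}).get("vertices", [])
    xs = [v.get("x", 0) for v in vertices] or [0]
    ys = [v.get("y", 0) for v in vertices] or [0]
    return (min(xs), min(ys), max(xs), max(ys))


def title(box):
    """Format a box as an hOCR title attribute value."""
    return "bbox {} {} {} {}".format(*box)


def ends_line(word):
    """True if the word's last symbol carries a line-ending break."""
    symbols = word.get("symbols", [])
    if not symbols:
        return False
    prop = symbols[-1].get("property", {})
    return prop.get("detectedBreak", {}).get("type") in LINE_BREAKS


def split_lines(words):
    """Group a paragraph's words into lines using the detected breaks."""
    lines, current = [], []
    for word in words:
        current.append(word)
        if ends_line(word):
            lines.append(current)
            current = []
    if current:
        lines.append(current)
    return lines


def page_to_hocr(page):
    """Render one fullTextAnnotation page as an hOCR fragment."""
    width, height = page.get("width", 0), page.get("height", 0)
    out = [f"<div class='ocr_page' title='bbox 0 0 {width} {height}'>"]
    for block in page.get("blocks", []):
        out.append(f" <div class='ocr_carea' title='{title(box_of(block))}'>")
        for par in block.get("paragraphs", []):
            out.append(f"  <p class='ocr_par' title='{title(box_of(par))}'>")
            for line in split_lines(par.get("words", [])):
                boxes = [box_of(w) for w in line]
                # The line box is the union of its word boxes.
                line_box = (min(b[0] for b in boxes), min(b[1] for b in boxes),
                            max(b[2] for b in boxes), max(b[3] for b in boxes))
                out.append(f"   <span class='ocr_line' title='{title(line_box)}'>")
                for word, box in zip(line, boxes):
                    text = "".join(s.get("text", "") for s in word.get("symbols", []))
                    out.append(f"    <span class='ocrx_word' title='{title(box)}'>"
                               f"{escape(text)}</span>")
                out.append("   </span>")
            out.append("  </p>")
        out.append(" </div>")
    out.append("</div>")
    return "\n".join(out)
```

With a response loaded via `json.load`, something like `page_to_hocr(data["responses"][0]["fullTextAnnotation"]["pages"][0])` would return the fragment, assuming the file holds the raw `images:annotate` response.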

dinosauria123 commented 4 years ago

Thank you for your commit!