dinosauria123 / gcv2hocr

gcv2hocr converts from Google Cloud Vision OCR output to hocr to make a searchable pdf.
99 stars 33 forks source link

Letter position does not correct in Japanese vertical text #14

Open dinosauria123 opened 6 years ago

dinosauria123 commented 6 years ago

In the case of gcv2hocr.py, vertical Japanese text invisible fonts positions are bottom of the PDF document.

In gcv2hocr (C version) does better output than gcv2hocr.py but it use CR after every words.

I have to study how to deal with Japanese vertical text in ReportLab (pdf library written by python) which used in hocr-pdf tools.