ocropus / hocr-tools

Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.

1.2.1 #119

Closed zvezdochiot closed 6 years ago

stweil commented 6 years ago

@zvezdochiot, thank you for your ideas. I think it would help if you could provide smaller pull requests that each address an individual topic, because that makes the discussion easier and improves the chances of getting merged. People also expect a more descriptive subject line.

zvezdochiot commented 6 years ago

For my own work, the only tool I need directly from this toolkit is hocr-pdf, but not in the form offered in the main branch. JPEG compression is unsuitable in most cases. To work with TIFF, I use my own branch together with either pdfwatermark (Python, PyPDF2) or pfbgmrgr (Python, pyPdf).