tshrinivasan / OCR4wikisource

OCR for WikiSource using Google Drive OCR
GNU General Public License v2.0

How to process a file with some multi-column landscape pages #98

Open Shreeshrii opened 6 years ago

Shreeshrii commented 6 years ago

While the main pages of the PDF are portrait and single-column, some pages, such as the table of contents and the index, are two-column portrait. There are also multi-column landscape pages.

Is it possible to process page ranges from the same PDF file and specify the orientation and number of columns for each range?

balajijagadesh commented 6 years ago

In my experience, OCR4wikisource produces its OCR output column by column. If a page has two columns, the first column is OCR'ed first, followed by the second column.
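
To make that reading order concrete, here is a minimal Python sketch of column-wise ordering. It is only an illustration and is not code from OCR4wikisource or Google Drive OCR. The `Block` tuple, the `column_wise_order` function, and the idea of assigning a block to a column by its x position on an evenly divided page are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical text block: (x, y, text), where x and y are the
# top-left corner of the block in page units.
Block = Tuple[float, float, str]


def column_wise_order(blocks: List[Block], page_width: float, columns: int = 2) -> List[str]:
    """Return block texts column by column, top to bottom within each column."""
    column_width = page_width / columns

    def sort_key(block: Block) -> Tuple[int, float]:
        x, y, _ = block
        # Clamp so a block starting at the right edge stays in the last column.
        column_index = min(int(x // column_width), columns - 1)
        return (column_index, y)

    return [text for _, _, text in sorted(blocks, key=sort_key)]


# Four blocks from an assumed two-column page, given in arbitrary order.
blocks = [
    (310.0, 50.0, "right-top"),
    (20.0, 50.0, "left-top"),
    (310.0, 90.0, "right-bottom"),
    (20.0, 90.0, "left-bottom"),
]

ordered = column_wise_order(blocks, page_width=600.0)
print(ordered)
```

With these sample blocks, the whole left column comes out before anything from the right column, which matches the behaviour described above for a two-column page.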