Open tshrinivasan opened 7 years ago
Can you elaborate on step 6 "OCR4Wikisource should read the queue, OCR it and paste in wikisource" — does this mean the tool itself would add the text to the relevant page on Wikisource? Or the user would copy and paste the text there?
What differences in workflow or features are there with respect to the proofreadpage system of proofreading a page at a time within wikisource?
I'm wondering if the ws-google-ocr tool could be modified to selectively either use the Vision API or the Drive system of OCR.
1) Yes, the script itself adds the text to the relevant pages. Users don't have to do it manually.
2) The script also OCRs a whole book at a time, in contrast to the existing OCR tools (Phe or ws-google-ocr), where a single page is OCRed at a time.
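To make the book-at-a-time workflow concrete, here is a minimal sketch of the loop described above. It is not the actual OCR4Wikisource code: `ocr_book`, `process_queue`, `ocr_page`, and `save_page` are hypothetical names, the OCR engine and the wiki write are passed in as plain callables, and the `Page:<book>/<number>` title format is an assumption about how the target pages are named.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def ocr_book(
    book_title: str,
    page_images: Iterable[bytes],
    ocr_page: Callable[[bytes], str],
    save_page: Callable[[str, str], None],
) -> List[str]:
    """OCR every page of one book and save each result to its own wiki page."""
    saved_titles = []
    for page_number, image in enumerate(page_images, start=1):
        text = ocr_page(image)  # hypothetical OCR backend (e.g. Drive or Vision)
        # Assumed title format for the page that receives the text.
        page_title = f"Page:{book_title}/{page_number}"
        save_page(page_title, text)  # hypothetical wiki write; no manual pasting
        saved_titles.append(page_title)
    return saved_titles


def process_queue(
    queue: Iterable[Tuple[str, List[bytes]]],
    ocr_page: Callable[[bytes], str],
    save_page: Callable[[str, str], None],
) -> Dict[str, List[str]]:
    """Work through the queue one whole book at a time."""
    return {
        title: ocr_book(title, images, ocr_page, save_page)
        for title, images in queue
    }
```

Because the OCR step is just a callable, the same loop could in principle run with either OCR backend. In a real tool, `save_page` would be an authenticated edit on Wikisource.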
@samwilson, we have a test file for Bengali Wikisource. Please feel free to test it with the OCR4Wikisource script.
OCR4Wikisource is a Python script that runs only on GNU/Linux, from the command line. Many new users find it difficult to set up and run.
A web version of the same tool is needed, so that any new user can use it easily from a browser.
Requirements
Can anyone volunteer for creating a web version?