Open ravidreams opened 8 years ago
Do you want to do the following?
Do you mean these or something else?
It already skips the upload when the page exists (or the code or the wiki does not allow overwriting an existing page). But the terminal message says it was uploaded. Only the message needs to be changed.
I just installed ocr4wikisource and find it very convenient for updating the OCRed text on Wikisource. Thank you for the tool.
I wanted to know whether there is an option that allows overwriting pages that already exist on Wikisource.
It will overwrite by default.
Can you explain with examples?
-- Regards, T.Shrinivasan
and then uploaded the Google Drive OCRed pages using OCR4wikisource linked at https://sa.wikisource.org/wiki/%E0%A4%85%E0%A4%A8%E0%A5%81%E0%A4%95%E0%A5%8D%E0%A4%B0%E0%A4%AE%E0%A4%A3%E0%A4%BF%E0%A4%95%E0%A4%BE:%E0%A4%B6%E0%A5%8D%E0%A4%B0%E0%A5%80%E0%A4%A4%E0%A4%A4%E0%A5%8D%E0%A4%B5%E0%A4%A8%E0%A4%BF%E0%A4%A7%E0%A4%BF.pdf
However, later I noticed that some pages are in landscape format or skewed. So I want to OCR and upload them again.
Also, I had OCRed a few of these pages on wikisource website using their Google OCR button and was wondering whether they would get overwritten.
OK, I found a related problem. When I generated the index page with <pagelist /> on sa.wikisource, it generated the page numbers in Devanagari digits.
I edited some pages, e.g. 1 and 107, using the Wikisource edit and OCR feature.
When I uploaded pages using OCR4wikisource, it created the page numbers using 0-9 and not Devanagari ०-९.
Hence, there are two versions for page 107 https://sa.wikisource.org/w/index.php?title=%E0%A4%AA%E0%A5%83%E0%A4%B7%E0%A5%8D%E0%A4%A0%E0%A4%AE%E0%A5%8D:%E0%A4%B6%E0%A5%8D%E0%A4%B0%E0%A5%80%E0%A4%A4%E0%A4%A4%E0%A5%8D%E0%A4%B5%E0%A4%A8%E0%A4%BF%E0%A4%A7%E0%A4%BF.pdf/%E0%A5%A7%E0%A5%A6%E0%A5%AD&action=history
and the page did not get overwritten.
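To illustrate the mismatch, here is a minimal sketch of how a page title could be built with Devanagari page numbers so that it matches what <pagelist /> generated. The helper names `localize_page_number` and `page_title`, and the `Page:Example.pdf` title, are hypothetical and not part of OCR4wikisource; whether a given wiki really expects localized digits is an assumption that should be checked against its index page.

```python
# Hypothetical helpers, not taken from OCR4wikisource.
# Devanagari digits occupy U+0966 (zero) through U+096F (nine).
DEVANAGARI_DIGITS = "".join(chr(0x0966 + d) for d in range(10))
ASCII_TO_DEVANAGARI = str.maketrans("0123456789", DEVANAGARI_DIGITS)


def localize_page_number(page_number: int) -> str:
    """Return the page number written with Devanagari digits."""
    return str(page_number).translate(ASCII_TO_DEVANAGARI)


def page_title(namespace: str, file_name: str, page_number: int,
               localize: bool = True) -> str:
    """Build a page title such as 'Page:Example.pdf/<number>'.

    With localize=False the ASCII digits 0-9 are kept, which is the
    behaviour described in this comment.
    """
    if localize:
        suffix = localize_page_number(page_number)
    else:
        suffix = str(page_number)
    return f"{namespace}:{file_name}/{suffix}"


if __name__ == "__main__":
    # The two titles differ, so the wiki treats them as two separate pages.
    print(page_title("Page", "Example.pdf", 107, localize=False))
    print(page_title("Page", "Example.pdf", 107, localize=True))
```

Because the two titles are different strings, an upload under the ASCII title cannot overwrite a page that was created under the Devanagari title.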
I intentionally tried uploading text for pages that already exist, because many test books have been partially proofread.
It gives the following message:
Moving the file text_for_page_00010.txt to the folder temp-2016-01-04-19-09-23
Uploading content for text_for_page_00011.txt Uploaded at https://ta.wikisource.org/wiki/Page:கலைக்_களஞ்சியம்_அம்மாலன-அரேபியா.pdf/11
In fact, it just skips the page if it already exists. Maybe the message should read:
Page https://ta.wikisource.org/wiki/Page:கலைக்_களஞ்சியம்_அம்மாலன-அரேபியா.pdf/11 already exists. Skipping upload.
I am still not sure whether overwriting a page is possible. If it is, an option for it would help when repeating uploads after an error (such as the page number variation).
Consider this least priority as the tool is working anyway ;)
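The requested behaviour could be sketched roughly as follows. This is only an illustration under stated assumptions: `upload_page`, `page_exists`, `save_page`, and the `overwrite` flag are hypothetical names, not the tool's actual code, and the wiki is simulated with a plain dictionary instead of a real API call. Defaulting `overwrite` to off is also an assumption made for the sketch.

```python
from typing import Callable, Dict


def upload_page(url: str, text: str,
                page_exists: Callable[[str], bool],
                save_page: Callable[[str, str], None],
                overwrite: bool = False) -> str:
    """Upload text to url and return the message to print on the terminal.

    page_exists and save_page are hypothetical callbacks standing in for
    whatever wiki client the tool uses.
    """
    if page_exists(url) and not overwrite:
        # Report the skip instead of claiming the page was uploaded.
        return f"Page {url} already exists. Skipping upload."
    save_page(url, text)
    return f"Uploaded at {url}"


if __name__ == "__main__":
    # In-memory stand-in for the wiki: page URL -> page text.
    wiki: Dict[str, str] = {"Page:Example.pdf/11": "proofread text"}

    # Existing page, no overwrite: the proofread text is left alone.
    print(upload_page("Page:Example.pdf/11", "raw OCR text",
                      wiki.__contains__, wiki.__setitem__))

    # Existing page, overwrite requested: the text is replaced.
    print(upload_page("Page:Example.pdf/11", "raw OCR text",
                      wiki.__contains__, wiki.__setitem__, overwrite=True))
```

Keeping the existence check and the message in one place means the terminal output cannot disagree with what actually happened to the page.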