tshrinivasan / OCR4wikisource

OCR for WikiSource using Google Drive OCR
GNU General Public License v2.0

Don't overwrite headers and footers #41

Open bodhisattwawiki opened 8 years ago

bodhisattwawiki commented 8 years ago

We can add headers and footers on the index page so that they are added automatically to all pages. For example, see https://bn.wikisource.org/w/index.php?title=নির্ঘণ্ট:বিশ্বকোষ_প্রথম_খণ্ড.djvu&action=edit : at the end, there are হেডার (Header) and (Footer) sections. The script is overwriting these previously added headers and footers. We don't want this.

tshrinivasan commented 8 years ago

Please explain in detail, with an example.

Are you updating any existing page with the mediawiki_uploader script?

It just pastes the content from the OCR. It does not check for any existing content.

Do you want it to check for existing content?

What do you want the script to do if it finds existing content for a page?
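To make the question above concrete, here is a minimal sketch of how a check for existing content could be decided per page. It is only an illustration: `choose_action` and its `policy` parameter are hypothetical names, not part of the mediawiki_uploader script, and fetching the current page text from the wiki is left to the caller.

```python
from typing import Optional


def choose_action(existing: Optional[str], policy: str = "skip") -> str:
    """Decide what to do with one page before uploading OCR text.

    `existing` is the current wikitext of the page, or None if the page
    does not exist yet (hypothetical convention for this sketch).
    `policy` says what to do when the page already has content:
    "skip" leaves it alone, "overwrite" replaces it.
    """
    if policy not in ("skip", "overwrite"):
        raise ValueError("policy must be 'skip' or 'overwrite'")
    # A missing or blank page can always be created safely.
    if existing is None or existing.strip() == "":
        return "create"
    return policy


if __name__ == "__main__":
    print(choose_action(None))                      # page does not exist
    print(choose_action("old text"))                # default policy
    print(choose_action("old text", "overwrite"))   # explicit overwrite
```

Defaulting to "skip" is the conservative choice here, since it never destroys text that a proofreader has already worked on.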

bodhisattwawiki commented 8 years ago

There is an option to set a header and footer on every index page. See https://bn.wikisource.org/w/index.php?title=নির্ঘণ্ট:বিশ্বকোষ_প্রথম_খণ্ড.djvu&action=edit : at the end, there are হেডার (Header) and (Footer) sections. If you write something in those boxes, it is automatically added to all pages of that book by default. This saves a lot of time while proofreading, because you don't have to add the header and footer to every page manually. But this script is overwriting those sections, so we have to add the header and footer to every page manually while proofreading. If the script does not touch the header and footer, then I think it will be OK.

tshrinivasan commented 8 years ago

The script does not add any header or footer.

If you see that it adds them, share some example pages to compare and analyze.

jayantanth commented 8 years ago

I am searching the MediaWiki help for how to add the OCRed content to the main field while leaving the header and footer alone. I have no clue so far. We need help from the original developer, User:Tpt.

bodhisattwawiki commented 8 years ago

@tshrinivasan, the script does not add headers and footers, but it is overwriting the existing headers and footers from the index file.
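One way to read this is that the script should replace only the body of a page and leave the header and footer sections as they are. The sketch below assumes the page wikitext has the form `<noinclude>header</noinclude>body<noinclude>footer</noinclude>`, which is how Wikisource page text is commonly stored; that layout is an assumption here and should be checked against real pages. `split_page` and `replace_body` are hypothetical helper names, not functions from the script.

```python
import re
from typing import Optional, Tuple

# Assumed layout: <noinclude>HEADER</noinclude>BODY<noinclude>FOOTER</noinclude>
PAGE_RE = re.compile(
    r"\A<noinclude>(?P<header>.*?)</noinclude>"
    r"(?P<body>.*)"
    r"<noinclude>(?P<footer>.*?)</noinclude>\Z",
    re.DOTALL,
)


def split_page(text: str) -> Optional[Tuple[str, str, str]]:
    """Return (header, body, footer), or None if the layout is not found."""
    match = PAGE_RE.match(text.strip())
    if match is None:
        return None
    return match.group("header"), match.group("body"), match.group("footer")


def replace_body(existing: str, ocr_text: str) -> str:
    """Swap in the OCR text while keeping any existing header and footer."""
    parts = split_page(existing)
    if parts is None:
        # No recognisable header/footer sections: nothing to preserve.
        return ocr_text
    header, _old_body, footer = parts
    return f"<noinclude>{header}</noinclude>{ocr_text}<noinclude>{footer}</noinclude>"


if __name__ == "__main__":
    old = "<noinclude>HEAD</noinclude>old body<noinclude>FOOT</noinclude>"
    print(replace_body(old, "new OCR text"))
```

This only helps for pages that already exist with their sections filled in. For pages the script creates from scratch, the header and footer text would still have to be supplied from somewhere, for example from the index page.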

jayantanth commented 8 years ago

Hi Shrini, could you please check https://phabricator.wikimedia.org/T30894 for this issue?