Open balajijagadesh opened 4 years ago
I believe all the books from the Chennai Museum are already on archive.org; it seems https://archive.org/details/@malamud has already uploaded them. I spot-checked a few book titles from the Chennai Museum site and found them on archive.org. Kindly check. If more work is needed on this, Tamilvelan (a Payilagam Python trainee) was earlier asked to do it. His code is here: https://tamilvelanpython.wordpress.com/2020/07/06/web-scraping-project/
The Tamil Nadu government has uploaded the museum-related books it has published to this website:
http://www.e-books-chennaimuseum.tn.gov.in/chennaimuseum/index.php?option=com_content&view=article&id=18&Itemid=116
The books are listed there in alphabetical order.
First, we need to identify the structure of the book URLs.
Then download all the books locally, along with the relevant metadata.
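As a starting point for the URL-structure step, here is a minimal sketch that collects PDF links from a listing page using only the Python standard library. It assumes the books are linked through ordinary `<a href="...">` tags ending in `.pdf`, which has not been verified against the museum site; `PdfLinkParser`, `extract_pdf_links` and `fetch_html` are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a> tags whose href ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the extension.
        if href and href.split("?")[0].lower().endswith(".pdf"):
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


def extract_pdf_links(html, base_url):
    """Return all PDF links found in an HTML string, in page order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def fetch_html(url):
    """Download a page as text (assumes the page decodes as UTF-8)."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `extract_pdf_links(fetch_html(page_url), page_url)` on each listing page would give the download list. If the site serves the books through some other mechanism (for example a download script with an `id` parameter), the filter in `handle_starttag` would need to change accordingly.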
Then upload them to the Internet Archive under the Creative Commons CC BY-SA license, as per the government order linked below. The books need to be uploaded with proper metadata so that they are easy to find in the future. We can also explore adding an OCR layer to each PDF before uploading.
https://commons.wikimedia.org/wiki/File:GoTN_Tamil_Development_Departments_order_on_creative_commons_cc_by_sa.pdf
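The upload step could be sketched roughly as follows, using the `internetarchive` Python package (`pip install internetarchive`, with credentials set up via `ia configure`). Several things here are assumptions and not taken from this issue: the CC BY-SA version in `LICENSE_URL` (please check it against the government order), the `chennai-museum` identifier prefix, the default language code, and the helper names `make_identifier`, `build_metadata` and `upload_book`.

```python
import re

# Assumption: the issue does not state the license version; confirm it
# against the government order before uploading.
LICENSE_URL = "https://creativecommons.org/licenses/by-sa/4.0/"


def make_identifier(title, prefix="chennai-museum"):
    """Build an ASCII item identifier from a title (hypothetical scheme)."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower()
    # Truncate conservatively; archive.org limits identifier length.
    return f"{prefix}-{slug}"[:80]


def build_metadata(title, language="eng"):
    """Return a metadata dict for a book item on the Internet Archive."""
    return {
        "mediatype": "texts",
        "title": title,
        "language": language,
        "licenseurl": LICENSE_URL,
    }


def upload_book(pdf_path, title, language="eng"):
    """Upload one PDF with its metadata; needs network access and credentials."""
    from internetarchive import upload  # imported lazily; third-party package

    return upload(
        make_identifier(title),
        files=[pdf_path],
        metadata=build_metadata(title, language),
    )
```

Note that `make_identifier` drops every non-ASCII character, so a Tamil-script title would produce an empty slug; such titles would need a transliterated or manually assigned identifier.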
Once the books are on the Internet Archive, they can easily be transferred to commons.wikimedia.org at a later stage using this tool:
https://tools.wmflabs.org/ia-upload/