KaniyamFoundation / ProjectIdeas

A Place to write down the project ideas and to plan them

Create a huge collection of all tamil books metadata #228

Open tshrinivasan opened 3 weeks ago

tshrinivasan commented 3 weeks ago

There are many websites with Tamil book details.

etc

List all the websites that carry book details.

Scrape all the data from them and publish it.

Create a portal with all the books' metadata.

Once the data is collected, we can publish it as a website with Omeka S (https://omeka.org/s/) or Islandora (https://www.islandora.ca/).
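Since the data will come from several websites, the same book is likely to show up more than once. Here is a minimal sketch of one way to merge scraped records into a single collection. The field names (`title`, `author`, `publisher`, `year`, `source`) and the site labels are assumptions for illustration, not an agreed schema.

```python
from dataclasses import dataclass, asdict
import json
import unicodedata


@dataclass(frozen=True)
class BookRecord:
    # Illustrative fields only; the real schema is still to be decided.
    title: str
    author: str
    publisher: str = ""
    year: str = ""
    source: str = ""  # website the record was scraped from


def dedup_key(record):
    """Build a comparison key from the normalized title and author."""
    def norm(text):
        # NFC normalization helps with Tamil text, where some vowel signs
        # can be encoded either as one code point or as a sequence.
        # Whitespace runs are collapsed and case differences are ignored.
        return " ".join(unicodedata.normalize("NFC", text).split()).casefold()
    return (norm(record.title), norm(record.author))


def merge_sources(*sources):
    """Merge record lists, keeping the first record seen for each key."""
    merged = {}
    for records in sources:
        for record in records:
            merged.setdefault(dedup_key(record), record)
    return list(merged.values())


# Hypothetical sample data standing in for two scraped websites.
site_a = [BookRecord("பொன்னியின் செல்வன்", "கல்கி", source="site-a")]
site_b = [
    BookRecord("பொன்னியின்  செல்வன் ", "கல்கி", source="site-b"),
    BookRecord("சிவகாமியின் சபதம்", "கல்கி", source="site-b"),
]

combined = merge_sources(site_a, site_b)
print(json.dumps([asdict(r) for r in combined], ensure_ascii=False, indent=2))
```

Matching on title and author alone is a deliberately simple choice. Different editions or spelling variants would need a richer key, such as an ISBN where the source provides one.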

Natkeeran commented 3 weeks ago

Some of the current datasets can be found here:

https://github.com/KaniyamFoundation/tamilbooks_metadata/tree/main/data

kjanani27 commented 3 weeks ago

https://archive.org/ www.storytel.com https://uvesalibrary.org/ www.goodreads.com https://kannadhasanpathippagham.com https://www.alliancebook.com https://tamilvanan.com/

amotbeli commented 3 weeks ago

https://www.noolulagam.com/ https://dialforbooks.in/ https://www.annacentenarylibrary.org/pages/view/library-catalogue

HariharanUmapathi commented 2 weeks ago

I'm scraping text from https://www.projectmadurai.org/. GitHub link coming soon.

Edit 1: GitHub: https://github.com/HariharanUmapathi/programmerlife/tree/kaniyam-book-list-scraper/Python/book-list-scrapper

Feel free to create GitHub issues regarding the code.

rajkannan1978 commented 2 weeks ago

I am scraping book details from https://www.panuval.com. I will share a GitHub link soon.

amotbeli commented 2 weeks ago

I'm scraping data from the Anna Centenary Library catalogue.

GitHub link is here.

rajkannan1978 commented 1 week ago

I am scraping data from the Panuval book store and have pushed my code to GitHub: https://github.com/rajkannan1978/web-scraping.git. Please let me know if you find any bugs. I welcome any suggestions to improve the code.
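For anyone starting a scraper for another site, here is a minimal sketch of pulling titles and authors out of a listing page using only the Python standard library. The HTML structure (`li.book`, `span.title`, `span.author`) is a made-up example and not the markup of panuval.com or any other site mentioned in this thread; the selectors would have to be adapted to each real page.

```python
from html.parser import HTMLParser


class BookListParser(HTMLParser):
    """Collect title and author text from a hypothetical book listing."""

    def __init__(self):
        super().__init__()
        self.books = []
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "li" and "book" in classes:
            # A new listing entry starts here.
            self.books.append({"title": "", "author": ""})
        elif tag == "span" and self.books:
            for name in ("title", "author"):
                if name in classes:
                    self._field = name

    def handle_data(self, data):
        # Only keep text that appears inside a title or author span.
        if self._field:
            self.books[-1][self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


# Assumed sample markup; a real scraper would fetch this from the site.
SAMPLE_HTML = """
<ul>
  <li class="book"><span class="title">Book One</span> <span class="author">Author A</span></li>
  <li class="book"><span class="title">Book Two</span> <span class="author">Author B</span></li>
</ul>
"""

parser = BookListParser()
parser.feed(SAMPLE_HTML)
books = parser.books
print(books)
```

Parsing a saved sample page like this makes the extraction logic easy to check before running it against the live website.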

rajkannan1978 commented 1 week ago

Hello, I am posting some of my Python projects here. Web Scraping: https://github.com/rajkannan1978/web-scraping.git; Grocery: https://github.com/rajkannan1978/grocery.git; Number Guess Game: https://github.com/rajkannan1978/number_guess_game.git

Thanks.

rajkannan1978 commented 1 day ago

Hi, I got 18708 books from panuval.com. Web Scraping: https://github.com/rajkannan1978/web-scraping.git

amotbeli commented 18 hours ago

Got 15845 books from the Anna Centenary Library catalogue.

See here: https://github.com/amotbeli/acl_data/blob/main/acl_data.json
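Before publishing a scraped dump, a quick sanity check on record counts and empty fields can be useful. The sketch below assumes the dump is a JSON array of objects with `title` and `author` keys. That layout is an assumption for illustration and may not match the actual file, so the sample is inlined instead of loaded from it.

```python
import json
from collections import Counter


def summarize(records, required=("title", "author")):
    """Count the records and how many lack each required field."""
    missing = Counter()
    for record in records:
        for field in required:
            # Treat absent keys, nulls, and blank strings as missing.
            if not str(record.get(field) or "").strip():
                missing[field] += 1
    return {"total": len(records), "missing": dict(missing)}


# Hypothetical sample standing in for a scraped JSON file.
sample = json.loads(
    '[{"title": "Book One", "author": "Author A"},'
    ' {"title": "Book Two", "author": ""},'
    ' {"title": "", "author": null}]'
)

summary = summarize(sample)
print(summary)
```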

rajkannan1978 commented 9 hours ago

Super. I visited the website. Also learned from your code. It is neat and clean.

Thanks.

On Thu, Sep 12, 2024 at 7:15 AM amotbeli @.***> wrote:

Got 15845 books from the Anna Centenary Library catalogue.

See here. https://github.com/amotbeli/acl_data/blob/main/acl_data.json


amotbeli commented 4 hours ago

Thank you, rajkannan1978!