Open tshrinivasan opened 3 years ago
Hi,
The code is kind of dirty, but here is the codebase:
https://github.com/manimaran990/bookscrap
On Wed, Oct 28, 2020 at 6:32 AM Shrinivasan T notifications@github.com wrote:
http://www.noolulagam.com/books/
This site has 3877 pages with 10 books on each page, i.e. info on 38770 books, running from http://www.noolulagam.com/books/1/ to http://www.noolulagam.com/books/3877/
Scrape each page and save the details below as a CSV file:
Book title (நூல் பெயர்), category (வகை), author (எழுத்தாளர்), publisher (பதிப்பகம்), price (விலை), year (ஆண்டு)
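A minimal sketch of the outer loop this task implies, assuming Python: generate the 3877 listing URLs and write rows to CSV. The names `page_urls`, `write_csv`, and the English column names in `FIELDS` are hypothetical, chosen only for illustration. Fetching and parsing each page is left out because the site's markup is not described in this thread.

```python
import csv

# URL pattern and page count are taken from the issue text above.
BASE_URL = "http://www.noolulagam.com/books/{page}/"
LAST_PAGE = 3877

# Assumed English column names for the six requested fields.
FIELDS = ["title", "category", "author", "publisher", "price", "year"]


def page_urls(first=1, last=LAST_PAGE):
    """Yield listing page URLs from `first` to `last`, inclusive."""
    for page in range(first, last + 1):
        yield BASE_URL.format(page=page)


def write_csv(rows, out):
    """Write an iterable of book dicts to a file-like object as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

When writing to disk, opening the file with `open("books.csv", "w", newline="", encoding="utf-8")` should keep the Tamil text intact and avoid blank lines between rows on Windows.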
-- Regards, Manimaran G
Updated the code to fetch the title from the h4 tag.
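For readers following along, here is a rough sketch of pulling titles out of `h4` tags using only the Python standard library. It is not the code in the bookscrap repo. `H4TitleParser` and `extract_titles` are hypothetical names, and it assumes every `h4` on a listing page holds a book title, which may not be true of the real markup.

```python
from html.parser import HTMLParser


class H4TitleParser(HTMLParser):
    """Collect the text content of every <h4> element."""

    def __init__(self):
        super().__init__()
        self._in_h4 = False
        self._parts = []
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h4":
            self._in_h4 = True
            self._parts = []

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_h4:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h4" and self._in_h4:
            self._in_h4 = False
            # Collapse runs of whitespace and skip empty headings.
            title = " ".join("".join(self._parts).split())
            if title:
                self.titles.append(title)


def extract_titles(html):
    """Return the cleaned text of each <h4> in the given HTML string."""
    parser = H4TitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles
```

Text inside nested tags such as a link within the `h4` is included, since data is collected until the closing `h4` is seen.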
Here is the resulting CSV file: https://www.dropbox.com/s/2ac1taj445yihrw/complete.csv?dl=0
It needs more cleaning.
Thanks @manimaran990
Hi sir, I'm new to Python. Please let me know of any issues in the code; I'm waiting to resolve them. https://github.com/stephenraj314/Bs4scraping
Stephen is an enthusiastic and passionate budding Python developer. This is his first try. Please let him know what changes are needed in his code, and he will resolve the issues.
Thanks @muthu1809 and @stephenraj314
I reported an issue in the repo's issues section.
Hi sir, I scraped the book details using Selenium. Please check the codebase: https://github.com/stephenraj314/SeleniumwebScraping.git