dipu-bd / lightnovel-crawler

Generate and download e-books from online sources.
https://pypi.org/project/lightnovel-crawler/
GNU General Public License v3.0
1.51k stars, 293 forks

https://www.wuxia.blog/ #595

Closed: SirGryphin closed this issue 2 years ago

SirGryphin commented 4 years ago

I tried writing a crawler for this source myself, but I can't scrape all the chapters: you have to click the "Show More" button, which runs JavaScript that loads the full chapter list into the page source. I looked into using requests and selenium, but it's a bit too advanced for me.

I did manage to find the chapter list page (the number in the URL is the book id). Chapter list: https://www.wuxia.blog/temphtml/_tempChapterList_all_12.html

Main book page: https://www.wuxia.blog/novel/hfg76gugftvt

Maybe someone knows how to get the chapters from that? I think I've seen something similar done for a different source, but it's too advanced for me.

dipu-bd commented 4 years ago

They are using a POST call to retrieve the remaining chapters. I would do something like:

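# Chapter list endpoint; the trailing number (12) is the book id
next_url = 'https://www.wuxia.blog/temphtml/_tempChapterList_all_12.html'
# submit_form sends a POST request, like the "Show More" button does
response = self.submit_form(next_url)
soup = self.make_soup(response)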
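
For anyone trying this outside a crawler class, here is a minimal standalone sketch of the same idea using only the Python standard library. It rests on several assumptions: that the endpoint accepts a POST with an empty body, that the response is an HTML fragment in which each chapter is a plain `<a href="...">` link, and that the URL pattern from the chapter list link above holds for other book ids. The names `chapter_list_url`, `ChapterLinkParser`, `parse_chapter_links` and `fetch_chapters` are made up for illustration and are not part of lightnovel-crawler.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen

BASE_URL = 'https://www.wuxia.blog/'


def chapter_list_url(book_id):
    """Build the chapter list URL (pattern assumed from the link in this thread)."""
    return urljoin(BASE_URL, 'temphtml/_tempChapterList_all_%d.html' % book_id)


class ChapterLinkParser(HTMLParser):
    """Collect (title, absolute_url) pairs from every <a href="..."> tag."""

    def __init__(self):
        super().__init__()
        self.chapters = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href = dict(attrs).get('href')
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link
        if self._href:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href:
            title = ''.join(self._text).strip()
            self.chapters.append((title, urljoin(BASE_URL, self._href)))
            self._href = None


def parse_chapter_links(html):
    """Return all chapter links found in an HTML fragment."""
    parser = ChapterLinkParser()
    parser.feed(html)
    return parser.chapters


def fetch_chapters(book_id):
    """POST to the chapter list endpoint and parse the links in the response."""
    # Assumption: an empty POST body is enough; the site may need more fields.
    request = Request(
        chapter_list_url(book_id),
        data=b'',
        headers={'User-Agent': 'Mozilla/5.0'},
        method='POST',
    )
    with urlopen(request, timeout=30) as response:
        html = response.read().decode('utf-8', errors='replace')
    return parse_chapter_links(html)
```

Calling `fetch_chapters(12)` would then return a list of `(title, url)` tuples for the book with id 12, provided the assumptions above hold. If the site rejects the request, the likely places to adjust are the headers and the POST body.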

idMysteries commented 2 years ago

It was easy. :cat: