Python Script To Copy WuxiaWorld Chapters Into EPUB File.
Copies The Novel Chapters Along With The Novel Details. Occasionally (About Once Every 6-10 Runs) The Cover Image Is Not Copied. The Cause Is Unknown, Possibly An Internal BeautifulSoup4 Problem.
How Does The Script Work? Just Enter The Novel URL Inside The Script And The Rest Follows.
Check Out This Other Novel Website: https://wxuiaworld.co. Why This Website? It Has Novels From Webnovel (Qidian) And WuxiaWorld With All The Latest Chapters Unlocked. No Spirit Stones, No Patreon, No Subscription Or Anything Else Is Required To Read The Latest Chapters. Don't Take My Word For It, Check It Out.
For Beginners: After Setting Up A Working Python 3 Environment (Along With The Latest pip), You Need To Install Some Packages. To Install, Run These Commands In Your CMD/Terminal:
pip3 install bs4
pip3 install ebooklib
pip3 install requests
pip3 install html5lib=="0.9999999"
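To Confirm The Packages Above Are Visible To Your Python 3, A Small Check Like The One Below May Help. This Is Only A Sketch And Is Not Part Of The Original Script. The Function Name missing_modules Is Hypothetical, And The List Uses Import Names (bs4, ebooklib, requests, html5lib), Which Here Match The pip Names.

```python
from importlib.util import find_spec

# Import names of the packages installed with pip3 above.
REQUIRED_MODULES = ["bs4", "ebooklib", "requests", "html5lib"]


def missing_modules(names):
    """Return the module names from `names` that Python cannot find."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If Anything Is Reported Missing, Re-Run The Matching pip3 Command.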
Download The Python Script And Unzip It.
Open The Script With A Text Editor And Read The Details Inside.
In Case The Script Has Not Been Updated To Match Changes In The Website, Refer To The BeautifulSoup Docs And Adjust It Accordingly.
To Run, Open CMD/Terminal, Navigate To The Unzip Location And Type One Of The Following, Depending On Your System:
python3 code.py
python code.py
py code.py
EPUB File Will Be Saved At The Location Of The Script.
Inside The Script, Set novelURL, start And end Chapters.
The EPUB File Is Named NovelName_start-chapter_end-chapter.epub
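The Naming Pattern NovelName_start-chapter_end-chapter.epub Can Be Sketched As Below. This Is An Illustration Only, Not The Script's Actual Code. The Function Name build_epub_filename And The Replacement Of Unsafe File Name Characters Are Assumptions.

```python
import re


def build_epub_filename(novel_name, start, end):
    """Build a name shaped like NovelName_start-chapter_end-chapter.epub."""
    if end < start:
        raise ValueError("end chapter must not be smaller than start chapter")
    # Assumption: characters that are invalid in file names on common
    # systems are replaced with '-' so the file can always be created.
    safe_name = re.sub(r'[\\/:*?"<>|]+', "-", novel_name).strip()
    return f"{safe_name}_{start}_{end}.epub"


if __name__ == "__main__":
    # "Example Novel" is a placeholder title, not a real novel.
    print(build_epub_filename("Example Novel", 1, 50))
```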
html5lib Is Used Because, Although It Is A Tiny Bit Slower, It Generates Valid HTML. You May Compare It With The Other Parsers In The "Differences Between Parsers" Section Of The BeautifulSoup Docs.
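I've Copied The Table From The BS4 Website Below To Give A Brief Overview.
Parser | Typical usage | Advantages | Disadvantages |
Python’s html.parser | BeautifulSoup(markup, "html.parser") | Batteries included; decent speed | Not as fast as lxml, less lenient than html5lib |
lxml’s HTML parser | BeautifulSoup(markup, "lxml") | Very fast; lenient | External C dependency |
lxml’s XML parser | BeautifulSoup(markup, "lxml-xml") or BeautifulSoup(markup, "xml") | Very fast; the only currently supported XML parser | External C dependency |
html5lib | BeautifulSoup(markup, "html5lib") | Extremely lenient; parses pages the same way a web browser does; creates valid HTML5 | Very slow; external Python dependency |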
To Use A Parser Other Than html5lib: Change html5lib To lxml If Installed, Otherwise To Python's Inbuilt html.parser.
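The Parser Choice Can Also Be Made At Run Time. The Sketch Below Picks lxml When It Is Installed And Otherwise Falls Back To Python's Inbuilt html.parser. The Function Name pick_parser Is Hypothetical And Not Part Of The Original Script. The Returned String Is Meant To Be Passed As The Second Argument Of BeautifulSoup(markup, ...).

```python
from importlib.util import find_spec


def pick_parser(preferred=("lxml",)):
    """Return the first installed parser from `preferred`, else 'html.parser'."""
    for name in preferred:
        # For lxml and html5lib the module name equals the BeautifulSoup parser name.
        if find_spec(name) is not None:
            return name
    # Python's inbuilt parser is always available.
    return "html.parser"


if __name__ == "__main__":
    parser_name = pick_parser()
    print("Using parser:", parser_name)
    # Typical usage, assuming bs4 is installed:
    # soup = BeautifulSoup(markup, parser_name)
```

Passing preferred=("html5lib", "lxml") Would Keep html5lib As The First Choice And Only Fall Back When It Is Missing.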
Copyright © 2018 Kogam22. Released under the terms of the Apache 2.0 license.