About:

Python Script To Copy WuxiaWorld Chapters Into An EPUB File.

Copies The Novel Chapters Along With The Novel Details. Occasionally (About Once Every 6-10 Runs) The Cover Image Is Not Copied. The Cause Is Unknown, Possibly An Internal BeautifulSoup4 Problem.

How Does The Script Work? Just Enter The Novel URL Inside The Script And The Rest Follows.

I'll Try To Add Any Necessary Updates.

Initial Implementation By : Aundinn

Note :

Check Out This Other Novel Website: https://wxuiaworld.co. Why This Website? It Has Novels From Webnovel (Qidian) And WuxiaWorld With All The Latest Chapters Unlocked. No Spirit Stones, No Patreon, No Subscription Or Anything Else Is Required To Read The Latest Chapters. Don't Take My Word For It? Check It Out.

Task(s) :

[x] Get The List Of Chapters From The Novel Website And Use The Links From That List Rather Than Progressing Sequentially, Because Some Pages Do Not Have Sequential Names.
[ ] Implement Multiprocessing To Speed Up The Process.
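
The First Task Above Can Be Sketched Roughly As Follows. This Is Only An Illustration, Not The Script's Actual Code: It Uses Python's Standard-Library HTMLParser Instead Of BeautifulSoup So It Runs Without Extra Packages, And The Sample HTML, The example.com URL, And The Names ChapterLinkCollector, chapter_links And marker Are All Assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ChapterLinkCollector(HTMLParser):
    """Collect chapter links in the order they appear on the index page."""

    def __init__(self, base_url, marker):
        super().__init__()
        self.base_url = base_url
        self.marker = marker  # substring identifying a chapter link (assumed)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.marker in href:
            url = urljoin(self.base_url, href)
            if url not in self.links:  # skip repeated links to the same chapter
                self.links.append(url)


def chapter_links(index_html, base_url, marker):
    """Return absolute chapter URLs found in index_html, in page order."""
    collector = ChapterLinkCollector(base_url, marker)
    collector.feed(index_html)
    return collector.links


# Made-up index page: the second chapter does not follow the sequential naming.
SAMPLE_INDEX = """
<ul>
  <li><a href="/novel/example-novel/en-chapter-1">Chapter 1</a></li>
  <li><a href="/novel/example-novel/en-chapter-2-part-1">Chapter 2 (1)</a></li>
  <li><a href="/novel/example-novel/en-chapter-3">Chapter 3</a></li>
  <li><a href="/about">About</a></li>
</ul>
"""

links = chapter_links(SAMPLE_INDEX, "https://www.example.com", "/novel/example-novel/")
for link in links:
    print(link)
```

Because The Links Are Read From The Page, A Chapter With An Unexpected Name Is Still Picked Up, Which Guessing The Next URL By Number Would Miss.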

Problem(s) :

None Yet (Report If Any).

Screenshot :

Image Not Available

Documentation :

For Beginners: After Setting Up A Working Python 3 Environment (Along With The Latest pip), You Need To Install Some Packages. To Install Them, Run These Commands In Your CMD/Terminal :
- pip3 install bs4
- pip3 install ebooklib
- pip3 install requests
- pip3 install html5lib=="0.9999999"
Download The Python Script And Unzip It.
Open The Script With A Text Editor And Read The Details Inside.
In Case The Script Has Not Been Updated To Match Changes In The Website, You Can Refer To The BeautifulSoup Docs To Make The Necessary Changes.
To Run, Open CMD/Terminal, Navigate To The Unzip Location And Type :
- Linux - python3 code.py
- Windows - python code.py or py code.py
EPUB File Will Be Saved At The Location Of Script.

Working :

Set The Novel Link In novelURL
- Example - https://www.wuxiaworld.com/novel/martial-god-asura | https://www.wuxiaworld.com/novel/a-will-eternal
If Only A Specific Range Of Chapters Is To Be Downloaded, Enter 2 And Provide The start And end Chapters.
EPUB File Will Be Saved In The Format NovelName_start-chapter_end-chapter.epub
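
A Minimal Sketch Of The Chapter Range And File Name Described Above. The Function Names select_chapters And epub_filename Are Hypothetical, And Counting Chapters From 1 With An Inclusive end Is An Assumption, Not Something The Script Is Confirmed To Do.

```python
def select_chapters(chapter_urls, start, end):
    """Return chapters start..end (assumed 1-based and inclusive)."""
    if not 1 <= start <= end <= len(chapter_urls):
        raise ValueError("need 1 <= start <= end <= number of chapters")
    return chapter_urls[start - 1:end]


def epub_filename(novel_name, start, end):
    """Build a name in the NovelName_start-chapter_end-chapter.epub format."""
    return f"{novel_name}_{start}_{end}.epub"


# Placeholder chapter list standing in for real chapter links.
all_chapters = [f"chapter-{n}" for n in range(1, 11)]
wanted = select_chapters(all_chapters, 3, 5)
name = epub_filename("A Will Eternal", 3, 5)
print(wanted, name)
```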

Parsing :

html5lib Is Used Because, Although It Is A Tiny Bit Slow, It Generates Valid HTML. You May Compare It With The Others In The "Differences Between Parsers" Section Of The BS4 Docs. I've Copied The Table From The BS4 Website Below To Give A Brief Overview.

| Parser | Typical Usage | Advantages | Disadvantages |
| --- | --- | --- | --- |
| Python's html.parser | `BeautifulSoup(markup, "html.parser")` | Batteries included; decent speed; lenient (as of Python 2.7.3 and 3.2) | Not very lenient (before Python 2.7.3 or 3.2.2) |
| lxml's HTML parser | `BeautifulSoup(markup, "lxml")` | Very fast; lenient | External C dependency |
| lxml's XML parser | `BeautifulSoup(markup, "lxml-xml")` or `BeautifulSoup(markup, "xml")` | Very fast; the only currently supported XML parser | External C dependency |
| html5lib | `BeautifulSoup(markup, "html5lib")` | Extremely lenient; parses pages the same way a web browser does; creates valid HTML5 | Very slow; external Python dependency |

If Any Problem Occurs With `html5lib` :

In Case You Update It Accidentally, You Can Reinstall The Specific Version Using The pip3 Command Listed For Beginners Under Documentation.
Another Option Is To Change html5lib To lxml If It Is Installed, Otherwise To Python's Built-In html.parser.
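
The Fallback Order Above Could Be Automated Along These Lines. This Is A Sketch Under Assumptions: choose_parser Is A Hypothetical Helper That Is Not Part Of The Script, And It Only Checks Whether A Package Can Be Found, Not Whether Its Version Works.

```python
from importlib.util import find_spec


def choose_parser(preferred=("html5lib", "lxml")):
    """Return the first installed parser name, else the built-in 'html.parser'."""
    for name in preferred:
        # find_spec returns None when the top-level package is not installed
        if find_spec(name) is not None:
            return name
    return "html.parser"


parser = choose_parser()
print(parser)
# The result would then be passed on as: BeautifulSoup(markup, parser)
```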

1ycx / WuxiaWorld
