domenic / worm-scraper


Worm Scraper

Scrapes the web serial Worm, its sequel Ward, and the bridge series Glow-worm into an ebook format.

How to use

First you'll need a modern version of Node.js. The earliest version tested is v20.16.0.
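Running `node --version` in a terminal prints the installed version. For a scripted check against the minimum mentioned above, here is a minimal sketch; the helper name `isAtLeast` and the file name `check-node.js` are made up for illustration and are not part of worm-scraper.

```javascript
// Minimum version taken from the requirement stated above.
const MINIMUM = "20.16.0";

// Compare two dotted version strings numerically, part by part.
// Returns true when `actual` is the same as or newer than `minimum`.
// (Hypothetical helper, not part of worm-scraper.)
function isAtLeast(actual, minimum) {
  const a = actual.split(".").map((part) => parseInt(part, 10));
  const m = minimum.split(".").map((part) => parseInt(part, 10));
  for (let i = 0; i < m.length; i++) {
    const part = a[i] ?? 0; // treat a missing component as 0
    if (part !== m[i]) {
      return part > m[i];
    }
  }
  return true;
}

// process.versions.node is the running Node.js version without the leading "v".
const current = process.versions.node;
console.log(
  isAtLeast(current, MINIMUM)
    ? `Node.js v${current} is at least v${MINIMUM}.`
    : `Node.js v${current} is older than v${MINIMUM}; consider upgrading.`
);
```

Saved as, say, `check-node.js`, this can be run with `node check-node.js`.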

Then, open a terminal (Mac documentation, Windows documentation) and install the program by typing:

npm install -g worm-scraper

This will take a while as it downloads this program and its dependencies from the internet. Once it's done, try running it by typing:

worm-scraper --help

If this outputs some help documentation, then the installation process went smoothly. You can move on to assembling the ebook by typing:

worm-scraper

This will take a while, but will eventually produce a Worm.epub file!

If you'd like to get Ward instead of Worm, use --book=ward, e.g.

worm-scraper --book=ward

Similarly, for Glow-worm:

worm-scraper --book=glow-worm

Reading EPUBs on Amazon Kindle

EPUBs are not the native format for Amazon Kindle devices and apps. However, you can send them to your Kindle library by following Amazon's instructions.

Text fixups

This project makes a lot of fixups to the original text, mostly around typos, punctuation, capitalization, and consistency. For specifics, see the code: convert-worker.js handles the cases that can be fixed generally, and substitutions.json lists one-off fixes.
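To make the difference between general handling and one-off fixes concrete, here is a purely illustrative sketch. Every name and data shape in it (`applyGeneralFixes`, `applyOneOffFixes`, the `before`/`after` pairs, the `"example-chapter"` key) is an assumption made for the example; it does not reproduce the actual contents of convert-worker.js or substitutions.json.

```javascript
// HYPOTHETICAL illustration only: these names and data shapes are assumptions,
// not the real contents of convert-worker.js or substitutions.json.

// A "general" fix is a rule applied to every chapter.
// This one collapses runs of two or more spaces into a single space.
function applyGeneralFixes(text) {
  return text.replace(/ {2,}/g, " ");
}

// "One-off" fixes are exact before/after pairs tied to a single chapter.
const oneOffFixes = {
  "example-chapter": [{ before: "teh", after: "the" }],
};

function applyOneOffFixes(chapterId, text) {
  let result = text;
  for (const { before, after } of oneOffFixes[chapterId] ?? []) {
    if (!result.includes(before)) {
      // Report entries that no longer match instead of silently skipping them.
      console.warn(`No match for "${before}" in ${chapterId}`);
      continue;
    }
    // A replacer function keeps "$" in `after` from being read as a pattern.
    result = result.replace(before, () => after);
  }
  return result;
}

const raw = "She looked at  teh sky.";
const fixed = applyOneOffFixes("example-chapter", applyGeneralFixes(raw));
console.log(fixed); // prints "She looked at the sky."
```

The split shown here keeps broad rules in code and isolated corrections in data, so a single typo can be fixed by adding one entry rather than writing a new rule.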

This process is designed to be extensible, so if you notice any problems with the original text that you think should be fixed, file an issue to let me know, and we can update the fixup code so that the resulting ebook is improved. (Or better yet, send a pull request!)