kemayo / leech

Turn a story on certain websites into an ebook for convenient reading
MIT License

Made arbitrary sites no longer leak memory and fixed worm epub. #51

Closed IdanDor closed 3 years ago

IdanDor commented 3 years ago

Each Chapter object held a reference to the entire page tree, so RAM usage grew with every chapter downloaded. I reached 2 GB of RAM trying to download all of the Practical Guide into a single epub - and then crashed on the second-to-last chapter with a MemoryError.
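To illustrate the kind of leak described above, here is a minimal sketch. It is not the code from this PR: `Node`, `extract_chapter` and `demo` are hypothetical names, and `Node` only stands in for a parsed HTML node that links back to its parent. The idea is that a chapter which stores plain text, rather than a node, lets the whole page tree be garbage collected.

```python
import gc
import weakref
from dataclasses import dataclass


class Node:
    """Hypothetical stand-in for a parsed HTML node that links back to its parent."""

    def __init__(self, text, parent=None):
        self.text = text
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


@dataclass
class Chapter:
    title: str
    contents: str  # plain text only, no reference into the page tree


def extract_chapter(page_root, title):
    # Copy out the text we need; the returned Chapter holds no Node.
    body = page_root.children[0]
    return Chapter(title=title, contents=str(body.text))


def demo():
    page = Node("whole page")
    Node("chapter body", parent=page)  # kept alive only via page.children
    page_ref = weakref.ref(page)
    chapter = extract_chapter(page, "1.1")
    del page
    gc.collect()  # parent/child links form cycles, so force a collection
    return chapter, page_ref() is None
```

If `Chapter.contents` held the `Node` itself instead of its text, the parent link would keep the whole page alive for as long as the chapter exists, which matches the growth described above.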

Switched Worm to use next_selector so that the chapters are correctly ordered, E.2 is not skipped, and the download no longer crashes on the ?share=twitter URL that was matched before. Fixed Worm titles: all chapters now have a title.
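As a rough sketch of why following a "next" link gives a stable order, the snippet below walks a tiny in-memory site. Everything in it is an assumption for illustration: the `PAGES` data, the `example.invalid` URLs, and the `pick_next` and `walk` helpers are made up, and the query-string check merely stands in for what a CSS selector does in leech.

```python
from urllib.parse import urlsplit

BASE = "https://example.invalid"

# Hypothetical site: URL -> (chapter title, links found on that page).
PAGES = {
    f"{BASE}/e-1": ("E.1", [f"{BASE}/e-1?share=twitter", f"{BASE}/e-2"]),
    f"{BASE}/e-2": ("E.2", [f"{BASE}/e-2?share=twitter", f"{BASE}/e-3"]),
    f"{BASE}/e-3": ("E.3", [f"{BASE}/e-3?share=twitter"]),
}


def pick_next(links):
    """Return the first link that is not a share-widget URL, or None."""
    for link in links:
        if "share=" not in urlsplit(link).query:
            return link
    return None


def walk(start):
    """Follow next links from start, collecting titles in reading order."""
    titles, seen, url = [], set(), start
    while url is not None and url not in seen:
        seen.add(url)
        title, links = PAGES[url]
        titles.append(title)
        url = pick_next(links)
    return titles
```

Because each page names its own successor, no chapter in the chain is skipped, and share links are never treated as chapters.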

If you are interested, I can also include JSON examples for Pact, Pale, Twig, the entire Guide and Unsong.