kemayo / leech

Turn a story on certain websites into an ebook for convenient reading
MIT License

Support partial updates #63

Open codetheweb opened 3 years ago

codetheweb commented 3 years ago

My goal is to have a folder of auto-updating ebooks, refreshed by a cron job. I saw that there's a --cache flag, but even with the cache on, every download after the initial one still does a lot of unnecessary processing. Would it be possible to add some kind of partial update mode, where if an .epub already exists it checks for and downloads just the new chapters (usually 1-2) instead of re-assembling the whole book?
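A rough sketch of the check such a partial update mode might start from, not anything leech does today. It assumes chapters are only ever appended, that the remote chapter list keeps a stable order, and that chapter files inside the .epub have names starting with `chapter`. The function names and the naming convention are hypothetical.

```python
import zipfile
from pathlib import Path


def count_existing_chapters(epub_path, prefix="chapter"):
    """Count chapter documents in an existing .epub (an EPUB is a zip archive).

    Assumption: chapter files have basenames starting with `prefix`.
    The layout a real tool produces may differ.
    """
    path = Path(epub_path)
    if not path.exists():
        # No book yet, so everything still needs to be downloaded.
        return 0
    with zipfile.ZipFile(path) as book:
        return sum(
            1 for name in book.namelist()
            if Path(name).name.startswith(prefix)
        )


def chapters_to_fetch(epub_path, remote_chapter_urls):
    """Return only the URLs beyond what the existing book already holds.

    Assumption: chapters are append-only and the list order is stable.
    """
    have = count_existing_chapters(epub_path)
    return remote_chapter_urls[have:]
```

This only answers "what is new"; inserting the downloaded chapters into the existing book (manifest, spine, table of contents) is a separate problem that the sketch does not address.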

I would be happy to add this myself, just want to hear your thoughts and where to begin implementing this.

(Also, you should set up GitHub Sponsors / Buy Me a Coffee or something. Would be happy to throw a few bucks your way and I'm sure other folks would too. 😄)

mathiasfoster commented 3 years ago

+1

kemayo commented 2 years ago

I have been holding off on this because I think it's sort of complicated. To brain-dump what I think the complexities are:

codetheweb commented 2 years ago

Yeah, after opening this I realized it would probably be a lot more complicated than I thought at first. I think using some kind of intermediate storage like an SQLite database or something might work better than trying to read back data from the generated epub.
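To make the intermediate-storage idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout and the function names (`open_store`, `missing_chapters`, `save_chapter`) are assumptions for illustration only and are not part of leech.

```python
import sqlite3


def open_store(db_path=":memory:"):
    """Open (or create) a chapter store. The schema is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS chapters (
            story_url   TEXT NOT NULL,
            chapter_url TEXT NOT NULL,
            title       TEXT,
            html        TEXT,
            PRIMARY KEY (story_url, chapter_url)
        )
        """
    )
    return conn


def missing_chapters(conn, story_url, remote_chapter_urls):
    """Return the remote chapter URLs that are not stored yet, in order."""
    rows = conn.execute(
        "SELECT chapter_url FROM chapters WHERE story_url = ?", (story_url,)
    )
    stored = {row[0] for row in rows}
    return [url for url in remote_chapter_urls if url not in stored]


def save_chapter(conn, story_url, chapter_url, title, html):
    """Insert or overwrite one chapter; `with conn` commits the transaction."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO chapters VALUES (?, ?, ?, ?)",
            (story_url, chapter_url, title, html),
        )
```

With a store like this, the ebook could be rebuilt from the database after fetching only what `missing_chapters` returns, so nothing has to be parsed back out of the generated epub.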

Given that I really only need to update books once a day at most, I think just using the cache (or scraping directly from the web) works well enough for now.

mathiasfoster commented 2 years ago

I've put together a scraper that works off RSS feeds: it downloads the content, turns it into MOBI, and emails it to my Kindle. Still needs a bit of work before it's ready to be open sourced, unfortunately!
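The scraper itself is not shown in this thread, so the following is only a guess at the feed-polling step, written with the standard library. It assumes a plain RSS 2.0 feed whose items carry a `guid` or `link`; the function name `new_feed_items` and the returned dict shape are hypothetical, and the MOBI conversion and emailing steps are left out.

```python
import xml.etree.ElementTree as ET


def new_feed_items(rss_xml, seen_ids):
    """Return feed items whose identifier is not in `seen_ids`.

    Assumption: each <item> has a <guid>, or failing that a <link>,
    that stays stable between polls.
    """
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        # Prefer the guid; fall back to the link when guid is absent or empty.
        ident = item.findtext("guid") or item.findtext("link")
        if ident and ident not in seen_ids:
            items.append(
                {
                    "id": ident,
                    "title": item.findtext("title", default=""),
                    "link": item.findtext("link", default=""),
                }
            )
    return items
```

The caller would persist the returned `id` values between runs (in a file or database) so each chapter is fetched and sent only once.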

From a user's perspective, if this were ever integrated into leech, it would make more sense for my use case to convert each new chapter into its own EPUB, rather than altering the combined EPUB to integrate the new chapter.