nicohaenggi / SafariBooks-Downloader

A SafariBooksOnline downloader that generates the corresponding .epub books for offline and Kindle reading
https://learning.oreilly.com/
MIT License
514 stars 106 forks

Download images #1

Open agroferia opened 7 years ago

agroferia commented 7 years ago

Hi, great work :) I wanted to ask whether it is possible to embed the images of the book into the epub file?

agroferia commented 7 years ago

Sorry, I saw that the images were downloaded, but somehow calibre does not display them.

agroferia commented 7 years ago

OK, I found out that by default SafariBooks-Downloader fetches the HTML file and generates image tags like: <p class="images/xxxx.jpg" alt="image"></p> It should rather be something like: <p class="image"><img src="images/xxxx.jpg" alt="image"></p>

PS: I have edited the code above 1000 times.
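
For illustration, the rewrite described above could be sketched as a post-processing step over the downloaded HTML. This is only a sketch in Python, not the project's code: the function name `repair_image_tags` is hypothetical, and the pattern assumes the broken tags look exactly like the one quoted above (an image path in `class`, followed by an `alt` attribute, with an empty paragraph body). The `<img>` is written self-closed because EPUB content has to be XHTML.

```python
import re

# Assumption: the broken markup always has this exact shape, i.e. a <p> whose
# class attribute holds the image path, followed by a stray alt attribute.
BROKEN_IMAGE_TAG = re.compile(
    r'<p\s+class="(?P<src>images/[^"]+)"\s+alt="(?P<alt>[^"]*)"\s*>\s*</p>'
)


def repair_image_tags(html: str) -> str:
    """Rewrite each broken paragraph into <p class="image"><img .../></p>."""
    return BROKEN_IMAGE_TAG.sub(
        r'<p class="image"><img src="\g<src>" alt="\g<alt>"/></p>',
        html,
    )


if __name__ == "__main__":
    print(repair_image_tags('<p class="images/xxxx.jpg" alt="image"></p>'))
```

Paragraphs that do not match the pattern are left untouched, so running this over a whole chapter should be harmless if the assumption about the tag shape turns out to be wrong.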

nicohaenggi commented 7 years ago

@agroferia Hi, thanks for hitting me up, I've been away the past few weeks. Yeah, exactly: I still lack a good HTML-to-XHTML converter, which the EPUB specification requires. The problem with SafariBooks is that their HTML is kind of messy, with lots of HTML validation errors and missing closing tags, which makes it very hard to convert properly to XHTML.
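
To make the conversion problem concrete, here is a minimal sketch of a tolerant HTML-to-XHTML pass, written in Python with only the standard library's `html.parser`. It is not what SafariBooks-Downloader does; the names `XhtmlSerializer` and `html_to_xhtml` are hypothetical. It self-closes void elements, quotes and escapes attributes, drops stray closing tags, and closes anything left open. It only aims at well-formed output: it does not apply HTML's implied-end-tag rules (for example, an unclosed `<p>` followed by another `<p>` ends up nested), so the result may still fail EPUB validation.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never have a closing tag in HTML and must be self-closed in XHTML.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class XhtmlSerializer(HTMLParser):
    """Re-serialize tolerant-parsed HTML as well-formed XHTML markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # output fragments, joined at the end
        self.open_tags = []  # stack of elements still waiting for a close tag

    def _format_attrs(self, attrs):
        # XHTML needs a quoted value for every attribute; a bare attribute
        # such as "disabled" becomes disabled="disabled".
        return "".join(
            f' {name}="{escape(value if value is not None else name, quote=True)}"'
            for name, value in attrs
        )

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            self.parts.append(f"<{tag}{self._format_attrs(attrs)}/>")
        else:
            self.parts.append(f"<{tag}{self._format_attrs(attrs)}>")
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        self.parts.append(f"<{tag}{self._format_attrs(attrs)}/>")

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag with no matching open element: drop it
        # Close every element left open inside the one being closed.
        while self.open_tags:
            current = self.open_tags.pop()
            self.parts.append(f"</{current}>")
            if current == tag:
                break

    def handle_data(self, data):
        self.parts.append(escape(data, quote=False))

    def to_xhtml(self, html):
        self.feed(html)
        self.close()
        while self.open_tags:  # close whatever the source never closed
            self.parts.append(f"</{self.open_tags.pop()}>")
        return "".join(self.parts)


def html_to_xhtml(html: str) -> str:
    return XhtmlSerializer().to_xhtml(html)


if __name__ == "__main__":
    print(html_to_xhtml('<div><p>A & B<br><img src="images/x.jpg" alt="image"></div>'))
```

Comments, doctypes, and processing instructions are silently dropped in this sketch, and `<script>` or `<style>` bodies would need separate handling before it could be used on real chapters.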