lorenzodifuccia / safaribooks

Download and generate EPUB of your favorite books from O'Reilly Learning (aka Safari Books Online) library.
Do What The F*ck You Want To Public License

images in pulled book are not showing #302

Closed digitalw00t closed 2 years ago

digitalw00t commented 2 years ago

None of the images in the books I pull show up in the EPUB. In the same directory there is an OEBPS folder with an images folder inside it, and it does contain image files. However, every file is about 30k in size, and the Ubuntu image viewer reports each one as an "unknown" image type.
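
A quick way to check what those files really contain is to look at their first bytes. The sketch below is only a diagnostic aid, not part of safaribooks: the function names are hypothetical, and the idea that the saved files might be error pages rather than image data is an assumption that this thread does not confirm.

```python
from typing import Optional

# Leading "magic" bytes for a few common image formats.
_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return a format name if data starts with a known image signature."""
    for magic, name in _SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def looks_like_html(data: bytes) -> bool:
    """Heuristic: does the payload start like an HTML document?"""
    head = data[:256].lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

To use it, read the first few hundred bytes of a file from the OEBPS images folder in binary mode and pass them to both functions. A file for which `sniff_image_type` returns `None` is not a PNG, JPEG, or GIF, whatever its extension says.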

victormeloufrgs commented 2 years ago

The asset_base_url value the script receives is outdated, which is why the images are not found. The fix is a small change to the script: get the new URL (open a book on the website, right-click an image, and copy the image URL), then replace the use of asset_base_url in the code with a base URL in the new format.
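
As a rough illustration of that step, the sketch below cuts a copied image URL down to a base URL. The helper name and the `/files/` marker are assumptions (the marker is taken from the URLs posted in this thread), and the image path in the example is a placeholder, not a real asset.

```python
def asset_base_from_image_url(image_url: str, marker: str = "/files/") -> str:
    """Return everything up to and including the first marker in image_url."""
    index = image_url.find(marker)
    if index == -1:
        raise ValueError("marker %r not found in %r" % (marker, image_url))
    return image_url[: index + len(marker)]


# Placeholder URL in the format discussed in this thread.
copied = ("https://learning.oreilly.com/api/v2/epubs/"
          "urn:orm:book:9781492086888/files/assets/cover.png")
base = asset_base_from_image_url(copied)
```

If the site changes its URL layout again, the marker would need to change with it, so treat this as a starting point only.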

RenanSPLopes commented 2 years ago

As @victormeloufrgs said, changing the script this way solves the problem. It is not ideal, but it helps:

            new_base_url = "https://learning.oreilly.com/api/v2/epubs/urn:orm:book:9781492086888/files/assets"
            if "images" in next_chapter and len(next_chapter["images"]):
                self.images.extend(urljoin(new_base_url, img_url)
                                   for img_url in next_chapter['images'])

You only need to change the book ID in the URL to the ID of the book you want.

digitalw00t commented 2 years ago

I'll do some testing and see if this fixes it.


bipin-nag commented 2 years ago

My working solution, based on the snippet from @RenanSPLopes. In my case the asset_base_url was slightly different, and I substituted book_id as a parameter instead of hard-coding it:

# Images
asset_base_url = "https://learning.oreilly.com/api/v2/epubs/urn:orm:book:%s/files/" % self.book_id
if "images" in next_chapter and len(next_chapter["images"]):
    self.images.extend(urljoin(asset_base_url, img_url)
                        for img_url in next_chapter['images'])
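
A side note on why the exact form of the base URL matters: `urljoin` resolves a relative path against the base, and a base without a trailing slash has its last path segment replaced. The sketch below shows this with the standard library only. The relative path `images/fig01.png` is a made-up example, since the thread does not show what `next_chapter['images']` actually contains.

```python
from urllib.parse import urljoin

book_id = "9781492086888"  # the ID used earlier in this thread
with_slash = ("https://learning.oreilly.com/api/v2/epubs/"
              "urn:orm:book:%s/files/" % book_id)
without_slash = with_slash.rstrip("/")

relative = "images/fig01.png"  # hypothetical relative image path

# Trailing slash: the relative path is appended under .../files/
joined_with = urljoin(with_slash, relative)

# No trailing slash: the last segment ("files") is replaced
joined_without = urljoin(without_slash, relative)

# A path starting with "/" discards the whole base path
joined_rooted = urljoin(with_slash, "/" + relative)
```

Printing the three results side by side makes it easy to compare them against a URL copied from the browser.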

digitalw00t commented 2 years ago

It looks like the images are in there now. I'm having another issue, which I'll put in a separate issue report. Closing this one. Thanks for the assist.