Epubs don't seem to be very consistent in how they organize their internal file structure.
I haven't figured out a reliable way of finding the image folder. This is what I currently do:
images = soup.find_all('img')
if images:
    for img in images:
        img["loading"] = "lazy"
        filename = img['src']
        filename = filename.replace("../", path+"/")
        if not os.path.exists(filename):
            filename = f"{path}/{img['src']}"
        if not os.path.exists(filename):
            filename = f"{path}/EPUB/media/{img['src']}"
        if not os.path.exists(filename):
            filename = f"{path}/EPUB/images/{img['src']}"
        if not os.path.exists(filename):
            filename = img['src']
            filename = filename.replace("../", path+"/OEBPS/")
Whenever I encounter an epub whose images don't load, I have to open it, look at the folder structure, and then manually add another branch for the new path.
I'm sure there's a better way to search the epub for the image folder itself, one that would also work for file paths I haven't come across yet.
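
One possible direction, as a minimal sketch rather than a definitive fix: instead of guessing folder names, resolve each `src` relative to the document that contains the `<img>` tag, and fall back to a file-name lookup over the whole extracted tree. It assumes the epub has already been unpacked to a directory (the `path` used above). The helper names `build_image_index` and `resolve_image`, the `doc_path` argument, and the extension list are all hypothetical and not part of the original code.

```python
import os
from urllib.parse import unquote, urlsplit

# Assumed set of image extensions; extend it if your epubs use others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".svg", ".webp"}


def build_image_index(root):
    """Map each image's file name to its full path under the extracted epub."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                # Keep the first match if two folders hold the same file name.
                index.setdefault(name, os.path.join(dirpath, name))
    return index


def resolve_image(src, doc_path, index):
    """Return a path on disk for an <img> src, or None if nothing matches."""
    # Drop any query string or fragment and decode %20-style escapes.
    rel = unquote(urlsplit(src).path)
    # A relative src is resolved against the document that contains the tag.
    candidate = os.path.normpath(os.path.join(os.path.dirname(doc_path), rel))
    if os.path.exists(candidate):
        return candidate
    # Fall back to matching on the file name anywhere in the extracted tree.
    return index.get(os.path.basename(rel))
```

In the loop above, this would mean building the index once with `build_image_index(path)` and then calling `resolve_image(img['src'], doc_path, index)` for each image, where `doc_path` is the location on disk of the XHTML file that `soup` was parsed from. The file-name fallback can pick the wrong file when two folders contain images with the same name, so it is only a last resort.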