aerkalov / ebooklib

Python E-book library for handling books in EPUB2/EPUB3 format -
https://ebooklib.readthedocs.io/
GNU Affero General Public License v3.0

broken epub (zip) #260

Open mprivoro opened 2 years ago

mprivoro commented 2 years ago

Hi,

If one or more files inside an EPUB file are corrupted, for example:

Bad CRC-32 for file 'OPS/images/CrackInCreation-14.jpg'

the file cannot be parsed. Is there a way to read it while ignoring the ZIP errors? In my example the error is only in an image; the text and metadata are fine.

aerkalov commented 2 years ago

Hi!

Nothing out of the box, but you could try something like the following. I have not tested it because I can't find a broken ZIP file, but it might help.

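import posixpath as zip_path
import zipfile
import zlib

from ebooklib.epub import EpubReader

class MyReader(EpubReader):
    def read_file(self, name):
        # zf.read() raises KeyError for a missing entry and
        # zipfile.BadZipFile for a bad CRC-32 (zlib.error for damaged
        # compressed data); return empty bytes instead of failing.
        try:
            name = zip_path.normpath(name)
            return self.zf.read(name)
        except (KeyError, zipfile.BadZipFile, zlib.error):
            return b''

reader = MyReader("book.epub", None)

book = reader.load()
reader.process()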
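
Before loading the book, it can help to know which entries in the archive are actually damaged. The following is a minimal sketch that uses only the standard library `zipfile` module and does not touch ebooklib at all. The helper name `find_corrupt_members` is hypothetical and not part of any library. It assumes the archive's central directory is still readable and only individual entries are damaged, as in the "Bad CRC-32" case above.

```python
import zipfile
import zlib


def find_corrupt_members(epub_path):
    """Return the names of archive entries that cannot be read cleanly.

    epub_path may be a filesystem path or a binary file-like object.
    Hypothetical helper for illustration; not part of ebooklib.
    """
    corrupt = []
    with zipfile.ZipFile(epub_path) as zf:
        for info in zf.infolist():
            try:
                # Reading the whole entry forces the CRC-32 check.
                zf.read(info.filename)
            except (zipfile.BadZipFile, zlib.error):
                corrupt.append(info.filename)
    return corrupt


if __name__ == "__main__":
    for name in find_corrupt_members("book.epub"):
        print("corrupt entry:", name)
```

If the list contains only images, as in the report above, the text and metadata should still be readable once the failing reads are skipped.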