aerkalov / ebooklib

Python E-book library for handling books in EPUB2/EPUB3 format -
https://ebooklib.readthedocs.io/
GNU Affero General Public License v3.0

How to handle Epub read encountering "Bad Zip file" error? #228

Open actualizeai opened 3 years ago

actualizeai commented 3 years ago

Hi!

I am using ebooklib to read epub.images files from Project Gutenberg. For most .epub files, I am able to open and read the files fine.

But intermittently, I get the error "Bad Zip file". The traceback is as follows:

  File "/Users/jaideepadhvaryu/.pyenv/versions/3.9.2/lib/python3.9/site-packages/ebooklib/epub.py", line 1686, in _load
    self.zf = zipfile.ZipFile(self.file_name, 'r', compression=zipfile.ZIP_DEFLATED, allowZip64=True)
  File "/Users/jaideepadhvaryu/.pyenv/versions/3.9.2/lib/python3.9/zipfile.py", line 1257, in __init__
    self._RealGetContents()
  File "/Users/jaideepadhvaryu/.pyenv/versions/3.9.2/lib/python3.9/zipfile.py", line 1324, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/jaideepadhvaryu/Documents/PracticeCodes/readepub/epubmeta.py", line 69, in <module>
    ebook = epub.read_epub(epubname)
  File "/Users/jaideepadhvaryu/.pyenv/versions/3.9.2/lib/python3.9/site-packages/ebooklib/epub.py", line 1739, in read_epub
    book = reader.load()
  File "/Users/jaideepadhvaryu/.pyenv/versions/3.9.2/lib/python3.9/site-packages/ebooklib/epub.py", line 1397, in load
    self._load()
  File "/Users/jaideepadhvaryu/.pyenv/versions/3.9.2/lib/python3.9/site-packages/ebooklib/epub.py", line 1688, in _load
    raise EpubException(0, 'Bad Zip file')
ebooklib.epub.EpubException: 'Bad Zip file'

I can understand that there can be problems with epub files.

I would appreciate guidance on how to handle such errors when opening epub files.

Thank you.
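
A minimal sketch of one way to handle this in a batch job: catch the exception per file and carry on, collecting the failures for later inspection. The helper name `read_books` and its `read_fn` parameter are hypothetical and not part of ebooklib. By default it calls `epub.read_epub`, which per the traceback above raises `ebooklib.epub.EpubException` when the archive cannot be opened.

```python
def read_books(paths, read_fn=None):
    """Try to read every path; return (books, failures) instead of stopping at the first bad file.

    `read_fn` is a hypothetical hook for illustration; it defaults to ebooklib's read_epub.
    """
    if read_fn is None:
        # Imported lazily so the helper can also be exercised with a stand-in reader.
        from ebooklib import epub
        read_fn = epub.read_epub

    books = {}
    failures = {}
    for path in paths:
        try:
            books[path] = read_fn(path)
        except Exception as exc:
            # Assumption: a broad catch is acceptable for a batch run.
            # Narrow it to epub.EpubException if only "Bad Zip file" should be skipped.
            failures[path] = exc
    return books, failures
```

After a run such as `books, failures = read_books(paths)`, anything left in `failures` can be logged, re-downloaded, or inspected by hand.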

aerkalov commented 2 years ago

Hard to say, but this is the error you would get if you tried to load a non-EPUB file. Can you unzip that file? Rename the extension to .zip and try to extract it, or unzip it from the command line. If you get an error, then it is not a valid ZIP file (EPUB files are really ZIP files).
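
The same check can be done from Python with the standard library's `zipfile` module, which is handy when many downloads need screening. This is only a sketch: `check_epub_archive` is a hypothetical helper name, and the `mimetype` check reflects the EPUB container convention rather than anything ebooklib is known to enforce.

```python
import zipfile


def check_epub_archive(path):
    """Return None if `path` opens as a ZIP archive, otherwise a short reason string."""
    # is_zipfile() returns False for missing files and for non-ZIP content.
    if not zipfile.is_zipfile(path):
        return "not a ZIP archive"
    with zipfile.ZipFile(path) as zf:
        # testzip() reads every member and returns the first name with a bad CRC, or None.
        corrupt = zf.testzip()
        if corrupt is not None:
            return f"corrupt member: {corrupt}"
        # EPUB containers are expected to carry a 'mimetype' entry.
        if "mimetype" not in zf.namelist():
            return "ZIP archive without an EPUB 'mimetype' entry"
    return None
```

A file that fails the first check is typically an HTML error page or a truncated download saved under an .epub name.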

arealhorse commented 2 years ago

I also encountered this error. The book I used is here

lb803 commented 1 year ago

I have the feeling this might not be an issue with the epub file.

I started an interactive session and was able to create a ZipFile object from the epub file that had been giving me trouble (I used the same arguments as ebooklib, just to be sure).

Python 3.10.6 (main, Nov  2 2022, 18:53:38) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import zipfile
>>> f = zipfile.ZipFile("file.epub", 'r', compression=zipfile.ZIP_DEFLATED, allowZip64=True)
>>> f
<zipfile.ZipFile filename='file.epub' mode='r'>

I tried this both inside and outside the virtual environment where I have ebooklib installed (just to narrow down possible dependency issues); same result.

Epubs created with ebooklib are read just fine.

Do you guys have any thoughts about this?

lb803 commented 1 year ago

OK, I found the solution to my own issue (hopefully this helps others):

The epub.read_epub() method accepts a file path (string) as argument, whereas I was trying to feed it a file object.
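
A small sketch of how to guard against this mix-up, assuming (as described above) that `read_epub` should be given a path string. The helper `as_epub_path` is hypothetical: it accepts a string, a `pathlib.Path`, or an open file and returns the path to pass on.

```python
import os


def as_epub_path(source):
    """Return a filesystem path for `source`, which may be a path or an open file."""
    if isinstance(source, (str, os.PathLike)):
        return os.fspath(source)
    # Files returned by open() expose the path they were opened with as .name.
    name = getattr(source, "name", None)
    if isinstance(name, str):
        return name
    raise TypeError(f"expected a path or a named file object, got {type(source).__name__}")
```

With this in place, `epub.read_epub(as_epub_path(f))` receives a path whether `f` is a string or a file opened with `open()`. In-memory objects such as `io.BytesIO` have no path and raise `TypeError`.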

paweltylman commented 1 year ago

Hello, please consider taking a look at this issue; I would be very thankful. Maybe you can help.

https://www.reddit.com/r/Calibre/comments/10axudh/epub_to_pdf_pleas_help/

c1924959470 commented 1 year ago

Hello! I have received your message and will reply as soon as possible.

boolYikes commented 10 months ago

In my case, the problem was file paths containing escape sequences, i.e. \ followed by spaces. I worked around it by temporarily renaming the files to use underscores and then renaming them back. Edit: A faulty epub file can also cause this. Check your sources!
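
A quick way to rule out path problems like this is to look at the exact string Python receives before blaming the archive. The sketch below uses only the standard library, and `describe_path` is a hypothetical helper name. Printing the path with `repr` makes stray backslashes visible.

```python
from pathlib import Path


def describe_path(raw):
    """Return 'ok' if `raw` points at a non-empty regular file, else a short diagnosis."""
    path = Path(raw).expanduser()
    if not path.exists():
        # repr() shows the string exactly as Python sees it, backslashes included.
        return f"missing: {raw!r}"
    if not path.is_file():
        return f"not a regular file: {raw!r}"
    if path.stat().st_size == 0:
        # An empty file cannot be a ZIP archive.
        return f"empty file: {raw!r}"
    return "ok"
```

A path copied from a shell with backslash-escaped spaces will typically be reported as missing here, since the backslash is a literal character in the Python string.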

kevinsingapore commented 4 months ago

Marking this; I have hit the same issue.

Errors below:

/Users/kevin/Library/Python/3.8/lib/python/site-packages/ebooklib/epub.py:1395: UserWarning: In the future version we will turn default option ignore_ncx to True.
  warnings.warn('In the future version we will turn default option ignore_ncx to True.')
Traceback (most recent call last):
  File "/Users/kevin/Library/Python/3.8/lib/python/site-packages/ebooklib/epub.py", line 1714, in _load
    self.zf = zipfile.ZipFile(self.file_name, 'r', compression=zipfile.ZIP_DEFLATED, allowZip64=True)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/zipfile.py", line 1269, in __init__
    self._RealGetContents()
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/zipfile.py", line 1336, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "read_epub.py", line 66, in <module>
    read_epub(epub_path)
  File "read_epub.py", line 9, in read_epub
    book = epub.read_epub(epub_path)
  File "/Users/kevin/Library/Python/3.8/lib/python/site-packages/ebooklib/epub.py", line 1768, in read_epub
    book = reader.load()
  File "/Users/kevin/Library/Python/3.8/lib/python/site-packages/ebooklib/epub.py", line 1410, in load
    self._load()
  File "/Users/kevin/Library/Python/3.8/lib/python/site-packages/ebooklib/epub.py", line 1716, in _load
    raise EpubException(0, 'Bad Zip file')
ebooklib.epub.EpubException: 'Bad Zip file'

err.log