p0n1 / epub_to_audiobook

EPUB to audiobook converter, optimized for Audiobookshelf
MIT License
888 stars 86 forks

KeyError: "There is no item named 'page_styles.css' in the archive" #26

Open jkcchan opened 7 months ago

jkcchan commented 7 months ago

Hi, this tool is very useful, thanks for working on this!

I've run into a bug with one EPUB I tried to convert. Could the EPUB be malformed?

Thanks

Stack trace:

Traceback (most recent call last):
  File "C:\Users\USER\Documents\Projects\epub_to_audiobook\main.py", line 102, in <module>
    main()
  File "C:\Users\USER\Documents\Projects\epub_to_audiobook\main.py", line 98, in main
    AudiobookGenerator(config).run()
  File "C:\Users\USER\Documents\Projects\epub_to_audiobook\audiobook_generator\core\audiobook_generator.py", line 37, in run
    book_parser = get_book_parser(self.config)
  File "C:\Users\USER\Documents\Projects\epub_to_audiobook\audiobook_generator\book_parsers\base_book_parser.py", line 42, in get_book_parser
    return EpubBookParser(config)
  File "C:\Users\USER\Documents\Projects\epub_to_audiobook\audiobook_generator\book_parsers\epub_book_parser.py", line 19, in __init__
    self.book = epub.read_epub(self.config.input_file)
  File "C:\Users\USER\Documents\Projects\epub_to_audiobook\venv\lib\site-packages\ebooklib\epub.py", line 1768, in read_epub
    book = reader.load()
  File "C:\Users\USER\Documents\Projects\epub_to_audiobook\venv\lib\site-packages\ebooklib\epub.py", line 1410, in load
    self._load()
  File "C:\Users\USER\Documents\Projects\epub_to_audiobook\venv\lib\site-packages\ebooklib\epub.py", line 1722, in _load
    self._load_opf_file()
  File "C:\Users\USER\Documents\Projects\epub_to_audiobook\venv\lib\site-packages\ebooklib\epub.py", line 1679, in _load_opf_file
    self._load_manifest()
  File "C:\Users\USER\Documents\Projects\epub_to_audiobook\venv\lib\site-packages\ebooklib\epub.py", line 1555, in _load_manifest
    ei.content = self.read_file(zip_path.join(self.opf_dir, ei.get_name()))
  File "C:\Users\USER\Documents\Projects\epub_to_audiobook\venv\lib\site-packages\ebooklib\epub.py", line 1417, in read_file
    return self.zf.read(name)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python310\lib\zipfile.py", line 1475, in read
    with self.open(name, "r", pwd) as fp:
  File "C:\Users\USER\AppData\Local\Programs\Python\Python310\lib\zipfile.py", line 1514, in open
    zinfo = self.getinfo(name)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python310\lib\zipfile.py", line 1441, in getinfo
    raise KeyError(
KeyError: "There is no item named 'page_styles.css' in the archive"

p0n1 commented 6 months ago

Thanks for reporting @jkcchan. It could also be a compatibility issue with the EPUB format. I'm interested in testing and fixing it if you can share an example of the problematic EPUB file.

jkcchan commented 6 months ago

Here is an example of an EPUB file that doesn't work: https://drive.google.com/file/d/1r6ODETKs4znHCrT9ZVZe93I4HWy8SJIX/view?usp=drive_link

p0n1 commented 6 months ago

Thanks. Will check it soon.

p0n1 commented 6 months ago

Hi @jkcchan, I took a look at your file.

Your EPUB's manifest declares a page_styles.css, but that file is not actually in the archive.

So, when ebooklib parses it, the file reading fails.

Related code in ebooklib:

https://github.com/aerkalov/ebooklib/blob/1cb3d2c251f82c4702c2aff0ed7aea375babf251/ebooklib/epub.py#L1557
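
For anyone who wants to check a file for this before converting it, here is a minimal sketch that compares the manifest entries against what is really in the ZIP archive. It uses only the standard library and does not go through ebooklib; the function name `find_missing_manifest_items` is made up for illustration, and it assumes a standard `META-INF/container.xml` that points at the OPF file.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import unquote

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def find_missing_manifest_items(epub_path):
    """Return archive paths that the manifest declares but the ZIP lacks.

    Hypothetical helper, not part of ebooklib or epub_to_audiobook.
    """
    with zipfile.ZipFile(epub_path) as zf:
        names = set(zf.namelist())

        # container.xml tells us where the OPF (package) file lives.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.get("full-path")
        opf_dir = posixpath.dirname(opf_path)

        # Manifest hrefs are relative to the directory holding the OPF file.
        package = ET.fromstring(zf.read(opf_path))
        missing = []
        for item in package.findall("opf:manifest/opf:item", OPF_NS):
            href = unquote(item.get("href", ""))
            member = posixpath.normpath(posixpath.join(opf_dir, href))
            if member not in names:
                missing.append(member)
        return missing
```

Running it on the file from this issue should list `page_styles.css`, assuming the manifest entry is the only thing that is off.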

Was your book exported from Apple Books? A friend of mine told me that EPUB files exported from Apple Books can have formatting issues.

I'm not sure whether this is a common problem with EPUB files. If it is, I might need to find a way to support it.
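
In the meantime, one possible workaround (a sketch only, not something the tool does today) is to write a repaired copy of the EPUB that contains an empty placeholder for each missing file, so the manifest and the archive agree again. The function name `copy_epub_with_placeholders` is hypothetical, and the caller has to supply the missing archive paths, for example `page_styles.css` from the traceback above. An empty placeholder is only reasonable for something like a stylesheet; a missing chapter file would need real content.

```python
import zipfile


def copy_epub_with_placeholders(src_path, dst_path, missing_names):
    """Copy an EPUB to dst_path, adding an empty file for each missing name.

    Hypothetical helper: missing_names are archive paths exactly as they
    appear in the KeyError message, e.g. ["page_styles.css"].
    """
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        existing = set(src.namelist())

        # Copy entries in their original order, reusing each ZipInfo so the
        # per-entry compression type (e.g. a stored "mimetype") is kept.
        for info in src.infolist():
            dst.writestr(info, src.read(info.filename))

        # Add an empty placeholder for anything the manifest expects.
        for name in missing_names:
            if name not in existing:
                dst.writestr(name, b"")
```

Usage would look like `copy_epub_with_placeholders("book.epub", "book_fixed.epub", ["page_styles.css"])`, then passing `book_fixed.epub` to the converter. The original file is left untouched.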