aerkalov / ebooklib

Python E-book library for handling books in EPUB2/EPUB3 format -
https://ebooklib.readthedocs.io/
GNU Affero General Public License v3.0

Warning on opening ebook #307

Closed by aozgaa 7 months ago

aozgaa commented 7 months ago

Given this sample code:

from ebooklib import epub
epub_file = "somefile.epub"
book = epub.read_epub(epub_file, {"ignore_ncx": True})

I get this warning:

c:\Users\...\venv\lib\site-packages\ebooklib\epub.py:1423: FutureWarning: This search incorrectly ignores the root element, and will be fixed in a future version.  If you rely on the current behaviour, change it to './/xmlns:rootfile[@media-type]'
  for root_file in tree.findall('//xmlns:rootfile[@media-type]', namespaces={'xmlns': NAMESPACES['CONTAINERNS']}):

And here is a stack trace:

  File "c:\Users\...\myscript.py", line 9, in extract_paragraphs
    book = epub.read_epub(epub_file, {"ignore_ncx": True})
  File "c:\Users\...\venv\lib\site-packages\ebooklib\epub.py", line 1768, in read_epub
    book = reader.load()
  File "c:\Users\...\venv\lib\site-packages\ebooklib\epub.py", line 1410, in load
    self._load()
  File "c:\Users\...\venv\lib\site-packages\ebooklib\epub.py", line 1721, in _load
    self._load_container()
  File "c:\Users\...\venv\lib\site-packages\ebooklib\epub.py", line 1423, in _load_container
    for root_file in tree.findall('//xmlns:rootfile[@media-type]', namespaces={'xmlns': NAMESPACES['CONTAINERNS']}):

As the warning indicates, it can be silenced by making the current behavior explicit: change the XPath at epub.py:1423 to start with a "." (that is, './/xmlns:rootfile[@media-type]'). I am not sure whether the current behavior is intended, though.
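For reference, here is a minimal sketch that reproduces the same kind of warning outside of ebooklib. It rests on some assumptions: it uses the standard library's xml.etree.ElementTree instead of the parser ebooklib actually uses (the stdlib tree also emits a FutureWarning for a path that starts with "/", though its message text may differ), and the container XML, the namespace URI constant, the "c" prefix and the find_rootfiles helper are all made up for illustration.

```python
import warnings
import xml.etree.ElementTree as ET

# Assumed namespace for META-INF/container.xml; not copied from ebooklib.
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"

# Hypothetical, hand-written container document for the demonstration.
CONTAINER_XML = f"""<?xml version="1.0"?>
<container version="1.0" xmlns="{CONTAINER_NS}">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""


def find_rootfiles(path):
    """Run `path` against the tree; return (full-path values, warning categories)."""
    tree = ET.ElementTree(ET.fromstring(CONTAINER_XML))
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        found = tree.findall(path, namespaces={"c": CONTAINER_NS})
    return [el.get("full-path") for el in found], [w.category for w in caught]


# Path as written at epub.py:1423 (absolute, starts with "/").
old_paths, old_warnings = find_rootfiles("//c:rootfile[@media-type]")
# Path suggested by the warning (relative, starts with ".").
new_paths, new_warnings = find_rootfiles(".//c:rootfile[@media-type]")

print("old:", old_paths, [c.__name__ for c in old_warnings])
print("new:", new_paths, [c.__name__ for c in new_warnings])
```

In this sketch both paths match the same rootfile element. Only the form with the leading "/" triggers the warning, which is why adding the "." keeps today's results while silencing it.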

Platform: Windows 10
Python version: 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]

aerkalov commented 7 months ago

Thanks! I merged the PR where this is fixed.

lyz-code commented 3 months ago

Hi @aerkalov, can you please make a release on PyPI with the fix?