aerkalov / ebooklib

Python E-book library for handling books in EPUB2/EPUB3 format -
https://ebooklib.readthedocs.io/
GNU Affero General Public License v3.0

I want to know whether there is cache generation in this process #269

Open Gaoyongxian666 opened 1 year ago

Gaoyongxian666 commented 1 year ago

I don't want my computer to save cache files without my knowledge, since they can take up a lot of disk space. So, does this process generate any files on the computer?

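import ebooklib
import html2text
from ebooklib import epub


def process(epub_path):
    """Return the text of every document item in the EPUB as one string."""
    book = epub.read_epub(epub_path)
    separator = "\n------------------------------------------\n\n"
    parts = []
    for item in book.get_items():
        # Only document items carry readable text; skip images, CSS, and so on.
        if item.get_type() == ebooklib.ITEM_DOCUMENT:
            html = item.get_content().decode("utf8")
            parts.extend(("NAME : " + item.get_name(), separator, html2text.html2text(html), "\n"))
    return "".join(parts)
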
aerkalov commented 1 year ago

Good question, and I don't know the answer. We use the standard Python zipfile library for reading EPUB files. Looking at the latest implementation of that library, I wouldn't say it does, but I wouldn't put my hand in the fire for it, and quick Google searches give no clear answer on how it handles big zip files. You can check it yourself: https://github.com/python/cpython/blob/3.11/Lib/zipfile.py
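
Besides reading the source, one way to check this empirically is to watch a directory while zipfile reads an archive. The following is only a minimal sketch under stated assumptions: the helper names `snapshot` and `files_created_while_reading` are made up for illustration, the archive is a small in-memory stand-in and not a real EPUB, and the check only catches files that would be created through Python's tempfile module. It says nothing about what ebooklib or html2text do on top of zipfile.

```python
import io
import os
import tempfile
import zipfile


def snapshot(directory):
    """Return the set of entry names currently in `directory`."""
    return set(os.listdir(directory))


def files_created_while_reading(zip_bytes, watch_dir):
    """Read every member of a ZIP archive and return new entries in `watch_dir`."""
    before = snapshot(watch_dir)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            archive.read(name)  # decompresses the member and returns bytes
    return snapshot(watch_dir) - before


# Build a stand-in archive in memory (hypothetical content, not a real EPUB).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("chapter1.xhtml", "<html><body>" + "text " * 10000 + "</body></html>")

# Point the tempfile module at a private directory, so that anything created
# through tempfile during the read would show up there and nowhere else.
with tempfile.TemporaryDirectory() as private_tmp:
    original_tmp = tempfile.tempdir
    tempfile.tempdir = private_tmp
    try:
        new_entries = files_created_while_reading(buffer.getvalue(), private_tmp)
    finally:
        tempfile.tempdir = original_tmp  # restore the previous setting

print("new entries in the watched directory:", new_entries)
```

To test a real book, the same idea could be applied by taking the snapshots around a call to `process(epub_path)` and watching whichever directories you are worried about. An empty result for one file is evidence, not proof, that nothing is written to disk.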