openzim / python-libzim

Libzim binding for Python: read/write ZIM files in Python
https://pypi.org/project/libzim/
GNU General Public License v3.0
66 stars 23 forks source link

Exception inside contextmanager should cancel the zim creation #42

Open rgaudin opened 4 years ago

rgaudin commented 4 years ago

When using the context-manager, should a (non libzim) error occur, the exception is raised but the finalization is done on the Creator as if everything went well.

with libzim.writer.Creator(
    "test_x07.zim", main_page="A/index.html", index_language="eng", min_chunk_size=2048,
) as zfile:
    zfile.add_article(DumbArticle("index.html", "hello", ARTICLE_MIME, "bonjour"))
    raise Exception("outch")
    zfile.add_article(DumbArticle("page2.html", "hello2", ARTICLE_MIME, "bonjour2"))

T:0; A:4; RA:0; CA:4; UA:0; FA:0; IA:1; C:0; CC:0; UC:0; WC:1
T:0; Waiting for workers
T:0; ResolveRedirectIndexes
Resolve redirect
T:0; Set article indexes
set index
T:0; Resolve mimetype
T:0; create title index
T:0; 6 title index created
T:0; 2 clusters created
T:0; write zimfile :
T:0;  write mimetype list
T:0;  write directory entries
T:0;  write url prt list
T:0;  write title index
T:0;  write cluster offset list
T:0;  write header
T:0;  write checksum
T:0; rename tmpfile to final one.
T:0; finish
Traceback (most recent call last):
  File "./demo.py", line 55, in <module>
    raise Exception("outch")
Exception: outch

This results in a valid ZIM file on the filesystem but lacking the second article of course.

I think the expected behavior would be to cancel the ZIM creation and remove temporary files.

@mgautierfr @kelson42 ?

mgautierfr commented 4 years ago

Yes. However, there is no proper way to cancel the zim creation on libzim side itself.

I've just open a issue on libzim side : openzim/libzim#347

rgaudin commented 3 years ago

@mgautierfr, I tested the thread_exception branch on libzim but I see no difference in how an exception in a ContentProvider is handled"

def test_creator_exception_item(fpath):
    class AContentProvider:
        def get_size(self):
            return 1

        def feed(self):
            raise FileNotFoundError("missing file")

    class AnItemWithCP(StaticItem):
        def get_contentprovider(self):
            return AContentProvider()

    with Creator(fpath) as c:
        with pytest.raises(RuntimeError, match="FileNotFoundError"):
            c.add_item(AnItemWithCP())

It crashes with an output similar to:

Resolve redirect
set index
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: Traceback (most recent call last):
  File "libzim/wrapper.pyx", line 127, in libzim.wrapper.blob_cy_call_fct
    blob = func()
  File "/Users/reg/src/pylibzim/tests/test_libzim_creator.py", line 132, in feed
    raise FileNotFoundError("missing file")
FileNotFoundError: missing file

Fatal Python error: Aborted

What's different though is that a lot of the tests are now failing with the obscure RuntimeError: Creator is in error state yet still crashing the process.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.