markokr / rarfile

Python module for RAR archive reading
https://pypi.org/project/rarfile/
ISC License
246 stars 55 forks source link

Documentation/Implementation: UNRAR_TOOL handling broken in certain situations #47

Closed pannal closed 5 years ago

pannal commented 6 years ago

Hey,

I've discovered a nasty bug/documentation issue with the UNRAR_TOOL variable.

Note: As this is a library, the user of the library is expected to import rarfile and set UNRAR_TOOL afterwards, before using it, instead of manipulating the actual rarfile.py itself.

rarfile._check_unrar_tool is called explicitly at the end of the module and consists of:

def _check_unrar_tool():
    global UNRAR_TOOL, OPEN_ARGS, EXTRACT_ARGS, TEST_ARGS
    try:
        # does UNRAR_TOOL work?
        custom_check([ORIG_UNRAR_TOOL], True)

        UNRAR_TOOL = ORIG_UNRAR_TOOL
        OPEN_ARGS = ORIG_OPEN_ARGS
        EXTRACT_ARGS = ORIG_EXTRACT_ARGS
        TEST_ARGS = ORIG_TEST_ARGS
    except RarCannotExec:
        try:
            # does ALT_TOOL work?
            custom_check([ALT_TOOL] + list(ALT_CHECK_ARGS), True)
            # replace config
            UNRAR_TOOL = ALT_TOOL
            OPEN_ARGS = ALT_OPEN_ARGS
            EXTRACT_ARGS = ALT_EXTRACT_ARGS
            TEST_ARGS = ALT_TEST_ARGS
        except RarCannotExec:
            # no usable tool, only uncompressed archives work
            return False
    return True

When UNRAR_TOOL is set specifically, that is done AFTER this function has run at least once, because of the necessary rarfile import.

When the initial check for unrar didn't work, all the variables are set to the ALT_-ones, if bsdtar was found.

Thus, after setting rarfile.UNRAR_TOOL, rarfile tries to execute that executable using the ALT_ arguments meant for bsdtar.

So in order for UNRAR_TOOL to work properly, one has to call rarfile._check_unrar_tool() again, after setting UNRAR_TOOL and ORIG_UNRAR_TOOL.

The ALT fallback is the culprit, as it always overrides the necessary parameters for unrar, if bsdtar is available, so calling _check_unrar_tool() after setting UNRAR_TOOL isn't solving the problem, if UNRAR_TOOL didn't work in the first place. And it's not exposed as a public function anyways.

I think instead custom_check should be used after setting UNRAR_TOOL, and then all args need to be reset to their corresponding ORIG_-value, if its return value is True (by the library user).

Another option would be for _check_unrar_tool() to only set UNRAR_TOOL to ORIG_UNRAR_TOOL, if UNRAR_TOOL was not modified and doesn't differ from ORIG_UNRAR_TOOL.

Once UNRAR_TOOL is modified, the user does most likely not want to fall back to bsdtar.

This is largely a documentation issue to be honest.

pannal commented 6 years ago

This happened on OSX, when unrar wasn't executable by the app that uses rarfile, because of OSX's SIP, but bsdtar was.

Andrew-Morozko commented 6 years ago

Just independently came to the same conclusion. For me unrar was installed in /usr/local/bin/, which in some conditions (like in AppleScript's do shell script) wasn't inserted into $PATH.

My Analysis: 1) The root cause of this problem is probably the fact that it is possible to change UNRAR_TOOL and not change other three dependent configurations. If NORMAL_CONFIG and ALT_CONFIG were, for example, namedtuples, that got assigned to global CURRENT_CONFIG this bug would be impossible. 2) Contributing cause is that changing UNRAR_TOOL from user-land is actually promoted as a normal way of interacting with the library. This is like doing surgery, and this bug just showed that UNRAR_TOOL was invisibly connected to the other config parts that start malfunctioning if we poke UNRAR_TOOL.

My contribution: 1) I sadly haven't got enough time to properly centralize all configuration, and it would be quite moronic of me to try to change the whole configuration system, after one poke at it lead to this bug. 2) This is quite easy to fix, but it's only a stop-gap measure: remove UNRAR_TOOL from public API/documentation and add a function to correctly (I hope! I just print-debugged 3k lines of unknown code, so double-check me) modify it, like this:

def set_unrar_path(new_unrar_path):
    """If new_unrar_path passes test - changes UNRAR_TOOL config"""
    global UNRAR_TOOL, OPEN_ARGS, EXTRACT_ARGS, TEST_ARGS

    custom_check([new_unrar_path])
    UNRAR_TOOL = new_unrar_path
    OPEN_ARGS = ORIG_OPEN_ARGS
    EXTRACT_ARGS = ORIG_EXTRACT_ARGS
    TEST_ARGS = ORIG_TEST_ARGS

But once again, in the future it's inevitable, that something somewhere will change one piece of configuration without updating the rest. Nevertheless, set_unrar_path is the monkypatch for this bug that I would use for now.

Thanks for the library, it's quite useful!

P.S.: I'm not blaming or judging anyone, English is not my first language, so assume no malice.

markokr commented 5 years ago

I reworked the tool handling, should be more robust now.

Also removed the module options from doc, it should not be common case that they are used.

gene1wood commented 3 years ago

This was reworked in bd6fbfc96dac1f66854e6807b7cfd4b5825f3ad6