colesbury / nogil

Multithreaded Python without the GIL
Other
2.88k stars 107 forks source link

Using PyInstaller with nogil python #134

Open dzejkob-b opened 5 months ago

dzejkob-b commented 5 months ago

Hello,

I have nogil-python compiled under Windows with an application built in Cython - and everything works fine. But there are problems when I tried to wrap the application using PyInstaller, the following error occurred:

Traceback (most recent call last):
  File "C:\Users\dzejkob\Documents\nogil-python\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\dzejkob\Documents\nogil-python\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\__main__.py", line 134, in <module>
    run()
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\__main__.py", line 108, in run
    parser = generate_parser()
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\__main__.py", line 76, in generate_parser
    import PyInstaller.building.build_main
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\building\build_main.py", line 27, in <module>
    from PyInstaller.building.api import COLLECT, EXE, MERGE, PYZ
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\building\api.py", line 27, in <module>
    from PyInstaller.archive.writers import CArchiveWriter, ZlibArchiveWriter
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\archive\writers.py", line 28, in <module>
    from PyInstaller.building.utils import fake_pyc_timestamp, get_code_object, strip_paths_in_code
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\building\utils.py", line 31, in <module>
    from PyInstaller.depend.bindepend import match_binding_redirect
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\depend\bindepend.py", line 26, in <module>
    from PyInstaller.depend import dylib, utils
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\depend\utils.py", line 28, in <module>
    from PyInstaller.depend import bytecode
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\depend\bytecode.py", line 88, in <module>
    _call_function_bytecode = bytecode_regex(
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\depend\bytecode.py", line 55, in bytecode_regex
    pattern = re.sub(
  File "C:\Users\dzejkob\Documents\nogil-python\lib\re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\depend\bytecode.py", line 57, in <lambda>
    lambda m: _instruction_to_regex(m[1].decode()),
  File "C:\Users\dzejkob\Documents\nogil-python\lib\site-packages\PyInstaller\depend\bytecode.py", line 38, in _instruction_to_regex
    return re.escape(bytes([dis.opmap[x]]))
KeyError: 'EXTENDED_ARG'

it looks like a problem with nogil-python, because the package is completed fine under pyinstaller with standard cpython. But i tried to make these dumb fixes:

In nogil-python/Lib/site-packages/PyInstaller/depend/bytecode.py added this under imports:

dis.opmap['EXTENDED_ARG'] = 144

Than added bound check in this function:

def load(raw: bytes, code: CodeType) -> str:
    """
    Parse an (extended) LOAD_xxx instruction.
    """
    # Get the enumeration.
    index = extended_arguments(raw)

    # Work out what that enumeration was for (constant/local var/global var).

    # If the last instruction byte is a LOAD_FAST:
    if raw[-2] == dis.opmap["LOAD_FAST"]:
        # Then this is a local variable.
        return code.co_varnames[index]
    # Or if it is a LOAD_CONST:
    if raw[-2] == dis.opmap["LOAD_CONST"]:
        # Then this is a literal.
        return code.co_consts[index]
    # Otherwise, it is a global name.

    # add check
    if index < len(code.co_names):
        return code.co_names[index]
    else:
        return None

    #!!return code.co_names[index]

Than updated this:

def loads(raw: bytes, code: CodeType) -> list:
    """
    Parse multiple consecutive LOAD_xxx instructions. Or load() in a for loop.

    May be used to unpack a function's parameters or nested attributes ``(foo.bar.pop.whack)``.
    """

    # add check
    res = []

    for i in _extended_arg_bytecode.findall(raw):
        load_res = load(i, code);

        if load_res is not None:
            res.append(load_res)

    return res

    #!!return [load(i, code) for i in _extended_arg_bytecode.findall(raw)]

And in function:

def function_calls(code: CodeType) -> list:

added check:

    if function_root is not None: # << add check
        methods = loads(methods, code)
        function = ".".join([function_root] + methods)

and in file nogil-python/Lib/dis.py added bound check in this function:

def _get_const_info(const_index, const_list):
    """Helper to get optional details about const references

       Returns the dereferenced constant and its repr if the constant
       list is defined.
       Otherwise returns the constant index and its repr().
    """
    argval = const_index
    if const_list is not None:
        if const_index < len(const_list): # << add check
            argval = const_list[const_index]
        else:
            return None, None

    return argval, repr(argval)

after these fixes, PyInstaller starts and successfully packages the application into a working binary. However, it is worth mentioning that almost the entire application is written in Cython and everything is compiled in the Cython module. I'm not sure if these fixes will work in a pure Python application - apparently it omits some calls or arguments.

Do you have an idea how to solve this problem properly?

colesbury commented 5 months ago

Do you have an idea how to solve this problem properly?

Solving this properly would require that either PyInstaller can handle the bytecodes in the nogil fork, or that the nogil fork is modified to use CPython 3.9 bytecodes. I don't think either of those things are likely to happen. I don't think that it's worth investing in either this point now that we are focused on CPython 3.13.

colesbury commented 5 months ago

FWIW, the free-threaded (i.e, "nogil) build of 3.13 uses the same bytecodes as the default build, and so won't have this problem. (Assuming that PyInstaller can support CPython 3.13)

dzejkob-b commented 4 months ago

So if I understand correctly, the latest version of cpython 3.13 supports to natively turn off gil? I found this:

https://github.com/python/cpython/issues/108223

pyinstaller have some support: "PyInstaller will also not work with beta releases of Python 3.13.".

Can you please clarify if there are any significant differences between nogil in this fork and python 3.13 build with --disable-gil ? (except, of course, for differences between versions of Python)

colesbury commented 4 months ago

So if I understand correctly, the latest version of cpython 3.13 supports to natively turn off gil?

We are working towards that, but it is not ready yet.

Can you please clarify if there are any significant differences between nogil in this fork and python 3.13 build with --disable-gil ?

They have the same goal (no GIL) but otherwise the implementations are pretty different. A lot of Python internals changed between Python 3.9 and Python 3.13. For example, the Python bytecodes, which look to be the source of the PyInstaller incompatibility are different in Python 3.9, Python 3.13, and this nogil fork. However, they should be the same between the --disable-gil and default builds of CPython 3.13