python / mypy

Optional static typing for Python
https://www.mypy-lang.org/

Stubgen fails with timeout for locust #9103

Open ska-kialo opened 4 years ago

ska-kialo commented 4 years ago

Hi, I'm trying to generate type stubs for Locust and stubgen fails with a timeout (see details below). I was wondering if someone has any insight into what causes this or how to fix it.

hoefling commented 4 years ago

This looks like an issue with mypy.moduleinspect.ModuleInspect. To reproduce:

$ python -c "from mypy.moduleinspect import ModuleInspect; m = ModuleInspect(); m.get_package_properties('locust')"
  File "<string>", line 1, in <module>
  File "/Users/hoefling/projects/private/locust/.direnv/python-3.8.5/lib/python3.8/site-packages/mypy/moduleinspect.py", line 137, in get_package_properties
    res = self._get_from_queue()
  File "/Users/hoefling/projects/private/locust/.direnv/python-3.8.5/lib/python3.8/site-packages/mypy/moduleinspect.py", line 163, in _get_from_queue
    raise RuntimeError('Timeout waiting for subprocess')
RuntimeError: Timeout waiting for subprocess

Workaround for stubgen: avoid the -p flag and pass the package directory instead, e.g.

$ python -c "import locust; print(locust.__path__[0])" | xargs stubgen
Processed 50 modules
Generated files under out/locust/

Avasam commented 2 years ago

Same with PyInstaller.

PS C:\Users\Avasam\Documents\Git\typeshed> stubgen -o stubs\pyinstaller -p PyInstaller
Traceback (most recent call last):
  File "C:\Program Files\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Program Files\Python39\Scripts\stubgen.exe\__main__.py", line 7, in <module>
  File "mypy\stubgen.py", line 1715, in main
  File "mypy\stubgen.py", line 1573, in generate_stubs
  File "mypy\stubgen.py", line 1334, in collect_build_targets
  File "mypy\stubgen.py", line 1378, in find_module_paths_using_imports
  File "C:\Program Files\Python39\lib\site-packages\mypy\stubutil.py", line 149, in find_module_path_and_all_py3
    mod = inspect.get_package_properties(module)
  File "mypy\moduleinspect.py", line 138, in get_package_properties
  File "mypy\moduleinspect.py", line 164, in _get_from_queue
RuntimeError: Timeout waiting for subprocess

ben9923 commented 2 years ago

I think there are two separate issues here.

Locust issue (caused by gevent+multiprocessing)

Locust (which may already ship inline type annotations, making stubs unnecessary) calls gevent.monkey.patch_all() on import. Among other things, this patches the threading module, which is apparently known to be incompatible in some way with multiprocessing (which moduleinspect uses to send jobs to a worker). I found this while tracing where exactly it hangs: the program hangs completely inside the threading module while waiting for a thread to start, even though the thread does start (the Event is set, but wait() never returns).

# threading.Thread.start():
...
self._started.wait()

This thread is an internal multiprocessing.Queue thread used to push data through pipes. Even after I modified the threading implementation to time out after a certain period, the call did not return. I did not check exactly what gevent patches, so some of the changes I made to the threading module may have had no effect at all.

Possible Solutions/Workarounds:

  1. Commenting out this line at the top of locust/__init__.py works around the problem:
    monkey.patch_all()

    I'm not sure whether locust is ever meant to be imported as a library; if it is, this import-time side effect may be unwanted and worth reporting upstream.

  2. Swapping the moduleinspect worker implementation for something that doesn't use multiprocessing. I'm not sure it's worth it (or that this is 100% the root cause).
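
As a rough illustration of option 2, the sketch below imports a package in a fresh interpreter started with subprocess instead of going through multiprocessing, and applies a timeout to the whole child process. It is not mypy's implementation: inspect_package, _CHILD_CODE and the keys of the returned dict are hypothetical names, and whether this approach actually avoids the gevent hang has not been verified here.

```python
import json
import subprocess
import sys

# Code run in the child interpreter: import the module named in argv[1]
# and print a few of its properties as one JSON line on stdout.
_CHILD_CODE = """
import importlib, json, sys
mod = importlib.import_module(sys.argv[1])
props = {
    "name": mod.__name__,
    "file": getattr(mod, "__file__", None),
    "path": list(getattr(mod, "__path__", None) or []),
    "all": list(getattr(mod, "__all__", None) or []),
}
print(json.dumps(props))
"""


def inspect_package(module_name, timeout=30.0):
    """Import module_name in a separate interpreter and return its properties."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD_CODE, module_name],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        raise RuntimeError(f"Timeout importing {module_name!r}") from None
    if proc.returncode != 0:
        raise RuntimeError(f"Import of {module_name!r} failed:\n{proc.stderr}")
    # The imported module may print to stdout itself, so parse only the last line.
    return json.loads(proc.stdout.strip().splitlines()[-1])
```

Calling inspect_package("json") returns a dict describing the standard-library json package. A slow or hanging import surfaces as a RuntimeError once the timeout expires, because subprocess.run kills the child before raising.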

PyInstaller issue (caused by looong imports)

It appears that some PyInstaller modules simply take a very long time to import, namely the Django hook-related ones. The timeout was increased in #13109, which I believe is enough to keep those PyInstaller modules from breaking stubgen.

@Avasam To get a better idea of which module caused it, add -v when invoking stubgen; the last module listed should be the culprit. You can then import it manually in a Python interpreter to confirm. As some modules in PyInstaller have unconventional names, import them like this:

import importlib
# Pass the dotted module name as a string, e.g.:
importlib.import_module('PyInstaller.hooks.hook-django')

The next mypy release (0.980) should include the increased timeout. You can try it now by installing the latest development wheels built from the master branch:

pip install --upgrade --find-links https://github.com/mypyc/mypy_mypyc-wheels/releases/latest mypy

Avasam commented 2 years ago

@ben9923 Thanks for the explanation. PyInstaller still fails on:

[...]
Trying to import 'PyInstaller.hooks.hook-django' for runtime introspection
689 WARNING: Failed to collect submodules for 'django.contrib.postgres.forms' because importing 'django.contrib.postgres.forms' raised: ModuleNotFoundError: No module named 'psycopg2'
8815 INFO: Determining a mapping of distributions to packages...
Traceback (most recent call last):
  File "C:\Program Files\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Program Files\Python39\Scripts\stubgen.exe\__main__.py", line 7, in <module>
  File "mypy\stubgen.py", line 1831, in main
  File "mypy\stubgen.py", line 1652, in generate_stubs
  File "mypy\stubgen.py", line 1418, in collect_build_targets
  File "mypy\stubgen.py", line 1452, in find_module_paths_using_imports
  File "C:\Program Files\Python39\lib\site-packages\mypy\stubutil.py", line 83, in find_module_path_and_all_py3
    mod = inspect.get_package_properties(module)
  File "mypy\moduleinspect.py", line 138, in get_package_properties
  File "mypy\moduleinspect.py", line 164, in _get_from_queue
RuntimeError: Timeout waiting for subprocess

importlib.import_module('PyInstaller.hooks.hook-django') fails after ~66s with:

714 WARNING: Failed to collect submodules for 'django.contrib.postgres.forms' because importing 'django.contrib.postgres.forms' raised: ModuleNotFoundError: No module named 'psycopg2'
8282 INFO: Determining a mapping of distributions to packages...
65576 INFO: Packages required by django:
['asgiref', 'sqlparse', 'tzdata']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\Python39\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "C:\Program Files\Python39\lib\site-packages\PyInstaller\hooks\hook-django.py", line 27, in <module>
    root_dir = django.django_find_root_dir()
  File "C:\Program Files\Python39\lib\site-packages\PyInstaller\utils\hooks\django.py", line 54, in django_find_root_dir
    manage_py = CONF['main_script']
KeyError: 'main_script'

(Even if I fix that error, the import still takes about that long.)

Could stubgen just skip failing imports with a warning instead of aborting the whole run? And/or does max_iter need to be bumped up further?

ben9923 commented 2 years ago

I think it does warn on import errors and proceed with stub generation. IIRC I got that error too (after less than 30s) and it kept going. There's no timeout logic for the imports themselves, though, so I guess that could be added to the worker implementation.
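
The "warn and keep going" behaviour discussed here can be sketched as follows. This is not stubgen's actual code; import_or_skip is a hypothetical helper. It only covers imports that raise. An import that hangs would still need a timeout around a separate process.

```python
import importlib
import warnings


def import_or_skip(module_names):
    """Import each module; warn and continue when an import raises."""
    loaded = {}
    skipped = []
    for name in module_names:
        try:
            loaded[name] = importlib.import_module(name)
        except Exception as exc:  # import-time code can raise anything, e.g. KeyError
            warnings.warn(f"Skipping {name!r}: {type(exc).__name__}: {exc}")
            skipped.append(name)
    return loaded, skipped
```

With this shape, a module such as 'PyInstaller.hooks.hook-django' that raises KeyError at import time would end up in skipped, and the remaining modules would still be processed.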

66s is a lot. I'm not sure that raising the timeout beyond the 30s used in 0.980 would be good UX when an import hangs forever. @hauntsaninja any thoughts on that? 😄

Since it takes so long, you could further trace the imports in that Django module to find a more specific cause. It could be worth opening a PR against PyInstaller, since this is a terrible side effect for a plain import... That said, no idea if someone should ever directly import it 🤷‍♂️
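
One way to do that tracing is to time the import directly. The helper below is a minimal sketch (timed_import is a hypothetical name); it reports the elapsed time even when the import raises, which matters for modules that fail partway through. For a per-module breakdown, CPython's own -X importtime option (Python 3.7+) is another route.

```python
import importlib
import time


def timed_import(module_name):
    """Return (seconds, error): how long the import took and what it raised, if anything."""
    start = time.perf_counter()
    error = None
    try:
        importlib.import_module(module_name)
    except Exception as exc:
        error = exc
    return time.perf_counter() - start, error


seconds, error = timed_import("json")  # replace with the module suspected of being slow
print(f"{seconds:.3f}s, error={error!r}")
```

Run it in a fresh interpreter: a module that is already in sys.modules returns almost immediately and would hide the real cost.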

Avasam commented 2 years ago

That said, no idea if someone should ever directly import it 🤷‍♂️

Definitely not. It's not part of their public API, and importlib errors out because a global config dictionary hasn't been initialized properly and is missing a key.

66s is a lot. In any case, it's not blocking me from doing the work I wanted, as I just ended up using Pyright's stub generation instead (a bit less complete, but I manually filled in the rest). So I mainly wanted to raise this as a potential issue (if it happened for this lib, it could happen for others).

IIRC I got that error too (after less than 30s) [...] you could further trace [...]

I tried on a different machine and it went much faster. I'm fairly certain it depends on how many packages you have installed, since it scans the entirety of sys.path for certain packages (https://github.com/pyinstaller/pyinstaller/blob/develop/PyInstaller/utils/hooks/__init__.py#L1168).
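
To get a feel for how this kind of scan scales with the number of installed packages, the sketch below walks every installed distribution with the standard importlib.metadata module and times it. This is not PyInstaller's code, only an assumed approximation of a "distributions to packages" mapping; scan_installed_distributions is a hypothetical name.

```python
import time
from importlib import metadata


def scan_installed_distributions():
    """Map each installed distribution name to its top-level import names."""
    mapping = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
        except Exception:
            name = None  # distributions with unreadable metadata are skipped
        if not name:
            continue
        top_level = dist.read_text("top_level.txt") or ""
        mapping[name] = top_level.split()
    return mapping


start = time.perf_counter()
mapping = scan_installed_distributions()
print(f"{len(mapping)} distributions scanned in {time.perf_counter() - start:.2f}s")
```

Comparing the printed count and time on two machines gives a quick check of whether the number of installed packages plausibly explains the difference.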

ben9923 commented 2 years ago

Another solution could be adding a CLI flag (or an env var to minimize code changes?) for stubgen that controls the timeout, with a default of 30s :)
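
A minimal sketch of the env var variant, assuming a made-up variable name: STUBGEN_INSPECT_TIMEOUT and get_inspect_timeout are hypothetical names chosen for this example, not existing mypy settings. Missing, malformed or non-positive values fall back to the 30s default.

```python
import os

DEFAULT_TIMEOUT_SECONDS = 30.0
# Hypothetical variable name chosen for this sketch.
TIMEOUT_ENV_VAR = "STUBGEN_INSPECT_TIMEOUT"


def get_inspect_timeout(environ=None):
    """Return the timeout in seconds, falling back to the default on bad input."""
    environ = os.environ if environ is None else environ
    raw = environ.get(TIMEOUT_ENV_VAR)
    if raw is None:
        return DEFAULT_TIMEOUT_SECONDS
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_TIMEOUT_SECONDS
    # Reject zero, negative and NaN values.
    return value if value > 0 else DEFAULT_TIMEOUT_SECONDS
```

Reading the value in one small function keeps the code change minimal, which is the appeal of an env var over a new CLI flag.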

Avasam commented 1 year ago

I'll note that this performance issue shouldn't be a problem with PyInstaller anymore, thanks to https://github.com/pyinstaller/pyinstaller/pull/7943. But the original report about locust still stands.