python / cpython

The Python programming language
https://www.python.org

Import deadlock detection causes deadlock #82272

Open 9a4854ba-a0a7-4291-945b-b0004bf07198 opened 5 years ago

9a4854ba-a0a7-4291-945b-b0004bf07198 commented 5 years ago
BPO 38091
Nosy @brettcannon, @pitrou, @ericsnowcurrently, @rlamy, @phmc, @miss-islington, @vzhestkov
PRs
  • python/cpython#17518

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:
    ```python
    assignee = None
    closed_at = None
    created_at =
    labels = ['3.7', '3.8', 'type-bug', 'library', '3.9']
    title = 'Import deadlock detection causes deadlock'
    updated_at =
    user = 'https://github.com/rlamy'
    ```
    bugs.python.org fields:
    ```python
    activity =
    actor = 'vzhestkov'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation =
    creator = 'Ronan.Lamy'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38091
    keywords = ['patch']
    message_count = 3.0
    messages = ['351647', '363232', '397884']
    nosy_count = 7.0
    nosy_names = ['brett.cannon', 'pitrou', 'eric.snow', 'Ronan.Lamy', 'pconnell', 'miss-islington', 'vzhestkov']
    pr_nums = ['17518']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue38091'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9']
    ```

    There seems to be a race condition in importlib._bootstrap._ModuleLock that can cause a deadlock. The sequence of operations is as follows:

    The issue was found in pypy3 but it also affects all the recent CPython versions I tried. I can reliably reproduce the issue by adding an artificial delay to _ModuleLock.has_deadlock(), e.g. with this patch:

    diff --git a/Lib/test/test_import/__init__.py b/Lib/test/test_import/__init__.py
    index f167c84..7f7188e 100644
    --- a/Lib/test/test_import/__init__.py
    +++ b/Lib/test/test_import/__init__.py
    @@ -435,10 +435,15 @@ class ImportTests(unittest.TestCase):
                     os.does_not_exist
    
         def test_concurrency(self):
    +        def delay_has_deadlock(frame, event, arg):
    +            if event == 'call' and frame.f_code.co_name == 'has_deadlock':
    +                time.sleep(0.2)
    +
             sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'data'))
             try:
                 exc = None
                 def run():
    +                sys.settrace(delay_has_deadlock)
                     event.wait()
                     try:
                         import package
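
The tracing trick used in the patch above can be tried in isolation. The following is a minimal sketch, not the test from the PR: `has_deadlock` here is a hypothetical stand-in function rather than `_ModuleLock.has_deadlock`, and `DELAY_SECONDS`, `delay_target`, and `timed_call` are names made up for illustration. It shows that a trace function installed with `sys.settrace()` inside a thread receives a `'call'` event for each new Python frame in that thread, so sleeping there stretches the time spent entering the named function and widens any race window around it.

```python
import sys
import threading
import time

DELAY_SECONDS = 0.05          # assumption: the patch above sleeps for 0.2 s
TARGET_NAME = "has_deadlock"  # name of the function whose entry is delayed


def delay_target(frame, event, arg):
    # Global trace function: invoked with event 'call' for every new
    # Python frame created in the thread that installed it.
    if event == "call" and frame.f_code.co_name == TARGET_NAME:
        time.sleep(DELAY_SECONDS)
    return None  # no local (per-line) tracing is needed


def has_deadlock():
    # Hypothetical stand-in for the real method; it only needs the same name.
    return False


def timed_call(traced):
    """Run has_deadlock() in a fresh thread and return the elapsed seconds."""
    result = {}

    def run():
        if traced:
            sys.settrace(delay_target)  # affects only the current thread
        try:
            start = time.monotonic()
            has_deadlock()
            result["elapsed"] = time.monotonic() - start
        finally:
            sys.settrace(None)

    thread = threading.Thread(target=run)
    thread.start()
    thread.join()
    return result["elapsed"]


slow = timed_call(traced=True)
fast = timed_call(traced=False)
print(f"traced: {slow:.3f}s, untraced: {fast:.6f}s")
```

Because `sys.settrace()` is per-thread, the delay applies only to threads that install the trace function themselves, which is why the patch calls it inside `run()`.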

    miss-islington commented 4 years ago

    New changeset 6daa37fd42c5d5300172728e8b4de74fe0b319fc by Armin Rigo in branch 'master': bpo-38091: Import deadlock detection causes deadlock (GH-17518) https://github.com/python/cpython/commit/6daa37fd42c5d5300172728e8b4de74fe0b319fc

    90c69ba2-e16c-4e00-b29e-36bf0e14e605 commented 3 years ago

    I ported the fix from https://github.com/python/cpython/commit/6daa37fd42c5d5300172728e8b4de74fe0b319fc to the 3.6 and 3.8 versions shipped with SLE 15SP2 and openSUSE Tumbleweed, but the fix doesn't seem to help. I get deadlocks when running a salt-api process that manages salt-ssh systems under high workload. The service can deadlock within the first 5 minutes or after 3-60 minutes of running under the same workload, with roughly equal likelihood.

    Here is the part of py-bt I see each time:

    (gdb) py-bt
    Traceback (most recent call first):
      File "<frozen importlib._bootstrap>", line 107, in acquire
      File "<frozen importlib._bootstrap>", line 158, in __enter__
      File "<frozen importlib._bootstrap>", line 595, in _exec
      File "<frozen importlib._bootstrap>", line 271, in _load_module_shim
      File "<frozen importlib._bootstrap_external>", line 852, in load_module
      File "<frozen importlib._bootstrap_external>", line 1027, in load_module
      File "<frozen importlib._bootstrap_external>", line 1034, in _check_name_wrapper
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 4779, in _load_module
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1926, in _inner_load
        if self._load_module(name) and key in self._dict:
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 2193, in _load
      File "/usr/lib/python3.8/site-packages/salt/utils/lazy.py", line 99, in __getitem__
        if self._load(key):
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1283, in __getitem__
        func = super().__getitem__(item)
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1139, in __getitem__
        return self._dict[key + self.suffix]
      File "/usr/lib/python3.8/site-packages/salt/template.py", line 495, in check_render_pipe_str
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1428, in render
        f_noext,
      File "/usr/lib/python3.8/site-packages/salt/pillar/__init__.py", line 781, in __init__
    ...
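
When attaching gdb is not convenient, a similar per-thread view can be obtained from inside the process with the standard `faulthandler` module. This is only a sketch under assumptions: `blocked_worker` is a hypothetical stand-in for a thread stuck on a lock, not Salt or importlib code, and the dump shows Python-level frames only.

```python
import faulthandler
import tempfile
import threading

started = threading.Event()
release = threading.Event()


def blocked_worker():
    # Hypothetical stand-in for a thread that is stuck waiting on a lock.
    started.set()
    release.wait()


worker = threading.Thread(target=blocked_worker)
worker.start()
started.wait()  # make sure the worker is running before dumping

# faulthandler writes straight to the file descriptor, so a real file
# (anything with fileno()) is required.
with tempfile.TemporaryFile(mode="w+") as dump_file:
    faulthandler.dump_traceback(file=dump_file, all_threads=True)
    dump_file.seek(0)
    dump_text = dump_file.read()

release.set()  # unblock the worker so the example exits cleanly
worker.join()
print(dump_text)
```

For a service that hangs only occasionally, `faulthandler.dump_traceback_later()` or `faulthandler.register()` with a signal may be a better fit, since the dump can then be triggered once the hang has already occurred.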

    encukou commented 2 years ago

    I just saw a buildbot failure that might be related to the test added in #17518:

    test_concurrency (test.test_import.ImportTests.test_concurrency) ... Warning -- Uncaught thread exception: RuntimeError
    Exception in thread Thread-3 (run):
    Traceback (most recent call last):
      File "/home/dje/cpython-buildarea/3.x.edelsohn-sles-z/build/Lib/threading.py", line 1036, in _bootstrap_inner
        self.run()
      File "/home/dje/cpython-buildarea/3.x.edelsohn-sles-z/build/Lib/threading.py", line 973, in run
        self._target(*self._args, **self._kwargs)
      File "/home/dje/cpython-buildarea/3.x.edelsohn-sles-z/build/Lib/test/test_import/__init__.py", line 471, in run
        sys.settrace(None)
    RuntimeError: Cannot install a trace function while another trace function is being installed
    ok