Open 9a4854ba-a0a7-4291-945b-b0004bf07198 opened 5 years ago
There seems to be a race condition in importlib._bootstrap._ModuleLock that can cause a deadlock. The sequence of operations is as follows:
The issue was found in pypy3 but it also affects all the recent CPython versions I tried. I can reliably reproduce the issue by adding an artificial delay to _ModuleLock.has_deadlock(), e.g. with this patch:
diff --git a/Lib/test/test_import/__init__.py b/Lib/test/test_import/__init__.py
index f167c84..7f7188e 100644
--- a/Lib/test/test_import/__init__.py
+++ b/Lib/test/test_import/__init__.py
@@ -435,10 +435,15 @@ class ImportTests(unittest.TestCase):
             os.does_not_exist
     def test_concurrency(self):
+        def delay_has_deadlock(frame, event, arg):
+            if event == 'call' and frame.f_code.co_name == 'has_deadlock':
+                time.sleep(0.2)
+
         sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'data'))
         try:
             exc = None
             def run():
+                sys.settrace(delay_has_deadlock)
                 event.wait()
                 try:
                     import package
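The technique used in the patch above (a trace function that sleeps when a particular function is entered, to widen a race window) can be tried in isolation. The following is a minimal sketch, not part of the CPython test suite: `critical_step`, `delay_critical_step`, `worker`, and `DELAY_SECONDS` are hypothetical names chosen for illustration, and the delay value is arbitrary.

```python
import sys
import threading
import time

DELAY_SECONDS = 0.05  # arbitrary delay chosen for this sketch
calls_delayed = []    # records which threads were slowed down


def critical_step():
    # Hypothetical stand-in for the function whose timing we want to stretch.
    return "done"


def delay_critical_step(frame, event, arg):
    # Trace function: receives a 'call' event whenever a new Python frame
    # is entered in a thread where it has been installed.
    if event == "call" and frame.f_code.co_name == "critical_step":
        calls_delayed.append(threading.get_ident())
        time.sleep(DELAY_SECONDS)
    return None  # no line-level tracing is needed


def worker(results):
    # sys.settrace() only affects the calling thread, so it is installed
    # inside the worker rather than in the main thread.
    sys.settrace(delay_critical_step)
    try:
        start = time.monotonic()
        results.append(critical_step())
        results.append(time.monotonic() - start)
    finally:
        sys.settrace(None)


results = []
thread = threading.Thread(target=worker, args=(results,))
thread.start()
thread.join()
```

Because the trace function is installed per thread, only threads that call `sys.settrace()` themselves are slowed down; the main thread runs at normal speed.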
New changeset 6daa37fd42c5d5300172728e8b4de74fe0b319fc by Armin Rigo in branch 'master': bpo-38091: Import deadlock detection causes deadlock (GH-17518) https://github.com/python/cpython/commit/6daa37fd42c5d5300172728e8b4de74fe0b319fc
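For readers unfamiliar with this kind of check, here is a simplified model of deadlock detection by walking a chain of "lock owner, then the lock that owner is blocked on". It is a sketch under stated assumptions, not the `importlib._bootstrap` code: `would_deadlock`, `blocking_on`, and `owner_of` are hypothetical names, and threads and locks are represented by plain integers and strings. In a walk of this shape, a chain that loops without ever reaching the caller would never terminate unless visited threads are remembered. Whether that matches the exact failure mode in `_ModuleLock.has_deadlock()` is an assumption here.

```python
def would_deadlock(me, start_owner, blocking_on, owner_of):
    """Return True if waiting on a lock held by start_owner leads back to me.

    blocking_on: dict mapping thread id -> lock that thread is waiting for.
    owner_of:    dict mapping lock -> thread id currently holding it.
    (Both mappings are hypothetical structures for this sketch.)
    """
    seen = set()
    tid = start_owner
    while True:
        if tid == me:
            # The chain of waiters leads back to the caller: a real cycle.
            return True
        if tid in seen:
            # The chain loops without involving the caller. Without this
            # check the walk would spin forever on such a chain.
            return False
        seen.add(tid)
        lock = blocking_on.get(tid)
        if lock is None:
            # The owner is not waiting for anything, so it can make progress.
            return False
        tid = owner_of.get(lock)
        if tid is None:
            # The lock it waits for is currently free.
            return False


# Thread 1 wants lock "B" (held by thread 2); thread 2 waits on "A" (held by 1).
cycle = would_deadlock(1, 2, {2: "A"}, {"A": 1, "B": 2})

# Thread 2 holds "B" and is not blocked on anything.
no_wait = would_deadlock(1, 2, {}, {"B": 2})

# Thread 2 is recorded as blocked on a lock it owns itself; the chain loops
# on thread 2 and never reaches thread 1.
self_loop = would_deadlock(1, 2, {2: "A"}, {"A": 2})
```

The `seen` set is the design choice that matters: it bounds the walk by the number of distinct threads, at the cost of a small allocation per check.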
I ported the fix from https://github.com/python/cpython/commit/6daa37fd42c5d5300172728e8b4de74fe0b319fc to the 3.6 and 3.8 versions shipped with SLE 15SP2 and openSUSE Tumbleweed, but it does not seem to help.
I get deadlocks when running a salt-api process that manages salt-ssh systems under high workload. With the same workload, the service is about equally likely to deadlock within the first 5 minutes or after 3-60 minutes of running.
Here is the part of the py-bt output I see each time:
(gdb) py-bt
Traceback (most recent call first):
  File "<frozen importlib._bootstrap>", line 107, in acquire
  File "<frozen importlib._bootstrap>", line 158, in __enter__
  File "<frozen importlib._bootstrap>", line 595, in _exec
  File "<frozen importlib._bootstrap>", line 271, in _load_module_shim
  File "<frozen importlib._bootstrap_external>", line 852, in load_module
  File "<frozen importlib._bootstrap_external>", line 1027, in load_module
  File "<frozen importlib._bootstrap_external>", line 1034, in _check_name_wrapper
  File "/usr/lib/python3.8/site-packages/salt/loader.py", line 4779, in _load_module
  File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1926, in _inner_load
    if self._load_module(name) and key in self._dict:
  File "/usr/lib/python3.8/site-packages/salt/loader.py", line 2193, in _load
  File "/usr/lib/python3.8/site-packages/salt/utils/lazy.py", line 99, in __getitem__
    if self._load(key):
  File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1283, in __getitem__
    func = super().__getitem__(item)
  File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1139, in __getitem__
    return self._dict[key + self.suffix]
  File "/usr/lib/python3.8/site-packages/salt/template.py", line 495, in check_render_pipe_str
  File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1428, in render
    f_noext,
  File "/usr/lib/python3.8/site-packages/salt/pillar/__init__.py", line 781, in __init__
...
I just saw a buildbot failure that might be related to the test added in #17518:
test_concurrency (test.test_import.ImportTests.test_concurrency) ... Warning -- Uncaught thread exception: RuntimeError
Exception in thread Thread-3 (run):
Traceback (most recent call last):
  File "/home/dje/cpython-buildarea/3.x.edelsohn-sles-z/build/Lib/threading.py", line 1036, in _bootstrap_inner
    self.run()
  File "/home/dje/cpython-buildarea/3.x.edelsohn-sles-z/build/Lib/threading.py", line 973, in run
    self._target(*self._args, **self._kwargs)
  File "/home/dje/cpython-buildarea/3.x.edelsohn-sles-z/build/Lib/test/test_import/__init__.py", line 471, in run
    sys.settrace(None)
RuntimeError: Cannot install a trace function while another trace function is being installed
ok
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['3.7', '3.8', 'type-bug', 'library', '3.9']
title = 'Import deadlock detection causes deadlock'
updated_at =
user = 'https://github.com/rlamy'
```
bugs.python.org fields:
```python
activity =
actor = 'vzhestkov'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'Ronan.Lamy'
dependencies = []
files = []
hgrepos = []
issue_num = 38091
keywords = ['patch']
message_count = 3.0
messages = ['351647', '363232', '397884']
nosy_count = 7.0
nosy_names = ['brett.cannon', 'pitrou', 'eric.snow', 'Ronan.Lamy', 'pconnell', 'miss-islington', 'vzhestkov']
pr_nums = ['17518']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue38091'
versions = ['Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9']
```