Open f82f2f79-bfff-44c7-86e7-c93f7a9fd1fe opened 2 years ago
now that the getaddrinfo lock has been removed on all platforms the numeric only host resolve in asyncio could be moved back into BaseEventLoop.getaddrinfo
Could you provide more context for the proposed change?
hello, it's actually a bit of a round about context, but it was brought up on a tornado issue where I was attempting to port the asyncio optimization to tornado: https://github.com/tornadoweb/tornado/issues/3113#issuecomment-1041019287
I think it would be better to use this AI_NUMERICHOST | AI_NUMERICSERV optimization from trio everywhere instead
To summarize the justification, this patch does two things: it moves an optimization from create_connection to getaddrinfo, which makes it apply to more callers (including Tornado), and it makes the code simpler and less redundant (net reduction of 47 non-test lines in the patch).
As far as we can tell, the reason it wasn't done this way in the first place is that at the time getaddrinfo held a global lock on some platforms, but this is no longer true. If there's still some locking in or around getaddrinfo on some platforms (or some libc implementations), this patch would be a bad idea. Is there a good way to test for that? I suppose we could set up a deliberately-slow DNS server and try to call getaddrinfo with AI_NUMERICHOST while another thread is blocked talking to that server, but that seems like a lot of test infrastructure to build out.
On MacOS in 2015, getaddrinfo was found to be much slower than inet_pton. Unless that's changed, this patch would be a performance regression on that platform. Data and benchmark script in https://groups.google.com/g/python-tulip/c/-SFI8kkQEj4/m/m1-oCMSABgAJ
running the benchmark script https://gist.github.com/graingert/5097c6be997ab2a201b48b858524e163 on my Linux machine I get
before:
/home/graingert/projects/cpython-main/../demo/demo.py:28: DeprecationWarning: There is no current event loop
loop = get_event_loop()
host port family type proto secs
1.2.3.4 1 2 1 6 9.4
1.2.3.4 1 0 1 6 9.7
1.2.3.4 1 0 2 17 10.9
1.2.3.4 1 0 1 0 14.5
1.2.3.4 1 0 2 0 15.9
1.2.3.4 1 0 0 0 15.8
::3 1 10 1 6 13.0
::3 1 0 1 6 13.3
Traceback (most recent call last):
File "/home/graingert/projects/cpython-main/../demo/demo.py", line 58, in <module>
lookup(*info)
^^^^^^^^^^^^^
File "/home/graingert/projects/cpython-main/../demo/demo.py", line 32, in lookup
return loop.run_until_complete(loop.getaddrinfo(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/graingert/projects/cpython-main/Lib/asyncio/base_events.py", line 650, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/home/graingert/projects/cpython-main/Lib/asyncio/base_events.py", line 864, in getaddrinfo
return await self.run_in_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/graingert/projects/cpython-main/Lib/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/graingert/projects/cpython-main/Lib/socket.py", line 961, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -2] Name or service not known
after:
/home/graingert/projects/cpython/../demo/demo.py:28: DeprecationWarning: There is no current event loop
loop = get_event_loop()
host port family type proto secs
1.2.3.4 1 2 1 6 2.4
1.2.3.4 1 0 1 6 2.4
1.2.3.4 1 0 2 17 2.4
1.2.3.4 1 0 1 0 2.4
1.2.3.4 1 0 2 0 2.3
1.2.3.4 1 0 0 0 2.7
::3 1 10 1 6 2.3
::3 1 0 1 6 2.3
Traceback (most recent call last):
File "/home/graingert/projects/cpython/../demo/demo.py", line 58, in <module>
lookup(*info)
^^^^^^^^^^^^^
File "/home/graingert/projects/cpython/../demo/demo.py", line 32, in lookup
return loop.run_until_complete(loop.getaddrinfo(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/graingert/projects/cpython/Lib/asyncio/base_events.py", line 591, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/home/graingert/projects/cpython/Lib/asyncio/base_events.py", line 814, in getaddrinfo
return await self.run_in_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/graingert/projects/cpython/Lib/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/graingert/projects/cpython/Lib/socket.py", line 962, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -2] Name or service not known
on macos:
before https://github.com/graingert/cpython/runs/6528334512?check_suite_focus=true#step:7:1
Run ./python.exe demo.py
/Users/runner/work/cpython/cpython/demo.py:28: DeprecationWarning: There is no current event loop
loop = get_event_loop()
host port family type proto secs
1.2.3.4 1 2 1 6 54.6
1.2.3.4 1 0 1 6 55.0
1.2.3.4 1 0 2 17 52.8
1.2.3.4 1 0 1 0 52.0
1.2.3.4 1 0 2 0 51.2
1.2.3.4 1 0 0 0 54.7
::3 1 30 1 6 52.6
::3 1 0 1 6 52.6
::3%lo0 1 30 1 6 55.5
after https://github.com/graingert/cpython/runs/6528182360?check_suite_focus=true#step:7:11
Run ./python.exe demo.py
/Users/runner/work/cpython/cpython/demo.py:28: DeprecationWarning: There is no current event loop
loop = get_event_loop()
host port family type proto secs
1.2.3.4 1 2 1 6 21.3
1.2.3.4 1 0 1 6 21.1
1.2.3.4 1 0 2 17 21.1
1.2.3.4 1 0 1 0 21.1
1.2.3.4 1 0 2 0 21.0
1.2.3.4 1 0 0 0 22.1
::3 1 30 1 6 20.9
::3 1 0 1 6 20.9
::3%lo0 1 30 1 6 23.8
benchmarking _ipaddr_info
https://github.com/graingert/cpython/commit/dee834cb3c05c1b57948c961700ce3c5ade102d3 seems to show that it's about the same speed as getaddrinfo with AI_NUMERICHOST:
https://github.com/graingert/cpython/runs/6528895181?check_suite_focus=true
Run ./python.exe demo.py
/Users/runner/work/cpython/cpython/demo.py:28: DeprecationWarning: There is no current event loop
loop = get_event_loop()
host port family type proto secs
1.2.3.4 1 2 1 6 19.2
1.2.3.4 1 0 1 6 19.1
1.2.3.4 1 0 2 17 19.2
1.2.3.4 1 0 1 0 19.0
1.2.3.4 1 0 2 0 18.9
1.2.3.4 1 0 0 0 18.1
::3 1 30 1 6 19.0
::3 1 0 1 6 20.2
::3%lo0 1 30 1 6 19.1
@bdarnell
Next action: Review the existing PR and determine next steps.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = None closed_at = None created_at =
labels = ['type-feature', 'expert-asyncio']
title = 'use AI_NUMERICHOST | AI_NUMERICSERV to skip getaddrinfo thread in asyncio'
updated_at =
user = 'https://github.com/graingert'
```
bugs.python.org fields:
```python
activity =
actor = 'Ben.Darnell'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['asyncio']
creation =
creator = 'graingert'
dependencies = []
files = []
hgrepos = []
issue_num = 46824
keywords = ['patch']
message_count = 5.0
messages = ['413699', '413701', '413702', '415510', '415511']
nosy_count = 4.0
nosy_names = ['asvetlov', 'Ben.Darnell', 'yselivanov', 'graingert']
pr_nums = ['31497']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue46824'
versions = []
```