python / asyncio

asyncio historical repository
https://docs.python.org/3/library/asyncio.html
1.04k stars 177 forks source link

socket.getservbyname() in _ipaddr_info could block #422

Closed dwfreed closed 8 years ago

dwfreed commented 8 years ago

354 added a call to socket.getservbyname() in _ipaddr_info() to resolve service names into port numbers. However, local system configuration could result in getservbyname() blocking for some time, especially if the system is configured to query something like NIS or use the old hesiod protocol to look up this information. Even in the default configuration, if the disk holding /etc/services is unhappy with life, attempting to read /etc/services could take some time. This would bring the whole event loop to a grinding halt. getaddrinfo() (usually) will handle non-numeric port strings so long as AI_NUMERICSERV is not used, so it would probably be best to just let getaddrinfo() handle it.

(While looking into this, I discovered the Python getservbyname() implementation is not thread-safe, but that's a different bug...)

gvanrossum commented 8 years ago

@ajdavis Any thoughts on this? Didn't this code change a few times? Personally I think getservbyname() is a silly API and apps should just pass integers around, and I wish PEP 3156 had specified that. For apps that pass ints there's no worry, since getservbyname() is only called when int(port) fails. Finally, how do we know that getaddrinfo() doesn't have the same blocking behavior?

dwfreed commented 8 years ago

@gvanrossum getaddrinfo() is run in an executor, so whether it blocks or not, the event loop isn't affected.

gvanrossum commented 8 years ago

Good point. I see no problem with deferring to getaddrinfo() if the port is not numeric.

--Guido (mobile)

ajdavis commented 8 years ago

This discussion is getting a little circular. =) Here's more background than you asked for:

Yes, "getaddrinfo() is run in an executor, so whether it blocks or not, the event loop isn't affected." That's true, but until CPython issues 25924 and 26406 were fixed, getaddrinfo calls competed for a special getaddrinfo lock on Mac OS or BSD, so they could block each other. Hence, we've been trying to avoid using getaddrinfo to check or transform already-resolved hosts in asyncio. If the user tells asyncio to connect to "127.0.0.1", we shouldn't block on the getaddrinfo lock waiting to check whether that's a numeric IP or not, we should check some other way. That work was originally in https://github.com/python/asyncio/pull/302 with some updates in https://github.com/python/asyncio/pull/349, https://github.com/python/asyncio/pull/354, and https://github.com/python/asyncio/pull/357.

I've argued recently we should revert all this now that the latest Pythons have removed the getaddrinfo lock, but we didn't resolve that argument.

Ending the background recap and focusing on the discussion at hand:

I can see a simple change in _ensure_resolved in base_events.py that would, as you say, use getaddrinfo if port is not numeric.

ajdavis commented 8 years ago

Justification, for future readers' sake: