aio-libs / aiohttp

Asynchronous HTTP client/server framework for asyncio and Python
https://docs.aiohttp.org
Other
15.12k stars 2.02k forks source link

Regression: Windows cannot connect to localhost without network #5357

Open pwuertz opened 3 years ago

pwuertz commented 3 years ago

On a Windows systems without network connection, aiohttp fails to connect to localhost services. This problem is present in aiohttp 3.7, not in aiohttp 3.6.

I think I traced the problem to 9bce9c2d462666ad83111c0966358c2e274bc89b, which added the flag AI_ADDRCONFIG to the getaddrinfo call.

On Windows10, Python 3.8, no network connection, this will throw:

import socket
socket.getaddrinfo('localhost', 1234, 0, socket.SOCK_STREAM, 0, socket.AI_ADDRCONFIG)

I can also reproduce this fail in a windows docker container started with --network none.

The documentation for this flag says:

If the AI_ADDRCONFIG bit is set, getaddrinfo will resolve only if a global address is configured.
If AI_ADDRCONFIG flag is specified,
IPv4 addresses shall be returned only if an IPv4 address is configured on the local system, and
IPv6 addresses shall be returned only if an IPv6 address is configured on the local system.
The IPv4 or IPv6 loopback address is not considered a valid global address.

So if localhost is your only IP address, AI_ADDRCONFIG is expected to fail even for localhost. It does seem to resolve on linux though.

webknjaz commented 3 years ago

So if localhost is your only IP address, AI_ADDRCONFIG is expected to fail even for localhost. It does seem to resolve on linux though.

Sounds like the Windows semantic differs from *NIX and may need to be special-cased.

webknjaz commented 3 years ago

cc @derlih ^

pwuertz commented 3 years ago

Here is the AI_ADDRCONFIG documentation for Linux for comparison. In my understanding, both documentations describe the same behavior for AI_ADDRCONFIG. If loopback is your only network interface, getaddrinfo should not resolve anything at all, not even 'localhost'.

Linux resolving 'localhost' anyways seems to be undocumented behavior (i.e. shouldn't be relied on?).

derlih commented 3 years ago

@pwuertz Currently I don't have a windows machine. Can you post a stacktrace for

import socket
socket.getaddrinfo('localhost', 1234, 0, socket.SOCK_STREAM, 0, socket.AI_ADDRCONFIG)

?

webknjaz commented 3 years ago

@derlih the linked doc suggests that the behavior may be the same for POSIX.

derlih commented 3 years ago

@webknjaz Today I've read some discussions related to similar issue in other projects. For example same function for Chromium uses some ifdefs to handle different behavior on different platforms. https://github.com/adobe/chromium/blob/cfe5bf0b51b1f6b9fe239c2a3c2f2364da9967d7/net/base/host_resolver_proc.cc#L122-L241

May be we should write the resolver code in the same way?

derlih commented 3 years ago

For Firefox https://github.com/servo/nspr/blob/8219c5c5137b94ae93cfb4afc1dc10a94384e1c6/pr/src/misc/prnetdb.c#L2012-L2035

pwuertz commented 3 years ago

Would you accept a PR for adding the localhost logic @derlih referenced in the Firefox codebase?

webknjaz commented 3 years ago

@pwuertz yes, we'll be open to reviewing such a change. But bear in mind that it's recommended to also cover this case with tests to prevent regressions and make it easier to reason about why this is needed in the future. The motivation explanation of "why" is also welcome as it'll help us preserve the hack across refactorings.

pwuertz commented 3 years ago

I transferred the solution used by Firefox to aiohttp here: https://github.com/pwuertz/aiohttp/commit/ee631e93720923a241417646373a885b1ca3a9e9

Unfortunately I cannot test it because I can't figure out to build aiohttp. After some searching I found a somewhat hidden make install-dev directive in some closed issue that seems to be required, but I don't have make on Windows.

Adding a test for this sounds difficult though. I guess one would have to add a testing step where a new Windows VM or Docker Image is started up without network, move aiohttp & tests into that, execute a new offline test subset. I don't know enough about the aiohttp CI infrastructure to do this.

derlih commented 3 years ago

Currently we are using GitHub Actions for CI. IMO for this test it makes sense to create a test_resolver_functional.py under tests folder. For testing this issue I would give Windows Docker a try. Looks like GitHub Actions Windows VMs already have a Windows Docker. And there is a Windows Python Docker official image.

So in theory it should work all together :-)

jameswilliams-modality commented 2 months ago

This turns out to be a critical issue when trying to use https://github.com/openfga/python-sdk locally with no internet as the connection between the python sdk and the database is via aiohttp. This issue has been open for over three years now, can this be resolved? I'm happy to take a stab at resolving it unless someone else more familiar with the contribution and build process can do so? I need to resolve this in a matter of days ideally.

Dreamsorcerer commented 2 months ago

I'm happy to take a stab at resolving it unless someone else more familiar with the contribution and build process can do so?

Yes, please go ahead. Note that the linked code at the top is in the ThreadedResolver, so a workaround is probably to just install aiodns (or aiohttp[speedups]). The AsyncResolver was disabled at the time of this issue, but has been enabled again in 3.10.