Open mgorny opened 4 years ago
@jacobtomlinson I'm not sure what your priorities are like these days. But would this be easy for you to resolve?
Looking through the log, some of these issues appear to be related to IPv6 rather than to a lack of internet connection.
@mgorny could you share a little more about your network setup, what interfaces you have and what IP addresses any active interfaces have when you ran these tests?
@mrocklin fixed some issues with importing distributed without a network here: https://github.com/dask/distributed/pull/3991. Perhaps a similar solution, a try/except that defaults to 127.0.0.1, would also work here.
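For illustration, a minimal sketch of that kind of fallback. The helper name `get_ip_with_fallback`, the probe address 8.8.8.8, and the port are assumptions made up for this example; this is not the code from distributed or from the linked PR.

```python
import socket


def get_ip_with_fallback(host="8.8.8.8", port=80, default="127.0.0.1"):
    """Hypothetical helper: guess the local IPv4 address, or return a default.

    Connecting a UDP socket sends no packets; it only asks the kernel
    which local address would be used to reach `host`.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((host, port))
        return sock.getsockname()[0]
    except OSError:
        # No route to the probe address (e.g. only `lo` is up):
        # fall back to the loopback address instead of failing.
        return default
    finally:
        sock.close()


print(get_ip_with_fallback())
```

In a namespace with only the loopback interface, the `connect` call would be expected to raise `OSError` (network unreachable), so the helper returns the default.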
@mgorny could you share a little more about your network setup, what interfaces you have and what IP addresses any active interfaces have when you ran these tests?
I'm running the tests inside a network namespace with only the lo interface set up, i.e. roughly:
$ sudo unshare -n bash
# ifconfig lo up
# ifconfig lo
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
The hostname is also reset to localhost.
Hi,
I was trying to get Distributed 202101 to run on Debian's testing infrastructure and I hit this or a closely related problem. Our test runners have a working IPv6 loopback but don't have a routable IPv6 address or an IPv6 address for the default hostname.
has_ipv6 checks that ipv6 is enabled on the loopback interface, but then in test_comms.py there's this:
EXTERNAL_IP4 = get_ip()
if has_ipv6():
    with warnings.catch_warnings(record=True):
        warnings.simplefilter("always")
        EXTERNAL_IP6 = get_ipv6()
get_ipv6() tries to open a datagram socket to "2001:4860:4860::8888", but that seems to fail because there is no route available to reach 2001::.
On the Debian test systems, gethostname() returns a name that has no IPv6 address attached to it, so this block also fails, with socket.gaierror: [Errno -5] No address associated with hostname:
addr_info = socket.getaddrinfo(
    socket.gethostname(), port, family, socket.SOCK_DGRAM, socket.IPPROTO_UDP
)[0]
My first temptation is to put a call to get_ipv6() into has_ipv6(), so that it checks whether the host has a routable IPv6 address.
Though I'm wondering if has_ipv6() should be split into a loopback-only check and a separate check for a routable address.
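A rough sketch of what such a split might look like. The names `has_ipv6_loopback` and `has_routable_ipv6` are hypothetical and do not exist in distributed; the probe address is the one mentioned above.

```python
import socket


def has_ipv6_loopback():
    """Hypothetical check: can an IPv6 socket bind to ::1?"""
    if not socket.has_ipv6:
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as sock:
            sock.bind(("::1", 0))
        return True
    except OSError:
        return False


def has_routable_ipv6(probe_host="2001:4860:4860::8888", probe_port=80):
    """Hypothetical check: does the kernel have a route to an external IPv6 host?

    Connecting a UDP socket sends nothing; it fails with OSError when
    no route to the probe address exists.
    """
    if not socket.has_ipv6:
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
            sock.connect((probe_host, probe_port))
        return True
    except OSError:
        return False


print("loopback:", has_ipv6_loopback(), "routable:", has_routable_ipv6())
```

Tests that only need ::1 could then be gated on the first check, and tests that need an external address on the second.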
My other idea would be to extend the failover code in _get_ip to try gethostname() and, if that doesn't work, fail over to the hostname "localhost" or "ip6-localhost".
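As a sketch only, the failover could take roughly this shape. `resolve_with_failover` and its parameters are hypothetical names for illustration and are not the actual `_get_ip` code; the final literal-address fallback is an added assumption.

```python
import socket


def resolve_with_failover(family=socket.AF_INET6, port=0, candidates=None):
    """Hypothetical helper: resolve the first candidate hostname that works."""
    if candidates is None:
        candidates = [socket.gethostname(), "localhost", "ip6-localhost"]
    for name in candidates:
        try:
            infos = socket.getaddrinfo(
                name, port, family, socket.SOCK_DGRAM, socket.IPPROTO_UDP
            )
        except OSError:
            # socket.gaierror is a subclass of OSError: try the next name.
            continue
        if infos:
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string.
            return infos[0][4][0]
    # Assumption: if nothing resolves, use the loopback literal.
    return "::1" if family == socket.AF_INET6 else "127.0.0.1"


print(resolve_with_failover())
```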
Do you have any thoughts which would be better?
What happened:
Tests fail immediately when the system does not have Internet access:
What you expected to happen:
I expected at least a subset of the tests to be usable in network-constrained environments.
Anything else we need to know?:
It seems that replacing has_ipv6() with an explicit False helps it get past the initial error. However, the majority of tests still fail because of network failures, e.g.:
Even if I skip tests that fail immediately, a lot of tests error out during teardown:
Environment: