n0-computer / iroh

A toolkit for building distributed applications
https://iroh.computer
Apache License 2.0

test_icmp_probe_eu_derper flaky on windows #2069

Closed rklaehn closed 8 months ago

rklaehn commented 8 months ago

It seems that on Windows we sometimes can't do this probe. The derper looks healthy, and if it were down all the other tests would fail as well. We have seen this not just in tests but also in real-world use.

Possibly related: https://github.com/n0-computer/dumbpipe/issues/17

flub commented 8 months ago

See also the iroh-net/src/dns.rs tests in #2073.

The issue here seems to be that DNS resolution sometimes fails completely on Windows. We have no idea why yet, but without DNS resolution nothing is going to work.

flub commented 8 months ago

https://github.com/n0-computer/iroh/actions/runs/8245723076/job/22550253677

The problem is that the system config chooses the fec0:0:0:ffff::3 (and ...::1 and ...::2) DNS servers for some reason. But IPv6 is not routable on the machine, probably because it's a dual-stack machine with no IPv6 connectivity, like the vast majority of systems in the world.

On our CI machine it also tries an IPv4 server fairly quickly afterwards, but the whole DNS lookup has a 1s timeout, so it fails before we get a response from a working server, probably because the resolver spends too much time on the broken server.
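
To make the timeout interaction concrete, here is a toy model, not the actual resolver logic. It assumes a resolver that tries nameservers strictly in order and waits a fixed time on each before moving on; the `Server` type, the `lookup_time` function, and the per-attempt wait are all hypothetical, and only the 1s overall limit comes from the description above.

```rust
use std::time::Duration;

/// One configured nameserver in this toy model (hypothetical type).
struct Server {
    /// Time until the server answers, or `None` if it never answers
    /// (for example an unroutable IPv6 address).
    answers_after: Option<Duration>,
}

/// Simulates a resolver that tries `servers` in order, waiting at most
/// `per_attempt` on each one, and that gives up once `budget` is used up.
/// Returns the total elapsed time at the first answer, or `None` on failure.
fn lookup_time(servers: &[Server], per_attempt: Duration, budget: Duration) -> Option<Duration> {
    let mut elapsed = Duration::ZERO;
    for server in servers {
        match server.answers_after {
            // The server answers within the per-attempt wait.
            Some(latency) if latency <= per_attempt => {
                let total = elapsed + latency;
                return if total <= budget { Some(total) } else { None };
            }
            // Dead or too slow: the whole per-attempt wait is wasted.
            _ => elapsed += per_attempt,
        }
        // The overall lookup deadline has passed.
        if elapsed >= budget {
            return None;
        }
    }
    None
}
```

In this model, with three dead servers ahead of a working one and an assumed 400ms wait per attempt, the 1s budget runs out before the working server is ever asked, even though that server on its own would answer well within the limit.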

flub commented 8 months ago

So it seems fec0:0:0:ffff::1 (and 2 & 3) are deprecated site-local anycast addresses that Microsoft DNS servers might listen on. How they end up on those Windows boxes I still have no clue.
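
For illustration, a minimal sketch of how such addresses could be recognised and skipped, using only the standard library. This is not necessarily the fix adopted in iroh; `is_deprecated_site_local` and `usable_nameservers` are hypothetical helper names. The check relies on the site-local prefix being fec0::/10, which RFC 3879 deprecated.

```rust
use std::net::{IpAddr, Ipv6Addr};

/// Returns true if `addr` lies in the deprecated site-local prefix fec0::/10.
fn is_deprecated_site_local(addr: &Ipv6Addr) -> bool {
    // Keep the top 10 bits of the first 16-bit segment and compare to the prefix.
    (addr.segments()[0] & 0xffc0) == 0xfec0
}

/// Hypothetical helper: drops site-local IPv6 nameservers from a configured
/// list and keeps every other entry in its original order.
fn usable_nameservers(configured: &[IpAddr]) -> Vec<IpAddr> {
    configured
        .iter()
        .copied()
        .filter(|ip| match ip {
            IpAddr::V6(v6) => !is_deprecated_site_local(v6),
            IpAddr::V4(_) => true,
        })
        .collect()
}
```

Applied to a list containing fec0:0:0:ffff::1 to ::3 followed by an IPv4 server, only the IPv4 entry would remain, so a resolver would not spend its time budget on servers it cannot reach.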

flub commented 8 months ago

Ah, I can find the resolvers configured on the host using Get-DnsClientServerAddress in PowerShell.