ocaml / infrastructure

Wiki to hold the information about the machine resources available to OCaml.org

Ipv6 routing not working for opam.ocaml.org #42

Open reynir opened 1 year ago

reynir commented 1 year ago

Hello! I am unable to run opam init on an ipv6-only host due to a routing issue to opam.ocaml.org. The domain resolves to 2001:bc8:5080:8e02::1 and 2001:bc8:1d80:4600::1. When I try to run tracepath it seems to go into a loop:

$ tracepath  2001:bc8:1d80:4600::1
[...SNIP...]
 9:  2001:1900:5:2:2:0:11c:6b2                            51.250ms asymm  8 
10:  2001:bc8:1d00:1::1                                   51.624ms asymm 11 
11:  2001:bc8:1d10:4::1                                   52.090ms 
12:  2001:bc8:1d10:4::6                                   49.801ms asymm 11 
13:  2001:bc8:1d10:4::1                                   52.756ms asymm 12 
14:  2001:bc8:1d10:4::6                                   51.107ms asymm 11 
15:  2001:bc8:1d10:4::1                                   52.004ms asymm 11 
16:  2001:bc8:1d10:4::6                                   49.985ms asymm 11 
17:  2001:bc8:1d10:4::1                                   52.133ms asymm 12 
18:  2001:bc8:1d10:4::6                                   51.931ms asymm 11 
19:  2001:bc8:1d10:4::1                                   51.175ms asymm 11 
20:  2001:bc8:1d10:4::6                                   50.132ms asymm 11 
21:  2001:bc8:1d10:4::1                                   51.262ms asymm 12 
22:  2001:bc8:1d10:4::6                                   50.977ms asymm 11 
23:  2001:bc8:1d10:4::1                                   51.414ms asymm 11 
24:  2001:bc8:1d10:4::6                                   51.919ms asymm 11 
25:  2001:bc8:1d10:4::1                                   51.293ms asymm 12 
26:  2001:bc8:1d10:4::6                                   51.969ms asymm 11 
27:  2001:bc8:1d10:4::1                                   52.046ms asymm 11 
28:  2001:bc8:1d10:4::6                                   50.070ms asymm 11 
29:  2001:bc8:1d10:4::1                                   53.301ms asymm 12 
30:  2001:bc8:1d10:4::6                                   51.972ms asymm 11 
     Too many hops: pmtu 1500
     Resume: pmtu 1500
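
A loop like the one in the output above can also be detected mechanically. The following is a minimal sketch, not part of the original report: the function name `find_looping_hops` is hypothetical, and the parsing assumes the "N:  address  ..." line layout that tracepath printed here.

```python
import re

# Matches tracepath hop lines such as " 9:  2001:db8::1   51.250ms asymm  8".
HOP_LINE = re.compile(r"^\s*(\d+):\s+(\S+)")


def find_looping_hops(tracepath_output: str) -> list[str]:
    """Return addresses seen at more than one hop number, in first-seen order."""
    hops_by_address: dict[str, set[int]] = {}
    for line in tracepath_output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # skips "[...SNIP...]", "Too many hops" and "Resume" lines
        hop, address = int(match.group(1)), match.group(2)
        if address == "no":  # tracepath prints "no reply" for silent hops
            continue
        # Track hop numbers, not line counts: tracepath may legitimately
        # print the same hop number twice for one router.
        hops_by_address.setdefault(address, set()).add(hop)
    return [addr for addr, hops in hops_by_address.items() if len(hops) > 1]
```

Fed the output above, this would report 2001:bc8:1d10:4::1 and 2001:bc8:1d10:4::6, the two routers that bounce the packet between each other.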

mtelvers commented 1 year ago

@avsm The IPv6 addresses of these two machines appear to have changed. The current values are 2001:bc8:5080:a405::1 (opam-4) and 2001:bc8:1d80:4a00::1 (opam-5). Please can you update the DNS?

avsm commented 1 year ago

Now updated, and worryingly that went quite a long time without being noticed. It might be worth having a specific healthcheck somewhere for an IPv6-specific connection @mtelvers. Quite hard to spot this manually unless in an IPv6 only network (the only one of those I have is Mythic Beasts rPi hosting)

mtelvers commented 1 year ago

@avsm The IPv6 addresses appear to have changed again. opam-4 is now 2001:bc8:5080:a405::1 and opam-5 is now 2001:bc8:1d80:4a00::1.

hannesm commented 1 year ago

Uhm, I don't quite understand your setup... But can't you statically configure IPv6 addresses, and advertise them in DNS?

avsm commented 1 year ago

@hannesm there's something wrong with the new Scaleway setup -- those advertised addresses shouldn't be changing.

avsm commented 1 year ago

@mtelvers the AAAA records already matched the ones you posted. They don't appear to have changed again -- what records are you seeing?

mtelvers commented 1 year ago

@avsm I used https://www.ssllabs.com/ssltest/analyze.html?d=opam.ocaml.org.

This was suggested by @hannesm on https://github.com/ocaml/opam/issues/5550#issuecomment-1547326886

mtelvers commented 1 year ago

[Image attachment]

mtelvers commented 1 year ago

@avsm I can also see the wrong address via

$ nslookup opam.ocaml.org 8.8.8.8
Server:     8.8.8.8
Address:    8.8.8.8#53

Non-authoritative answer:
Name:   opam.ocaml.org
Address: 151.115.76.159
Name:   opam.ocaml.org
Address: 51.158.232.133
Name:   opam.ocaml.org
Address: 2001:bc8:1d80:4600::1
Name:   opam.ocaml.org
Address: 2001:bc8:5080:8e02::1

reynir commented 1 year ago

It took me a while to figure out where I could report my issue. I first asked on #ocaml on libera.chat, and eventually I found this repository (that I have encountered before through #27).

I usually have ipv4 (often only ipv4), but I had spun up a virtual machine and chose to save a few cents by going ipv6 only. This is how I found out. I also tried to work around the issue by using the git repository, and by running an opam-mirr. However, because GitHub does not support ipv6 (another possible explanation for why no one on an ipv6-only connection has reported this problem), the opam-repository git repository was not accessible, and neither was a large portion of the source code and archives, including the OCaml compiler itself. All in all opam seems a little brittle on ipv6. This probably deserves a separate issue.

As for testing, I could imagine that using OPAMFETCH with a curl, wget, fetch, ... invocation that forces ipv6-only should do on a dual-stack host, thereby not requiring an ipv6-only host for performing the test.
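
As a rough illustration of that idea, the sketch below sets OPAMFETCH to a curl command restricted to IPv6 and runs opam with it. It rests on two assumptions that should be checked against the installed versions: that opam substitutes `%{url}%` and `%{out}%` in a custom OPAMFETCH command, and that curl's `-6` flag limits it to IPv6 addresses. The helper names are hypothetical.

```python
import os
import subprocess

# Assumption: opam replaces %{url}% and %{out}% in a custom OPAMFETCH command.
# Assumption: curl -6 makes curl use IPv6 addresses only.
IPV6_ONLY_FETCH = (
    "curl -6 --fail --location --silent --show-error --output %{out}% %{url}%"
)


def ipv6_only_env(base_env=None):
    """Return a copy of the environment with OPAMFETCH forcing IPv6 downloads."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPAMFETCH"] = IPV6_ONLY_FETCH
    return env


def run_opam_update_over_ipv6() -> int:
    """Run `opam update` with the IPv6-only fetch command; return its exit code."""
    return subprocess.run(["opam", "update"], env=ipv6_only_env()).returncode
```

On a dual-stack host, a non-zero exit code from `run_opam_update_over_ipv6()` while a plain `opam update` succeeds would point at an IPv6-specific problem.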

mtelvers commented 1 year ago

@reynir Yes, I found that I can test all the options using these four commands (on a dual-stack machine):

curl --resolve opam.ocaml.org:443:[2001:bc8:1d80:4a00::1] -o /dev/null https://opam.ocaml.org
curl --resolve opam.ocaml.org:443:[2001:bc8:5080:a405::1] -o /dev/null https://opam.ocaml.org
curl --resolve opam.ocaml.org:443:151.115.76.159 -o /dev/null https://opam.ocaml.org
curl --resolve opam.ocaml.org:443:51.158.232.133 -o /dev/null https://opam.ocaml.org

At the moment, only 3 out of 4 of these addresses match the published DNS entries. The last one in this output is wrong.

$ nslookup opam.ocaml.org ns1.gandi.net
Server:     ns1.gandi.net
Address:    173.246.100.2#53

Name:   opam.ocaml.org
Address: 151.115.76.159
Name:   opam.ocaml.org
Address: 51.158.232.133
Name:   opam.ocaml.org
Address: 2001:bc8:1d80:4a00::1
Name:   opam.ocaml.org
Address: 2001:bc8:5080:8e02::1
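
The comparison between the published records and the addresses the machines actually use can be automated. A minimal sketch follows, using the four addresses from the curl commands above as the expected set and the nameserver answer above as the published set; the function name `compare_records` is hypothetical.

```python
import ipaddress


def compare_records(published, expected):
    """Return (stale, missing).

    stale:   addresses published in DNS that no machine is expected to use.
    missing: expected addresses that DNS does not publish.
    """
    # Parsing normalises the textual form, so "2001:0bc8::1" equals "2001:bc8::1".
    pub = {ipaddress.ip_address(a) for a in published}
    exp = {ipaddress.ip_address(a) for a in expected}
    stale = sorted(str(a) for a in pub - exp)
    missing = sorted(str(a) for a in exp - pub)
    return stale, missing


published = [
    "151.115.76.159",
    "51.158.232.133",
    "2001:bc8:1d80:4a00::1",
    "2001:bc8:5080:8e02::1",
]
expected = [
    "2001:bc8:1d80:4a00::1",
    "2001:bc8:5080:a405::1",
    "151.115.76.159",
    "51.158.232.133",
]
stale, missing = compare_records(published, expected)
```

With these inputs, `stale` holds 2001:bc8:5080:8e02::1 and `missing` holds 2001:bc8:5080:a405::1.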

avsm commented 1 year ago

Right, I looked in my JavaScript console on Gandi, and found internal 500 errors reported from the web UI. It looks like there was a glitch in the update UI for Gandi itself, so some of the changes just didn't go through. I've made the changes again, and they should all be propagating now.

I'm also surprised by the lack of GitHub IPv6-only support; see https://github.com/orgs/community/discussions/10539. It's something we can only partially fix via #29 since it doesn't solve the issue of how to create issues, even if we mirror our source code.

avsm commented 1 year ago

Mark, I'll give this one back to you to see if you'd like an IPv6-healthcheck. Otherwise it should be sorted now I think.

mtelvers commented 1 year ago

@avsm I have written an OCurrent pipeline that resolves the name, validates the certificates, tries to download from the website, and then posts the results to a Slack channel. http://observer.ocamllabs.io
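
For readers who want to try the same checks by hand, here is a minimal Python sketch of the first three steps (resolve, validate the certificate, download) for a single address. It is not the OCurrent pipeline itself; the function names are hypothetical, and posting to Slack is left out. `probe` plays the role of `curl --resolve`: it connects to one chosen address while validating the certificate against the hostname.

```python
import socket
import ssl


def parse_status_line(response_head: bytes) -> int:
    """Extract the status code from a raw HTTP/1.x response head."""
    status_line = response_head.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = status_line.split()
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP response: {status_line!r}")
    return int(parts[1])


def resolve(host: str, port: int = 443) -> list[str]:
    """Return the distinct IPv4 and IPv6 addresses the local resolver gives."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def probe(host: str, address: str, port: int = 443, timeout: float = 10.0) -> int:
    """Connect to one address, verify the certificate for host, request "/"."""
    context = ssl.create_default_context()  # checks the chain and the hostname
    with socket.create_connection((address, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls.sendall(request.encode("ascii"))
            data = b""
            while b"\r\n" not in data:  # read until the status line is complete
                chunk = tls.recv(4096)
                if not chunk:
                    break
                data += chunk
            return parse_status_line(data)
```

Calling `probe("opam.ocaml.org", address)` for each address returned by `resolve("opam.ocaml.org")` raises an exception if the connection or the certificate check fails, and otherwise returns the HTTP status code for that address.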

cc. @tmcgilchrist

mtelvers commented 1 year ago

@avsm The IPv6 address of staging.ocaml.org should be 2001:bc8:1204:a40b::1 rather than 2001:bc8:1202:920c::1. Please could you update the DNS entry?

avsm commented 1 year ago

> I have written an OCurrent pipeline that resolves the name, validates the certificates, tries to download from the website, and then posts the results to a Slack channel. http://observer.ocamllabs.io/

Looks great! Don't forget to add to ocurrent/overview when you upload the source code. This is distinct from the deployer, I presume?