Closed adam-thorpe closed 4 years ago
Each machine seems to find the same few IP addresses, so the godaddy c7 machine and softlayer rhel machines I tested on outputted the same IP, but godaddy ubuntu was different (I won't put the IPs here in case they are important). These don't seem to be linked to the hostname, as changing that didn't affect the results.
IPs are all in https://github.com/AdoptOpenJDK/openjdk-infrastructure/blob/master/ansible/inventory.yml so not especially sensitive - can you provide a sample java app (just a few lines of real code) with the names that are producing undesirable results to aid debugging of this please?
Other architectures do seem to be failing this test, however only a couple machines seem to be affected. test-marist-ubuntu1604-s390x-1 is an example of a passing z linux box
Like the docker issues we've seen, this appears to be something specific to the godaddy hosting infrastructure:
adoptopenjdk@test-godaddy-ubuntu1604-x64-4:~$ ping -c 1 randomhostnamestring
PING randomhostnamestring.dc1.corp.gd (185.53.178.6) 56(84) bytes of data.
64 bytes from 185.53.178.6: icmp_seq=1 ttl=46 time=17.8 ms
--- randomhostnamestring.dc1.corp.gd ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 17.847/17.847/17.847/0.000 ms
adoptopenjdk@test-godaddy-ubuntu1604-x64-4:~$ ping -c 1 completely.different.host.name
PING completely.different.host.name.dc1.corp.gd (185.53.178.6) 56(84) bytes of data.
64 bytes from 185.53.178.6: icmp_seq=1 ttl=46 time=17.6 ms
--- completely.different.host.name.dc1.corp.gd ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 17.695/17.695/17.695/0.000 ms
adoptopenjdk@test-godaddy-ubuntu1604-x64-4:~$
Not sure if there's a lot we can do about it unless we try to switch the DNS away from the ones they're configured with or possibly remove dc1.corp.gd from the DNS search list.
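For anyone who wants to check a machine without ping, here is a rough Java sketch of the same probe: it looks up a few random names that should not exist and reports whether any of them resolve. The class name WildcardDnsProbe and its helper methods are made up for illustration and are not part of the test suite; the resolver is passed in as a function only so the logic can be exercised without a network.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Optional;
import java.util.UUID;
import java.util.function.Function;

// Hypothetical helper, not part of any AdoptOpenJDK test material.
class WildcardDnsProbe {
    /** Looks a name up with the system resolver; empty if the lookup fails. */
    static Optional<String> systemLookup(String host) {
        try {
            return Optional.of(InetAddress.getByName(host).getHostAddress());
        } catch (UnknownHostException e) {
            return Optional.empty();
        }
    }

    /** True if any of the random, almost certainly nonexistent names resolves. */
    static boolean randomNamesResolve(Function<String, Optional<String>> lookup, int attempts) {
        for (int i = 0; i < attempts; i++) {
            // Single-label name, so the resolver applies the DNS search list to it.
            String name = "probe-" + UUID.randomUUID();
            if (lookup.apply(name).isPresent()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean affected = randomNamesResolve(WildcardDnsProbe::systemLookup, 3);
        System.out.println(affected
                ? "Random hostnames resolve: a search domain probably answers for every name"
                : "Random hostnames fail to resolve, as expected");
    }
}
```

On a machine showing the behaviour in the ping output above, this would be expected to report that random hostnames resolve.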
As per https://adoptopenjdk.slack.com/archives/C53GHCXL4/p1573832895014600 Demetrius seems happy with us modifying the DNS configuration so I will look at doing that at the start of next week which will hopefully resolve this.
Have removed dc1.corp.gd and hosting.cop.hd from /etc/resolv.conf on the four GoDaddy Ubuntu 16.04 machines. This won't be permanent, but will hopefully let us see if it passes tonight.
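To confirm what a machine currently has configured, something like the following could list the search domains from /etc/resolv.conf. This is only a sketch: the class name SearchDomains is hypothetical, and it assumes the usual resolv.conf behaviour where the last search line wins. It ignores the domain keyword and any override from the environment.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for inspecting the resolver configuration.
class SearchDomains {
    /** Returns the domains on the last "search" line, or an empty list if there is none. */
    static List<String> parse(List<String> lines) {
        List<String> domains = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("search ") || line.startsWith("search\t")) {
                // Drop the keyword and split the rest on whitespace.
                domains = new ArrayList<>(Arrays.asList(line.substring(6).trim().split("\\s+")));
            }
        }
        return domains;
    }

    public static void main(String[] args) throws IOException {
        Path conf = Path.of(args.length > 0 ? args[0] : "/etc/resolv.conf");
        System.out.println("search domains: " + parse(Files.readAllLines(conf)));
    }
}
```

If dc1.corp.gd still shows up in the output after a reboot or network restart, the change has been overwritten and needs to be made wherever resolv.conf is generated from.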
Not sure that this can be closed yet, I assume that @sxa555 will need to make a permanent change to the boxes
Yes there are other boxes that still fail this test and have not had the same change implemented on them
@adam-thorpe Which machines? Are you seeing it on the non-Ubuntu GoDaddy ones too?
@sxa555 Yes, I retested this on a GoDaddy debian8 machine which still fails: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1100/ I'm fairly sure there were a couple more that aren't even GoDaddy machines, like the Softlayer Rhel ones. I can try to gather a list if you'd like, however I'm pretty sure a large number of boxes are affected.
OK thanks - looks like the SL RHEL ones are seeing the issue because the adoptopenjdk.net domain similarly resolves any DNS request underneath it ...
Edited /etc/network/interfaces.d/hfs* to remove dc1.corp.gd and hosting.cop.hd from the dns-search line on the ubuntu 2-4 machines. My credentials for the adoptopenjdk user don't seem to work on the -1 ubuntu machine though. @gdams is the password on that one different? If so please send me the new one somehow, as these machines don't have the admin team's ssh keys installed.
We need to look at how to resolve this for the adoptopenjdk.net domain, since many other machines are configured with that as their default domain and will experience the same symptoms.
@gdams and I will look at it, should be a *.domainname problem.
Fixed. LMK if that works.
A quick test on the machines previously affected shows that the problem no longer exists (possibly except for the entries on the godaddy-1 ubuntu machine which I can't access) so I think we're good almost everywhere now.
I'll un-exclude the test then and see if it starts passing in the nightlies. This may have been affecting a bunch of other tests too, so it would be nice if those improve as well.
When creating an InetSocketAddress(String host, int port) object, the constructor will pass the host name to InetAddress to try to resolve it. If the host name cannot be resolved, the address is marked as unresolved (its InetAddress is set to null), which can be tested via the isUnresolved() method. It would seem that addresses are not being marked correctly on the x64 linux machines. Consistent on both openj9 and hotspot.
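A minimal sketch of the behaviour described above, in case it helps with reproducing this. The class name UnresolvedCheck and the default host name are just placeholders; pass whatever host name is giving the undesirable result as the first argument.

```java
import java.net.InetSocketAddress;

// Hypothetical reproduction helper, not taken from the failing test.
class UnresolvedCheck {
    /** Describes whether the address was resolved and, if so, to which IP. */
    static String describe(InetSocketAddress addr) {
        return addr.isUnresolved()
                ? addr.getHostString() + " -> unresolved"
                : addr.getHostString() + " -> " + addr.getAddress().getHostAddress();
    }

    public static void main(String[] args) {
        // Placeholder name; it should not exist in DNS.
        String host = args.length > 0 ? args[0] : "randomhostnamestring";
        // The constructor attempts the lookup immediately.
        System.out.println(describe(new InetSocketAddress(host, 80)));
    }
}
```

On an unaffected machine a nonexistent name should print as unresolved; on the affected machines it would presumably print an IP address instead.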
Test: java/nio/channels/SocketChannel/ExceptionTranslation.java This test is attempting to connect to an invalid host address and is ensuring that it throws an UnknownHostException. However the connect() method hangs and throws: