adoptium / infrastructure

This repo contains all information about machine maintenance.

test-marist-sles15-s390x-2 - NoRouteToHostException: No Route to Host failures #2869

Open smlambert opened 1 year ago

smlambert commented 1 year ago
[2022-12-22T19:14:12.439Z] CL1 j> 2022/12/22 19:14:11.299 Failed to connect to Monitored VM after 30 attempts in 178 seconds - giving up.  Connection Exception received is below:
[2022-12-22T19:14:12.439Z] CL1 stderr java.rmi.ConnectIOException: Exception creating connection to: 148.100.74.154; nested exception is: 
[2022-12-22T19:14:12.439Z] CL1 stderr   java.net.NoRouteToHostException: No route to host (Host unreachable)
[2022-12-22T19:14:12.439Z] CL1 stderr   at java.rmi/sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:635)
[2022-12-22T19:14:12.439Z] CL1 stderr   at java.rmi/sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:209)
[2022-12-22T19:14:12.439Z] CL1 stderr   at java.rmi/sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:196)
[2022-12-22T19:14:12.439Z] CL1 stderr   at java.rmi/sun.rmi.server.UnicastRef.invoke(UnicastRef.java:132)
[2022-12-22T19:14:12.439Z] CL1 stderr   at java.management.rmi/javax.management.remote.rmi.RMIServerImpl_Stub.newClient(Unknown Source)
[2022-12-22T19:14:12.439Z] CL1 stderr   at java.management.rmi/javax.management.remote.rmi.RMIConnector.getConnection(RMIConnector.java:2105)
[2022-12-22T19:14:12.439Z] CL1 stderr   at java.management.rmi/javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:321)
[2022-12-22T19:14:12.439Z] CL1 stderr   at java.management/javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270)
[2022-12-22T19:14:12.439Z] CL1 stderr   at java.management/javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:229)
[2022-12-22T19:14:12.439Z] CL1 stderr   at net.adoptopenjdk.test.jlm.remote.ServerConnector.doConnect(ServerConnector.java:276)
[2022-12-22T19:14:12.439Z] CL1 stderr   at net.adoptopenjdk.test.jlm.remote.ServerConnector.getServerConnection(ServerConnector.java:101)
[2022-12-22T19:14:12.439Z] CL1 stderr   at net.adoptopenjdk.test.jlm.remote.ServerConnector.<init>(ServerConnector.java:63)
[2022-12-22T19:14:12.439Z] CL1 stderr   at net.adoptopenjdk.test.jlm.remote.MemoryProfiler.<init>(MemoryProfiler.java:67)
[2022-12-22T19:14:12.439Z] CL1 stderr   at net.adoptopenjdk.test.jlm.remote.MemoryProfiler.main(MemoryProfiler.java:100)
[2022-12-22T19:14:12.439Z] CL1 stderr Caused by: java.net.NoRouteToHostException: No route to host (Host unreachable)

To make it easy for the infrastructure team to repeat and diagnose, please answer the following questions:

All TestJlmRemote system test targets and NioLoad system test targets fail (see TRSS view)

Any other details:

steelhead31 commented 1 year ago

The issue is that this machine can't create the JMX/RMI agent for the client to connect to.

java -Xmx256m -Djavax.net.ssl.keyStore=/home/jenkins/workspace/Test_openjdk11_hs_sanity.system_s390x_linux/jvmtest/system/aqa-systemtest/openjdk.test.jlm/src/test.jlm/net/adoptopenjdk/test/jlm/testkeys -Djavax.net.ssl.trustStore=/home/jenkins/workspace/Test_openjdk11_hs_sanity.system_s390x_linux/jvmtest/system/aqa-systemtest/openjdk.test.jlm/src/test.jlm/net/adoptopenjdk/test/jlm/testkeys -Djavax.net.ssl.keyStoreType=JKS -Djavax.net.ssl.trustStoreType=JKS -Djavax.net.ssl.keyStorePassword=passphrase -Djavax.net.ssl.trustStorePassword=passphrase -Dcom.sun.management.jmxremote.password.file=/home/jenkins/workspace/Test_openjdk11_hs_sanity.system_s390x_linux/aqa-tests/TKG/output_16763999393111/TestJlmRemoteClassAuth_0/20230214-191102-TestJlmRemoteClassAuth/tmp/jmxremote.password -XX:+UseCompressedOops -classpath /home/jenkins/workspace/Test_openjdk11_hs_sanity.system_s390x_linux/jvmtest/system/aqa-systemtest/openjdk.test.jlm/bin:/home/jenkins/workspace/Test_openjdk11_hs_sanity.system_s390x_linux/jvmtest/system/systemtest_prereqs/junit/junit.jar net.adoptopenjdk.test.jlm.remote.ClassProfiler server /home/jenkins/workspace/Test_openjdk11_hs_sanity.system_s390x_linux/aqa-tests/TKG/output_16763999393111/TestJlmRemoteClassAuth_0/20230214-191102-TestJlmRemoteClassAuth/results/scls_server.log /home/jenkins/workspace/Test_openjdk11_hs_sanity.system_s390x_linux/aqa-tests/TKG/output_16763999393111/TestJlmRemoteClassAuth_0/20230214-191102-TestJlmRemoteClassAuth/results/scls_server.csv auth controlRole control1 localhost 1234

steelhead31 commented 1 year ago

Cannot connect to localhost:1234, even though no firewall is running.

sxa commented 1 year ago

I'm curious - does that command line work OK elsewhere? It has localhost 1234 at the end of it, but when I had a quick play yesterday localhost seemed OK. It was only access via the external interfaces' IPs that was being blocked (which is what the original failure description shows), so I'm somewhat surprised it fails. Are you sure that the thing listening on 1234 is started by the test automatically? Does that command work on "known good" machines?
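
One way to separate the two cases discussed here (localhost versus the external interface IP) is a small TCP probe run on the machine itself. The following is only a minimal sketch and not part of the test suite: the class name PortProbe and its methods are hypothetical, port 1234 is taken from the command line above, and the 3-second timeout is an arbitrary assumption. It maps the exception thrown by a plain socket connect to a short label, so "nothing is listening" (usually a ConnectException) can be told apart from "No route to host" (a NoRouteToHostException, as in the original failure).

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.NoRouteToHostException;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

// Hypothetical helper, not part of aqa-systemtest: probes a TCP port and
// reports why the connection failed, if it did.
class PortProbe {

    // Map a connect-time exception to a short label.
    static String classify(IOException e) {
        if (e instanceof NoRouteToHostException) return "NO_ROUTE";   // routing or filtering problem
        if (e instanceof ConnectException) return "REFUSED";          // typically nothing listening on the port
        if (e instanceof SocketTimeoutException) return "TIMEOUT";    // no answer within the timeout
        if (e instanceof UnknownHostException) return "UNKNOWN_HOST"; // name did not resolve
        return "ERROR";
    }

    // Attempt a plain TCP connect and return "OPEN" or a failure label.
    static String probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return "OPEN";
        } catch (IOException e) {
            return classify(e);
        }
    }

    public static void main(String[] args) {
        // Hosts come from the command line; default to localhost only.
        String[] hosts = args.length > 0 ? args : new String[] {"localhost"};
        int port = 1234; // port used by the test command in this issue
        for (String host : hosts) {
            System.out.println(host + ":" + port + " -> " + probe(host, port, 3000));
        }
    }
}
```

Running it with both localhost and the machine's external IP as arguments, while the test server process is supposed to be up, should show whether the listener was never started (REFUSED on both) or whether only the external address is unreachable (NO_ROUTE on that one).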

steelhead31 commented 1 year ago

I dug the command out of the test run output, as that's the point at which it was hanging...