oVirt / vdsm-jsonrpc-java

JSON RPC Java client for oVirt

SSL session is invalid #27

Closed dupondje closed 1 year ago

dupondje commented 1 year ago

We had a connection issue between the ovirt-engine and the hosts.

After the issue was resolved, we noticed that the hosts did not get reconnected. The logs show the following:

2022-11-06 14:50:36,359+01 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to /x.243
2022-11-06 14:50:36,359+01 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connected to /x.243:54321
2022-11-06 14:50:36,816+01 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-82) [] Unable to RefreshCapabilities: ClientConnectionException: SSL session is invalid
2022-11-06 14:50:36,816+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesAsyncVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-82) [] Command 'GetCapabilitiesAsyncVDSCommand(HostName = ovn017, VdsIdAndVdsVDSCommandParametersBase:{hostId='d983da9a-b707-4b87-95c6-5d5ac047421f', vds='Host[ovn017,d983da9a-b707-4b87-95c6-5d5ac047421f]'})' execution failed: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: SSL session is invalid

It might be related to https://github.com/oVirt/vdsm-jsonrpc-java/pull/17 ?

After restarting the ovirt-engine it started working again, but it would be cleaner if it reconnected by itself :)

Thanks

dupondje commented 1 year ago

Most likely this happened while the connection issues were still ongoing. It also seems to retry connecting only 3 times or so? That could explain it.
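To illustrate why a small retry budget could explain this, here is a minimal sketch of a bounded retry loop. It is not the vdsm-jsonrpc-java retry code; `RetrySketch`, `connectWithRetry` and `failingFirst` are hypothetical names, and the limit of 3 is only the guess from the comment above.

```java
import java.util.function.BooleanSupplier;

public class RetrySketch {

    /**
     * Calls attempt up to maxAttempts times.
     * Returns the 1-based number of the attempt that succeeded, or -1 if all failed.
     */
    static int connectWithRetry(BooleanSupplier attempt, int maxAttempts) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return i;
            }
        }
        return -1; // retry budget exhausted: nothing will try again
    }

    /** Hypothetical connection attempt that fails for the first n calls, then succeeds. */
    static BooleanSupplier failingFirst(int n) {
        int[] calls = {0};
        return () -> ++calls[0] > n;
    }

    public static void main(String[] args) {
        // The outage lasts for 5 attempts: a budget of 3 gives up for good ...
        System.out.println(connectWithRetry(failingFirst(5), 3));  // prints -1
        // ... while a larger budget recovers on the 6th attempt.
        System.out.println(connectWithRetry(failingFirst(5), 10)); // prints 6
    }
}
```

If the outage outlasts the retry budget, the caller stays disconnected until something else (here, an engine restart) triggers a new connection.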

dupondje commented 1 year ago

Had this issue again today, and decided to troubleshoot it further.

Connection failed:

2023-05-16 01:08:00,233+02 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engine-Thread-18) [] Host 'xxxxx' is not responding. It will stay in Connecting state for a grace period of 78 seconds and after that an attempt to fence the host will be issued.
2023-05-16 01:08:00,237+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-31) [] Unable to GetStats: VDSNetworkException: VDSGenericException: VDSNetworkException: Connection timeout for host 'x.x.x.240', last response arrived 17047 ms ago.

It reconnects again:

2023-05-16 01:08:01,382+02 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to /x.x.x.240
2023-05-16 01:08:01,382+02 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connected to /x.x.x.240:54321

But then RefreshCapabilities fails:

2023-05-16 01:08:08,953+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [] Unable to RefreshCapabilities: ClientConnectionException: SSL session is invalid

This seems to cause the HostConnectionRefresher to stop refreshing the VDS, which leaves the host in the Connecting state forever. Could this be because the EventPublisher in jsonrpc stopped?
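One general Java behaviour that produces this kind of silent stop: a task scheduled with `ScheduledExecutorService.scheduleAtFixedRate` is never run again once a run throws. Whether the refresher or the EventPublisher actually stops this way is not confirmed here. The sketch below only demonstrates the JDK behaviour; `SilentStopSketch` and `runsBeforeStop` are hypothetical names, and the exception is a stand-in for the real `ClientConnectionException`.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SilentStopSketch {

    /** Schedules a periodic task that throws on run number failOnRun; returns the total number of runs. */
    static int runsBeforeStop(int failOnRun) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        ScheduledFuture<?> future = pool.scheduleAtFixedRate(() -> {
            if (runs.incrementAndGet() == failOnRun) {
                // Stand-in for the real failure; not the library's exception type.
                throw new IllegalStateException("SSL session is invalid");
            }
        }, 0, 10, TimeUnit.MILLISECONDS);
        try {
            future.get(); // blocks until the periodic task terminates with an exception
        } catch (ExecutionException expected) {
            // The task threw, so the executor has suppressed all further runs.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            Thread.sleep(100); // ten more periods pass; the counter must not move
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow();
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println(runsBeforeStop(2)); // prints 2
    }
}
```

Catching the exception inside the task body keeps the schedule alive, which is the usual way to avoid this.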

The HostMonitoringWatchdog confirms this:

2023-05-16 01:41:20,177+02 WARN  [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoringWatchdog] (EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] Monitoring not executed for the host x.x.x.240 [614c7aea-faca-4b2c-a521-6fb8e564fa56] for 2007762ms
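The watchdog message can be turned into the time monitoring last ran by subtracting the reported gap from the log timestamp. This is a small sketch using `java.time`; `WatchdogGap` and `lastMonitored` are hypothetical helper names, and the timestamp pattern is assumed from the log lines quoted in this issue.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class WatchdogGap {

    // Timestamp layout as it appears in the quoted engine log lines (UTC offset dropped).
    static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Returns the local time at which monitoring last ran, given the watchdog line's timestamp and gap. */
    static LocalDateTime lastMonitored(String logTimestamp, long gapMillis) {
        return LocalDateTime.parse(logTimestamp, FMT).minus(Duration.ofMillis(gapMillis));
    }

    public static void main(String[] args) {
        // Values taken from the watchdog line above.
        System.out.println(lastMonitored("2023-05-16 01:41:20,177", 2007762L));
        // prints 2023-05-16T01:07:52.415
    }
}
```

That puts the last monitoring run at 01:07:52, a few seconds before the "not responding" warning at 01:08:00. This fits monitoring not having run again after the reconnect at 01:08:01.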

Maybe @pkliczewski has some idea?