finos / symphony-bdk-java

The Symphony BDK (Bot Developer Kit) for Java helps you to create production-grade Chat Bots and Extension Applications on top of the Symphony REST APIs.
https://symphony-bdk-java.finos.org
Apache License 2.0

Symphony BDK Datafeed service fails to reconnect after java.net.SocketException #675

Closed: vladokrsymphony closed this issue 1 year ago

vladokrsymphony commented 2 years ago

Bug Report

We observed that if the bot is running while the Agent is restarted or upgraded, the bot cannot reconnect to the datafeed when the failure surfaces as a java.net.SocketException.

Expected Result:

Once the Agent is back up, the bot should reconnect to the datafeed automatically.

Actual Result:

The bot does not reconnect to the datafeed and keeps retrying indefinitely. Customers eventually have to restart the bot before it reconnects successfully.

Environment:

Java BDK 2.9.0

Additional Context:

2022-08-24T09:59:37,892Z INFO  [SymphonyBdk_Datafeed-77] com.symphony.bdk.core.retry.resilience4j.Resilience4jRetryWithRecovery : YQs7gf -  Retry in 2.0s...  
2022-08-24T09:59:39,980Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  I/O exception (java.net.SocketException) caught when processing request to {s}->https://some-pod.com:443: Connection reset  
2022-08-24T09:59:39,980Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  Retrying request to {s}->https://some-pod.com:443  
2022-08-24T09:59:40,065Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  I/O exception (java.net.SocketException) caught when processing request to {s}->https://some-pod.com:443: Connection reset  
2022-08-24T09:59:40,065Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  Retrying request to {s}->https://some-pod.com:443  
2022-08-24T09:59:40,148Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  I/O exception (java.net.SocketException) caught when processing request to {s}->https://some-pod.com:443: Connection reset  
2022-08-24T09:59:40,148Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  Retrying request to {s}->https://some-pod.com:443  
2022-08-24T09:59:40,284Z INFO  [SymphonyBdk_Datafeed-77] com.symphony.bdk.core.retry.resilience4j.Resilience4jRetryWithRecovery : YQs7gf -  Retry in 4.0s...  
2022-08-24T09:59:44,292Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  I/O exception (java.net.SocketException) caught when processing request to {s}->https://some-pod.com:443: Connection reset  
2022-08-24T09:59:44,293Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  Retrying request to {s}->https://some-pod.com:443  
2022-08-24T09:59:44,299Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  I/O exception (java.net.SocketException) caught when processing request to {s}->https://some-pod.com:443: Connection reset  
2022-08-24T09:59:44,299Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  Retrying request to {s}->https://some-pod.com:443  
2022-08-24T09:59:44,306Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  I/O exception (java.net.SocketException) caught when processing request to {s}->https://some-pod.com:443: Connection reset  
2022-08-24T09:59:44,306Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  Retrying request to {s}->https://some-pod.com:443  
2022-08-24T09:59:44,313Z INFO  [SymphonyBdk_Datafeed-77] com.symphony.bdk.core.retry.resilience4j.Resilience4jRetryWithRecovery : YQs7gf -  Retry in 8.0s...  
2022-08-24T09:59:52,321Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  I/O exception (java.net.SocketException) caught when processing request to {s}->https://some-pod.com:443: Connection reset  
2022-08-24T09:59:52,321Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  Retrying request to {s}->https://some-pod.com:443  
2022-08-24T09:59:52,330Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  I/O exception (java.net.SocketException) caught when processing request to {s}->https://some-pod.com:443: Connection reset  
2022-08-24T09:59:52,330Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  Retrying request to {s}->https://some-pod.com:443  
2022-08-24T09:59:52,337Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  I/O exception (java.net.SocketException) caught when processing request to {s}->https://some-pod.com:443: Connection reset  
2022-08-24T09:59:52,337Z INFO  [SymphonyBdk_Datafeed-77] org.apache.http.impl.execchain.RetryExec : YQs7gf -  Retrying request to {s}->https://some-pod.com:443  
2022-08-24T09:59:52,345Z INFO  [SymphonyBdk_Datafeed-77] com.symphony.bdk.core.retry.resilience4j.Resilience4jRetryWithRecovery : YQs7gf -  Retry in 16.0s...  
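
To make the log above easier to read, here is a minimal sketch of the two retry layers it appears to show: an inner layer that re-sends the request immediately (three "Retrying request" lines per cycle) and an outer layer that waits 2s, 4s, 8s, 16s between cycles. This is not the BDK's or Apache HttpClient's actual code. The class `RetryLayersSketch`, its methods, the `Sleeper` interface, the doubling multiplier, and the one-minute cap are all assumptions made for illustration, inferred only from the timestamps and messages in the log.

```java
import java.net.SocketException;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical model of the retry pattern seen in the log; not BDK code. */
final class RetryLayersSketch {

    /** Lets the demo record delays instead of really sleeping. */
    interface Sleeper {
        void sleep(Duration delay) throws InterruptedException;
    }

    /** Inner layer: one attempt plus up to {@code retries} immediate re-sends. */
    static <T> T withImmediateRetries(Callable<T> request, int retries) throws Exception {
        if (retries < 0) {
            throw new IllegalArgumentException("retries must be >= 0");
        }
        SocketException last = null;
        for (int attempt = 0; attempt <= retries; attempt++) {
            try {
                return request.call();
            } catch (SocketException e) {
                last = e; // e.g. "Connection reset": re-send straight away
            }
        }
        throw last; // all immediate attempts failed
    }

    /** Outer layer delays: initial, initial * multiplier, ... capped at max. */
    static List<Duration> backoffSchedule(Duration initial, double multiplier,
                                          Duration max, int cycles) {
        List<Duration> delays = new ArrayList<>();
        long millis = initial.toMillis();
        for (int i = 0; i < cycles; i++) {
            delays.add(Duration.ofMillis(millis));
            millis = Math.min((long) (millis * multiplier), max.toMillis());
        }
        return delays;
    }

    /** Runs the inner layer, waiting for the next delay after each failed cycle. */
    static <T> T withBackoff(Callable<T> request, int innerRetries,
                             List<Duration> delays, Sleeper sleeper) throws Exception {
        for (Duration delay : delays) {
            try {
                return withImmediateRetries(request, innerRetries);
            } catch (SocketException e) {
                sleeper.sleep(delay); // corresponds to "Retry in N s..."
            }
        }
        // Last cycle: a failure here propagates to the caller.
        return withImmediateRetries(request, innerRetries);
    }

    public static void main(String[] args) throws Exception {
        List<Duration> delays =
            backoffSchedule(Duration.ofSeconds(2), 2.0, Duration.ofMinutes(1), 4);
        System.out.println("Backoff delays: " + delays);

        // Simulated endpoint: the first 10 calls are reset, the 11th succeeds.
        int[] calls = {0};
        List<Duration> slept = new ArrayList<>();
        String result = withBackoff(() -> {
            if (++calls[0] <= 10) {
                throw new SocketException("Connection reset");
            }
            return "connected";
        }, 3, delays, slept::add);
        System.out.println(result + " after " + calls[0] + " calls, waited " + slept);
    }
}
```

In this model, recovery depends entirely on a fresh request eventually succeeding. The log shows every new request still being reset after the Agent returned, which the sketch reproduces but does not explain.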

ystxn commented 1 year ago

@yinan-symphony and I investigated this extensively but could not determine a root cause. It appears to be an edge case that occurs only in very specific network environments. Closing for now; we can revisit once more information is available on how to prevent it.