finos / symphony-bdk-java

The Symphony BDK (Bot Developer Kit) for Java helps you to create production-grade Chat Bots and Extension Applications on top of the Symphony REST APIs.
https://symphony-bdk-java.finos.org
Apache License 2.0

HTTP 500 errors should be retried #504

Closed: symphony-elias closed this issue 3 years ago

symphony-elias commented 3 years ago

Bug Report

During maintenance windows, when the pod, Key Manager (KM), or agent is down, HTTP calls can fail with HTTP 500. Such calls need to be retried. For instance, one of our bots logged:

2021-04-20 18:26:26.546  INFO 1 --- [_DatafeedThread] c.s.b.c.s.datafeed.impl.DatafeedLoopV1   : Recreate a new datafeed and try again
2021-04-20 18:26:26.684  INFO 1 --- [_DatafeedThread] .s.b.c.r.r.Resilience4jRetryWithRecovery : Retry in 64.0s...
2021-04-20 22:01:45.235  INFO 1 --- [_DatafeedThread] c.s.b.c.s.datafeed.impl.DatafeedLoopV1   : Recreate a new datafeed and try again
2021-04-20 22:01:45.389 ERROR 1 --- [_DatafeedThread] c.s.b.s.s.DatafeedAsyncLauncherService   : An API error has been received while starting the Datafeed loop in a separate thread, please check error below:

com.symphony.bdk.http.api.ApiException: {"code":500,"message":"Received an error when calling a pod endpoint"}
        at com.symphony.bdk.http.jersey2.ApiClientJersey2.invokeAPI(ApiClientJersey2.java:168) ~[symphony-bdk-http-jersey2-2.1.3.jar:2.1.3]
        at com.symphony.bdk.gen.api.DatafeedApi.v4DatafeedCreatePostWithHttpInfo(DatafeedApi.java:479) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at com.symphony.bdk.gen.api.DatafeedApi.v4DatafeedCreatePost(DatafeedApi.java:414) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at com.symphony.bdk.core.service.datafeed.impl.DatafeedLoopV1.createDatafeedAndPersist(DatafeedLoopV1.java:144) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at com.symphony.bdk.core.retry.RetryWithRecovery.executeOnce(RetryWithRecovery.java:105) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at io.github.resilience4j.retry.Retry.lambda$decorateCheckedSupplier$3f69f149$1(Retry.java:137) ~[resilience4j-retry-1.6.1.jar:1.6.1]
        at io.github.resilience4j.retry.Retry.executeCheckedSupplier(Retry.java:419) ~[resilience4j-retry-1.6.1.jar:1.6.1]
        at com.symphony.bdk.core.retry.resilience4j.Resilience4jRetryWithRecovery.execute(Resilience4jRetryWithRecovery.java:65) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at com.symphony.bdk.core.service.datafeed.impl.DatafeedLoopV1.createDatafeed(DatafeedLoopV1.java:139) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at com.symphony.bdk.core.service.datafeed.impl.DatafeedLoopV1.recreateDatafeed(DatafeedLoopV1.java:127) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at com.symphony.bdk.core.retry.RecoveryStrategy.runRecovery(RecoveryStrategy.java:48) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at com.symphony.bdk.core.retry.RetryWithRecovery.handleRecovery(RetryWithRecovery.java:159) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at com.symphony.bdk.core.retry.RetryWithRecovery.executeOnce(RetryWithRecovery.java:112) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at io.github.resilience4j.retry.Retry.lambda$decorateCheckedSupplier$3f69f149$1(Retry.java:137) ~[resilience4j-retry-1.6.1.jar:1.6.1]
        at io.github.resilience4j.retry.Retry.executeCheckedSupplier(Retry.java:419) ~[resilience4j-retry-1.6.1.jar:1.6.1]
        at com.symphony.bdk.core.retry.resilience4j.Resilience4jRetryWithRecovery.execute(Resilience4jRetryWithRecovery.java:65) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at com.symphony.bdk.core.service.datafeed.impl.DatafeedLoopV1.readDatafeed(DatafeedLoopV1.java:121) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at com.symphony.bdk.core.service.datafeed.impl.DatafeedLoopV1.start(DatafeedLoopV1.java:86) ~[symphony-bdk-core-2.1.3.jar:2.1.3]
        at com.symphony.bdk.spring.service.DatafeedAsyncLauncherService.uncheckedStart(DatafeedAsyncLauncherService.java:88) ~[symphony-bdk-core-spring-boot-starter-2.1.3.jar:2.1.3]
        at com.symphony.bdk.http.api.tracing.MDCUtils$MdcRunnable.run(MDCUtils.java:59) ~[symphony-bdk-http-api-2.1.3.jar:2.1.3]
        at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]

Expected Result:

HTTP 500 errors should be retried, see: https://github.com/finos/symphony-bdk-java/blob/main/symphony-bdk-http/symphony-bdk-http-api/src/main/java/com/symphony/bdk/http/api/ApiException.java#L81

Actual Result:

HTTP 500 errors are not retried and cause the bot to fail.
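
To make the expected behavior concrete, here is a minimal sketch in plain Java of retrying a call when it fails with a server error. It is not the BDK's implementation: `HttpStatusException`, `RetryOn500Sketch`, `isRetryable` and `callWithRetry` are hypothetical names introduced for illustration, and treating every 5xx status as transient is an assumption. The doubling delay only mimics the kind of backoff suggested by the "Retry in 64.0s..." log line.

```java
import java.util.concurrent.Callable;

/** Hypothetical stand-in for an API exception carrying an HTTP status code (not the BDK's ApiException). */
class HttpStatusException extends Exception {
    private final int code;

    HttpStatusException(int code, String message) {
        super(message);
        this.code = code;
    }

    int getCode() {
        return code;
    }
}

/** Sketch only: retries a call while it keeps failing with a server error. */
final class RetryOn500Sketch {

    private RetryOn500Sketch() {
    }

    /** Assumption: any 5xx status (including 500) is transient and worth retrying. */
    static boolean isRetryable(Exception e) {
        return e instanceof HttpStatusException
            && ((HttpStatusException) e).getCode() >= 500;
    }

    /**
     * Runs the call up to maxAttempts times, doubling the delay after each
     * retryable failure. Non-retryable failures and the last failure are rethrown.
     */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long initialDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delayMs = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (!isRetryable(e) || attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(delayMs); // back off before the next attempt
                delayMs *= 2;
            }
        }
    }
}
```

With a predicate like this, a 500 received while recreating the datafeed would lead to another attempt instead of ending the datafeed loop, while a 400 would still fail immediately.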

SivaTharun commented 3 years ago

@symphony-elias I am new to open source contribution. Can I start working on this issue?

thibauult commented 3 years ago

No problem @SivaTharun, your contribution will be very welcome!

Do you need any help before starting?

SivaTharun commented 3 years ago

Thanks @symphony-thibault for following up. I just need to know how to reproduce the scenario above. Could you point me to a unit test that asserts on the 500 error from the DatafeedAsyncLauncherService class, or to a resource for reproducing the exception above?