appsmithorg / appsmith

Platform to build admin panels, internal tools, and dashboards. Integrates with 25+ databases and any API.
https://www.appsmith.com
Apache License 2.0

[Feature]: Implement retry mechanism for CS calls while fetching the feature flags #28622

Open abhvsn opened 10 months ago

abhvsn commented 10 months ago

Summary

We have seen a case where the license was on a paid plan, but the feature flags were still missing on the Appsmith instance because the CS call to fetch the flags failed. Slack thread: https://theappsmith.slack.com/archives/C04H5PFRN1H/p1698945473164529

This ticket proposes a retry mechanism for failed CS calls. To start with, we can make 3 retries, 1 second apart. If all of these fail, we can assume CS is down and wait for the UI to make the /features call, which triggers the get-features flow again.
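
A minimal sketch of the proposed policy (one initial attempt, then up to 3 retries with a fixed 1-second delay), written in plain Java so that it is self-contained. The class name `FeatureFlagRetry` and the method `withRetries` are hypothetical and are not taken from the Appsmith codebase. In the server's Reactor pipeline the same policy would more likely be expressed without blocking, for example with `retryWhen(Retry.fixedDelay(3, Duration.ofSeconds(1)))`; the blocking `Thread.sleep` here is only for illustration.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only; not part of the Appsmith codebase.
final class FeatureFlagRetry {

    private FeatureFlagRetry() {}

    // Runs the call once, then retries up to maxRetries times,
    // sleeping for `delay` between consecutive attempts.
    // If every attempt fails, the last failure is rethrown so the caller
    // can treat CS as down and fall back to the UI-triggered /features flow.
    static <T> T withRetries(Callable<T> call, int maxRetries, Duration delay) throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        Exception lastFailure = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                lastFailure = e;
                if (attempt < maxRetries) {
                    // Assumption: a blocking sleep is acceptable in this sketch.
                    Thread.sleep(delay.toMillis());
                }
            }
        }
        throw lastFailure;
    }
}
```

A caller would wrap the CS request, for example `FeatureFlagRetry.withRetries(() -> fetchFlagsFromCs(), 3, Duration.ofSeconds(1))`, where `fetchFlagsFromCs` is a placeholder for whatever performs the actual call.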

Also, the refresh CTA on the license and billing screen should fetch the latest flags. It is intuitive that if something is out of sync, a refresh should bring the license and flag status back in sync.

Why should this be worked on?

Feature flags are integral to delivering the business experience to end users, so we can't afford a failed CS call when fetching the flags. Although we have implemented a fallback mechanism, it still has prerequisites, such as at least the first call needing to succeed.

abhvsn commented 10 months ago

Worth looking at: https://projectreactor.io/docs/netty/release/reference/index.html#faq.connection-closed

abhvsn commented 10 months ago

More logs from when the same issue was experienced locally:

 Received error from CS while fetching features: Received error from cloud services release-cs.appsmith.com: nodename nor servname provided, or not known
com.appsmith.server.exceptions.AppsmithException: Received error from cloud services release-cs.appsmith.com: nodename nor servname provided, or not known
    at com.appsmith.server.services.ce.CacheableFeatureFlagHelperCEImpl.lambda$getRemoteFeaturesForTenant$14(CacheableFeatureFlagHelperCEImpl.java:252)
    at reactor.core.publisher.Mono.lambda$onErrorMap$27(Mono.java:3769)
    at reactor.core.publisher.Mono.lambda$onErrorResume$29(Mono.java:3859)