There is no way to configure the behavior of retries when sending a request; it will essentially retry until either the sending succeeds or the context is cancelled.
The cited problem with this approach is that callers are oblivious to scaling issues, i.e. their EH is too small for their workload and we keep swallowing "server busy" errors so they are unaware unless they manually look at metrics in the portal (perhaps there's a programmatic way to achieve this).
It could be as simple as making a retry cap available to callers. Another option (probably the preferred solution) would be to provide a way for callers to supply their own retry policy so they have complete control over how errors are handled.
There is no way to configure the behavior of retries when sending a request; it will essentially retry until either the sending succeeds or the context is cancelled. The cited problem with this approach is that callers are oblivious to scaling issues, i.e. their EH is too small for their workload and we keep swallowing "server busy" errors so they are unaware unless they manually look at metrics in the portal (perhaps there's a programmatic way to achieve this). It could be as simple as making a retry cap available to callers. Another option (probably the preferred solution) would be to provide a way for callers to supply their own retry policy so they have complete control over how errors are handled.