Open youngm opened 8 months ago
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @EldertGrootenboer.
@anuchandy please take a look
@anuchandy I tried this with the release in bom 1.2.23 and I experienced the same issue.
Digging a little deeper it seems that the command retries forever until it times out. For some reason, retry count isn't being respected.
bump please @anuchandy?
Hi @youngm, thank you for reaching out and I'm sorry for the late response.
To share some background: in 7.14.x and earlier, certain race conditions in the connection processor (the connection cache) meant that downstream connection requests did not converge, which resulted in some (not all) downstream clients observing the connection error.
Those same races in the 7.14.x connection processor affected clients (sender, receiver, processor, etc.), leading to serious recovery failures and incidents. Hence, in version 7.15.x, we revamped connection caching. For the new connection cache we picked a simple design: it keeps retrying on a "retriable AMQP-Connection error" without propagating it to downstream clients. An unreachable service endpoint can be caused either by temporary network issues or a temporarily unavailable service (more likely), or by the app using a non-existent endpoint (less likely). The underlying stack cannot distinguish these cases, so both are treated as retriable.
Many downstream clients' AMQP-Links can be hosted on the shared AMQP-Connection. Propagating an AMQP-Connection error downstream would mean that all clients back off and retry, which requires more coordination and more scheduling/timers, bringing additional overhead. So we had to make the trade-off of localizing the retriable AMQP-Connection error.
Currently, we lack a solid design that addresses your concerns without adversely affecting the common use cases that were impacted in 7.14.x :(
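To make the trade-off above concrete, here is a toy model of "retry locally, never propagate". It is only a sketch under assumptions and not the SDK's actual code: `LocalizedRetrySketch`, `RetriableConnectionException` and `connect` are hypothetical names invented for illustration. It shows why a caller that waits with a deadline sees a timeout rather than the underlying connection error.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Toy model only: all names here are hypothetical, not SDK types.
final class LocalizedRetrySketch {
    /** Marker for errors this sketch treats as retriable. */
    static final class RetriableConnectionException extends RuntimeException {
        RetriableConnectionException(String message) {
            super(message);
        }
    }

    private final ScheduledExecutorService scheduler;
    private final long backoffMillis;
    final AtomicInteger attempts = new AtomicInteger();

    LocalizedRetrySketch(ScheduledExecutorService scheduler, long backoffMillis) {
        this.scheduler = scheduler;
        this.backoffMillis = backoffMillis;
    }

    /** Completes once a connection is obtained; retriable failures never reach the caller. */
    <C> CompletableFuture<C> connect(Supplier<C> connector) {
        CompletableFuture<C> result = new CompletableFuture<>();
        attempt(connector, result);
        return result;
    }

    private <C> void attempt(Supplier<C> connector, CompletableFuture<C> result) {
        if (result.isDone()) {
            return; // the caller cancelled or the future already completed
        }
        attempts.incrementAndGet();
        try {
            result.complete(connector.get());
        } catch (RetriableConnectionException e) {
            // Retry after a fixed backoff instead of failing the caller's future.
            scheduler.schedule(() -> attempt(connector, result), backoffMillis, TimeUnit.MILLISECONDS);
        } catch (RuntimeException e) {
            // Non-retriable errors are surfaced to the caller.
            result.completeExceptionally(e);
        }
    }
}
```

In this model, a caller doing `cache.connect(...).get(timeout, unit)` against an endpoint that never becomes reachable gets a `TimeoutException`, because the retriable error is never surfaced to it.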
Thanks for the detailed response @anuchandy. This is mostly a problem for me when I'm testing credentials. Is there a way I can shorten this timeout when I do my connection test, even if the settings may be sub-optimal for regular use?
Hi @youngm, do you mean verifying whether the credential has sufficient permission to perform an operation? If the fully qualified namespace of the Service Bus is valid and we only want to check whether the credential has a certain permission (say, "Send" permission), then we should be able to do that, for example:
final String namespace = "<a-valid-namespace-name>.servicebus.windows.net";
final String policyName = "policy0"; // Policy with only "Listen" permission, NO "Send".
final String key = "<key>"; // Key for the policy "policy0", which has only "Listen" permission, NO "Send".
ServiceBusSenderClient client =
    new ServiceBusClientBuilder()
        .fullyQualifiedNamespace(namespace)
        .credential(new ServiceBusSharedKeyCredential(policyName, key))
        .sender()
        .queueName("queue0")
        .buildClient();
client.createMessageBatch();
This will throw the error -
com.azure.messaging.servicebus.ServiceBusException: Unauthorized access. 'Send' claim(s) are required to perform this operation. Resource: 'sb://<a-valid-namespace-name>.servicebus.windows.net/queue0'. TrackingId:<tracking-id>, Timestamp:<time-stamp>, errorContext[NAMESPACE: <a-valid-namespace-name>.servicebus.windows.net. ERROR CONTEXT: N/A, PATH: queue0, REFERENCE_ID: queue0, LINK_CREDIT: 0]
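If waiting out the full timeout is the main pain point during a credentials test, one option is to bound the blocking call from the caller's side. The following is a minimal sketch using only the JDK. `BoundedCheck` and `runWithTimeout` are hypothetical helper names, not part of the Service Bus SDK, and the check passed in is assumed to be something like the `createMessageBatch()` call above. It does not change the SDK's internal retry behaviour; it only stops the caller from waiting longer than the given limit.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of the Service Bus SDK.
final class BoundedCheck {
    private BoundedCheck() {
    }

    /**
     * Runs a blocking check on a daemon worker thread and waits at most {@code limit}.
     * Throws TimeoutException if the check does not finish in time; otherwise
     * returns its result or rethrows the exception the check itself raised.
     */
    static <T> T runWithTimeout(Callable<T> check, Duration limit) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "bounded-check");
            thread.setDaemon(true); // do not keep the JVM alive for a stuck check
            return thread;
        });
        Future<T> result = worker.submit(check);
        try {
            return result.get(limit.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            result.cancel(true); // interrupt the worker; the check may or may not honour it
            throw e;
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause; // e.g. the exception thrown by the check
            }
            throw e;
        } finally {
            worker.shutdownNow();
        }
    }
}
```

A call such as `BoundedCheck.runWithTimeout(client::createMessageBatch, Duration.ofSeconds(10))` would then either return, rethrow the exception raised by the check, or give up with a `TimeoutException` after ten seconds. The client should still be closed afterwards, since the abandoned call may continue retrying in the background.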
Thanks @anuchandy I'd also like to be able to validate the namespace. Let me know if you know some way to do that as well.
Hi @youngm, we can attempt namespace resolution, for example
try {
    final java.net.InetAddress ignored = java.net.InetAddress.getByName("<non-existing-namespace>.servicebus.windows.net");
} catch (java.net.UnknownHostException e) {
    System.out.println("Host resolution failed: " + e.getMessage());
}
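The resolution check above can be wrapped in a small reusable helper, optionally followed by a TCP connect with an explicit timeout. This is only a sketch: `NamespaceCheck`, `resolves` and `reachable` are hypothetical names, and probing port 5671 (AMQP over TLS) is an assumption about how the client connects in your setup. A successful result only shows that the name exists in DNS and accepts connections, not that the credentials or entity names are valid.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

// Hypothetical helper, not part of the Service Bus SDK.
final class NamespaceCheck {
    private NamespaceCheck() {
    }

    /** Returns true if the host name resolves to at least one address. */
    static boolean resolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean reachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers unresolved hosts, refused connections and connect timeouts.
            return false;
        }
    }
}
```

For example, `NamespaceCheck.resolves(namespace) && NamespaceCheck.reachable(namespace, 5671, 3000)` could run before building the client. Note that `InetAddress.getByName` has no timeout parameter of its own, so the DNS lookup is bounded only by the resolver's settings.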
Thanks @anuchandy for confirming that's as good as I can do at this point. I hope you will be able to fix this eventually. Thanks!
Describe the bug: If I create a ServiceBusClient with 7.15.x using an invalid namespace (or something similar), a synchronous call on the client fails with a timeout error instead of a ServiceBusException. With 7.14.x I immediately get a ServiceBusException.
Exception or Stack Trace The error I get is:
In the logs but it appears to be ignored by the client:
To Reproduce Try to use a client with bad connection information like a bad namespace in a synchronous call.
Code Snippet
Expected behavior A quick ServiceBusException thrown