Azure / azure-storage-node

Microsoft Azure Storage SDK for Node.js
http://azure.github.io/azure-storage-node/
Apache License 2.0

Timeouts and retry logic #119

Closed by richardhuaaa 8 years ago

richardhuaaa commented 8 years ago

I've been using the retry logic provided with the module as follows:

var Azure = require('azure-storage'); // assumed import; the original post did not show it
var retryPolicy = new Azure.ExponentialRetryPolicyFilter(/*retryCount*/ 3, /*retryInterval*/ 500, /*minRetryInterval*/ 500, /*maxRetryInterval*/ 2000);
var tableService = Azure.createTableService().withFilter(retryPolicy);

With this code, my understanding is that if a request times out, we will re-send it up to 2 extra times, each time with a longer wait period in between. I have a few questions about the finer details that I am having difficulty putting together, even after trying to look through the code (which is probably my own obtuseness):

  1. Does the 'timeout' concept refer to a network timeout or a storage timeout? In other words, when we are retrying, is it because the response took too long to return, or because the response returned with a TIMEOUT error (or either)?
  2. If it is a storage timeout, will retrying help? If it is a network timeout, won't a retry be problematic if it was a mutative operation, like an entity insertion? It would be unclear at that point if the initial one went through.
  3. What does the 'retryInterval' mean? Is it a wait period after the first request is sent (outbound), after which we assume it is timed out and issue a second? Or is it a wait period after a TIMEOUT response (inbound) to the first request? Or is it a wait period on top of a timeout parameter configured elsewhere?

Thanks so much for any help you can give!

yaxia commented 8 years ago
  1. Does the 'timeout' concept refer to a network timeout or a storage timeout? In other words, when we are retrying, is it because the response took too long to return, or because the response returned with a TIMEOUT error (or either)?

    Answer: "Timeout" could be either, but a retry only happens when the response takes too long. That timeout can be set in the timeoutIntervalInMs option; otherwise the default value is used. The retry relies on the function return value. Please look into the retry policy filter for details.

  2. If it is a storage timeout, will retrying help? If it is a network timeout, won't a retry be problematic if it was a mutative operation, like an entity insertion? It would be unclear at that point if the initial one went through.

    Answer: If I understand correctly, the network timeout you mention is the case where the operation never returns, as opposed to returning a response with an error code. In this case, you will get an error back and there is no further retry. The timeout value is set in the maximumExecutionTimeInMs option. This timeout is checked before the next retry.
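
To illustrate the point that the overall time limit is checked before the next retry, here is a minimal sketch of a retry loop with a time budget. It is not the azure-storage implementation: `runWithRetries`, its parameters, and the injected `now` clock are hypothetical names chosen for this example, and the waits between attempts are left out to keep it short and deterministic.

```javascript
// Hypothetical sketch, not the azure-storage implementation.
// `operation` returns true on success; `now` is an injectable clock so the
// example does not depend on real time.
function runWithRetries(operation, retryCount, maximumExecutionTimeInMs, now) {
  const start = now();
  let attempts = 0;
  while (true) {
    attempts += 1;
    if (operation()) {
      return { ok: true, attempts: attempts };
    }
    const retriesUsed = attempts - 1;
    // Check the overall time budget before starting another retry.
    const outOfTime = now() - start >= maximumExecutionTimeInMs;
    if (outOfTime) {
      return { ok: false, attempts: attempts, reason: 'time budget exceeded' };
    }
    if (retriesUsed >= retryCount) {
      return { ok: false, attempts: attempts, reason: 'retries exhausted' };
    }
  }
}

// Fake clock: every call to fakeNow() advances time by 400 ms.
let t = 0;
const fakeNow = () => (t += 400);

// An operation that always fails, with 3 retries allowed and a 1000 ms budget.
const result = runWithRetries(() => false, 3, 1000, fakeNow);
console.log(result);
```

In this run the time budget ends the loop before all three retries are used, which is the behaviour the answer describes: once the maximum execution time has passed, the caller gets the error and no further retry is attempted.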

  3. What does the 'retryInterval' mean? Is it a wait period after the first request is sent (outbound), after which we assume it is timed out and issue a second? Or is it a wait period after a TIMEOUT response (inbound) to the first request? Or is it a wait period on top of a timeout parameter configured elsewhere?

    Answer: In most cases, it is the wait period after a response, see here. There is an exception when the request location changes; please see the details here.
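
As a rough illustration of how a wait period that grows between retries and stays within a minimum and maximum could be computed, here is a small sketch using the values from the question (500, 500, 2000). The helper `backoffDelayMs` is hypothetical and is not part of azure-storage; the library's actual formula may differ, for example by adding randomization.

```javascript
// Hypothetical helper, not part of azure-storage: computes the wait before
// retry number `attempt` (1-based) by doubling the base interval each time
// and clamping the result to [minRetryInterval, maxRetryInterval].
function backoffDelayMs(attempt, retryInterval, minRetryInterval, maxRetryInterval) {
  const raw = retryInterval * Math.pow(2, attempt - 1);
  return Math.min(maxRetryInterval, Math.max(minRetryInterval, raw));
}

// Waits before the 1st, 2nd and 3rd retry with the values from the question.
const delays = [1, 2, 3].map(function (n) {
  return backoffDelayMs(n, 500, 500, 2000);
});
console.log(delays); // [ 500, 1000, 2000 ]
```

Under this sketch the wait is applied after a response comes back and before the next attempt is sent, which matches the "wait period after a response" reading in the answer.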