mriedem closed this issue 2 years ago
This is a response I got in Slack:
As you mention, the AWS docs say that every request is retried; yes, this is correct, and we can configure the retry count when creating a client. If there is a connection timeout, it will retry. Every client call (e.g. createBucket, listBuckets, …) is covered by this.
Note that we don't set the maxRetries config option, but I'm assuming it should be coming through the default config:
https://github.com/IBM/ibm-cos-sdk-js/blob/1.10.0/ts/config.ts#L59
Here is an example of the COS object after it's constructed from our config (in this case a local dev/test config):
{
  "config": {
    "credentials": {
      "expired": true,
      "expireTime": null,
      "refreshCallbacks": [],
      "serviceInstanceId": "<snip>",
      "accessKeyId": "<snip>",
      "httpOptions": {
        "timeout": 60000
      }
    },
    "credentialProvider": {
      "providers": [null, null, null, null],
      "resolveCallbacks": []
    },
    "region": "us-south",
    "logger": {
      "_events": {},
      "_eventsCount": 0,
      "_level": 20,
      "streams": [
        {
          "type": "stream",
          "stream": {
            "connecting": false,
            "_hadError": false,
            "_parent": null,
            "_host": null,
            "_readableState": {
              "objectMode": false,
              "highWaterMark": 16384,
              "buffer": {
                "head": null,
                "tail": null,
                "length": 0
              },
              "length": 0,
              "pipes": [],
              "flowing": null,
              "ended": false,
              "endEmitted": false,
              "reading": false,
              "constructed": true,
              "sync": true,
              "needReadable": false,
              "emittedReadable": false,
              "readableListening": false,
              "resumeScheduled": false,
              "errorEmitted": false,
              "emitClose": false,
              "autoDestroy": true,
              "destroyed": false,
              "errored": null,
              "closed": false,
              "closeEmitted": false,
              "defaultEncoding": "utf8",
              "awaitDrainWriters": null,
              "multiAwaitDrain": false,
              "readingMore": false,
              "dataEmitted": false,
              "decoder": null,
              "encoding": null,
              "readable": false
            },
            "_events": {},
            "_eventsCount": 1,
            "_writableState": {
              "objectMode": false,
              "highWaterMark": 16384,
              "finalCalled": false,
              "needDrain": false,
              "ending": false,
              "ended": false,
              "finished": false,
              "destroyed": false,
              "decodeStrings": false,
              "defaultEncoding": "utf8",
              "length": 0,
              "writing": false,
              "corked": 0,
              "sync": true,
              "bufferProcessing": false,
              "writecb": null,
              "writelen": 0,
              "afterWriteTickInfo": null,
              "buffered": [],
              "bufferedIndex": 0,
              "allBuffers": true,
              "allNoop": true,
              "pendingcb": 0,
              "constructed": true,
              "prefinished": false,
              "errorEmitted": false,
              "emitClose": false,
              "autoDestroy": true,
              "errored": null,
              "closed": false,
              "closeEmitted": false
            },
            "allowHalfOpen": false,
            "_sockname": null,
            "_pendingData": null,
            "_pendingEncoding": "",
            "server": null,
            "_server": null,
            "columns": 189,
            "rows": 37,
            "_type": "tty",
            "fd": 1,
            "_isStdio": true
          },
          "closeOnExit": false,
          "level": 20,
          "raw": false
        }
      ],
      "serializers": {},
      "src": false,
      "fields": {
        "name": "s3",
        "hostname": "osboxes",
        "pid": 2721
      },
      "haveNonRawStreams": true
    },
    "apiVersions": {},
    "apiVersion": null,
    "endpoint": "https://s3.us.cloud-object-storage.appdomain.cloud",
    "httpOptions": {
      "timeout": 60000
    },
    "maxRedirects": 10,
    "paramValidation": true,
    "sslEnabled": true,
    "s3ForcePathStyle": false,
    "s3BucketEndpoint": false,
    "s3DisableBodySigning": true,
    "computeChecksums": true,
    "convertResponseTypes": true,
    "correctClockSkew": false,
    "customUserAgent": null,
    "dynamoDbCrc32": true,
    "systemClockOffset": 0,
    "signatureVersion": "v4",
    "signatureCache": true,
    "retryDelayOptions": {
      "base": 100
    },
    "useAccelerateEndpoint": false,
    "publicEndpoint": "https://s3.us.cloud-object-storage.appdomain.cloud",
    "serviceInstanceId": "<snip>",
    "accessKeyId": "<snip>",
    "secretAccessKey": "<snip>",
    "params": {
      "Bucket": "ibmq-resultsdb-experiments-staging"
    }
  },
  "endpoint": {
    "protocol": "https:",
    "host": "s3.us.cloud-object-storage.appdomain.cloud",
    "port": 443,
    "hostname": "s3.us.cloud-object-storage.appdomain.cloud",
    "pathname": "/",
    "path": "/",
    "href": "https://s3.us.cloud-object-storage.appdomain.cloud/"
  },
  "_events": {
    "apiCallAttempt": [null],
    "apiCall": [null]
  },
  "_clientId": 1
}
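One detail in that dump worth calling out: retryDelayOptions.base is 100 ms, which feeds the SDK's exponential backoff between retries. A minimal sketch of that backoff, assuming the aws-sdk v2 formula of random jitter times 2^retryCount * base (the exact implementation in ibm-cos-sdk-js lives in lib/util.js and may differ):

```javascript
// Sketch of the exponential backoff delay that retryDelayOptions.base feeds.
// Assumption: delay = jitter * 2^retryCount * base, as in aws-sdk v2's
// util.calculateRetryDelay; verify against lib/util.js in ibm-cos-sdk-js.
function retryDelay(retryCount, base = 100) {
  return Math.random() * Math.pow(2, retryCount) * base;
}

// Worst-case (jitter = 1) delays for 3 retries with base = 100 ms:
const worstCase = [0, 1, 2].map((n) => Math.pow(2, n) * 100);
console.log(worstCase); // [ 100, 200, 400 ]
```

So with the default base, even the worst-case added delay across 3 retries is under a second, which is small compared to the 60 second timeout we are hitting.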
Based on that it doesn't look like maxRetries is being set. I guess the default config is coming from this then:
https://github.com/IBM/ibm-cos-sdk-js/blob/1.10.0/lib/config.js#L491-L492
We do override the timeout in config like this:
s3: { // https://ibm.github.io/ibm-cos-sdk-js/AWS/S3.html#constructor-property
  endpoint,
  publicEndpoint,
  serviceInstanceId: process.env.S3_SERVICE_INSTANCE_ID,
  accessKeyId: process.env.S3_ACCESS_KEY_ID,
  secretAccessKey: process.env.S3_SECRET_ACCESS_KEY,
  region: 'us-south',
  params: {
    // bind requests to a specific bucket in the service instance
    Bucket: process.env.BUCKET_NAME || 'ibmq-resultsdb-experiments-staging'
  },
  httpOptions: {
    // The upstream default is 2 minutes but the Cloudflare 524 timeout is 100 seconds,
    // so we want a default below those values but not so low that it will time out when
    // listing 1000 objects out of the bucket.
    timeout: parseInt(process.env.S3_HTTP_TIMEOUT_MS, 10) || 60000
  }
}
Regarding maxRetries, the default for putObject isn't quite clear to me from reading this. I guess we'll just set maxRetries=3 in our config to be sure it's set and we know what the value is.
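For reference, a sketch of what that could look like in the config fragment shown earlier (maxRetries is the aws-sdk v2 option name; 3 is our chosen value, not a documented default):

```javascript
// Sketch only: the existing s3 config fragment with maxRetries pinned explicitly.
s3: {
  // ...settings as above (endpoint, credentials, region, params, httpOptions)...
  maxRetries: 3 // transient failures (e.g. TimeoutError) retried up to 3 times
}
```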
Hello, every request is retried in case of the timeout errors you specified above. You can also configure how many times it retries by setting the maxRetries parameter along with the other parameters you pass when creating the client. As for your second question, on how to know whether a retry is happening: you can see this by configuring the logger parameter when creating the client.
putObject also honors the maxRetries option. Additionally, if we want to know whether a retry is happening or not, you can configure the logger parameter to get a log of the retries.
config =
  endpoint: ""
  accessKeyId: ""
  secretAccessKey: ""
  maxRetries: 5
  logger: console
I hope this clarifies your query. If your issue is resolved, please close the ticket.
Below you can see the 4 different ways of creating the client and their output. To hit a connection timeout error without maxRetries set, I used a fake endpoint along with the logger parameter.
1. without maxRetries and fake endpoint with logger enabled
config =
  endpoint: "10.0.0.0" // this is fake
  accessKeyId: ""
  secretAccessKey: ""
  logger: console
Output:
[AWS s3 undefined 4.486s 3 retries] putObject({ Bucket: '<Bucket name i used>', Key: 'foo', Body: 'bar' })
2. with maxRetries and fake endpoint with logger enabled
config =
  endpoint: "10.0.0.0" // this is fake
  accessKeyId: ""
  secretAccessKey: ""
  maxRetries: 5
  logger: console
Output:
[AWS s3 undefined 6.983s 5 retries] putObject({ Bucket: '
3. without maxRetries and correct endpoint with logger enabled
config =
  endpoint: "" // I provided the correct endpoint here
  accessKeyId: ""
  secretAccessKey: ""
  logger: console
Output:
[AWS s3 200 0.098s 0 retries] putObject({ Bucket: 'cos-presignedUrl--571550', Key: 'foo', Body: 'bar' })
4. with maxRetries and correct endpoint with logger enabled
config =
  endpoint: "" // I provided the correct endpoint here
  accessKeyId: ""
  secretAccessKey: ""
  maxRetries: 5
  logger: console
Output:
[AWS s3 200 0.1s 0 retries] putObject({ Bucket: 'cos-presignedUrl--571550', Key: 'foo', Body: 'bar' })
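Those logger lines have a consistent shape, so one way to watch retry counts in our own monitoring would be to parse them. A sketch (the format is inferred from the examples above, not a documented contract):

```javascript
// Sketch: parse the SDK logger line format seen above to extract the service,
// status, duration, and retry count. The format is an assumption based on the
// example output in this thread, not a documented contract.
function parseSdkLogLine(line) {
  const m = line.match(/^\[AWS (\S+) (\S+) ([\d.]+)s (\d+) retries\]/);
  if (!m) return null;
  return { service: m[1], status: m[2], seconds: Number(m[3]), retries: Number(m[4]) };
}

const parsed = parseSdkLogLine(
  "[AWS s3 200 0.098s 0 retries] putObject({ Bucket: 'b', Key: 'foo', Body: 'bar' })"
);
console.log(parsed); // { service: 's3', status: '200', seconds: 0.098, retries: 0 }
```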
We're using ibm-cos-sdk-js 1.10.0 and we sometimes hit a 60 second TimeoutError here https://github.com/IBM/ibm-cos-sdk-js/blob/1.10.0/lib/http/node.js#L86 when doing a putObject operation. I'm assuming this is a network hiccup and we should just retry. While looking at the code I noticed the handleRequestWithRetries utility function, which looks perfect: https://github.com/IBM/ibm-cos-sdk-js/blob/1.10.0/lib/util.js#L899. However, that only seems to be used for metadata service and IAM token requests (https://github.com/IBM/ibm-cos-sdk-js/search?q=handleRequestWithRetries) - is there some reason it wouldn't also be used when calling putObject? Or is it, and I'm just not following where these things are connected?
https://docs.aws.amazon.com/general/latest/gr/api-retries.html seems to suggest everything is retried, but I'm not sure about that given what I'm seeing above.
I'm also wondering: if a retry is happening, how would I know, since there is nothing logged that I can see? It seems like this would be a good place to log something about doing a retry: https://github.com/IBM/ibm-cos-sdk-js/blob/1.10.0/lib/util.js#L911
Here is an example stacktrace: