AzureCosmosDB / data-migration-desktop-tool

Request rate is large. #83

Open vipinsorot opened 1 year ago

vipinsorot commented 1 year ago

Error message: Data transfer failed. Microsoft.Azure.Cosmos.CosmosException: Response status code does not indicate success: TooManyRequests (429); Substatus: 3200; ActivityId: cfe3d2b9-8315-4ce4-9833-2ef94a6f7d82; Reason: (code: TooManyRequests, message: {"Errors":["Request rate is large. More Request Units may be needed, so no changes were made. Please retry this request later. Learn more: http://aka.ms/cosmosdb-error-429"]})

Source: cosmos-nosql, Sink: cosmos-nosql, release: 2.1.3

joelhulen commented 1 year ago

You can handle this in a couple of ways:

  1. Temporarily increase your Cosmos DB container (or database) scale settings by increasing the number of Request Units per second (RU/s) for the duration of your data transfer.
  2. Go to your Cosmos DB sink settings and increase the MaxRetryCount value (default 5). This will tell the Polly retry policy to wait and retry n times, based on the setting.

This is where the retry policy is defined: https://github.com/AzureCosmosDB/data-migration-desktop-tool/blob/f45805454bf824b163ee166f4982ac2994560447/Extensions/Cosmos/Cosmos.DataTransfer.CosmosExtension/CosmosDataSinkExtension.cs#L98
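For illustration only, here is a minimal sketch of that kind of Polly policy, assuming a Polly v7-style API. The parameter names mirror the tool's settings (MaxRetryCount, InitialRetryDurationMs), but this is not the tool's exact implementation:

```csharp
using System;
using System.Net;
using Microsoft.Azure.Cosmos;
using Polly;
using Polly.Retry;

public static class ThrottlingRetrySketch
{
    // Sketch of a 429-aware retry policy; names mirror the extension's settings
    // (MaxRetryCount, InitialRetryDurationMs) but this is illustrative only.
    public static AsyncRetryPolicy Create(int maxRetryCount, int initialRetryDurationMs) =>
        Policy
            // Retry only when Cosmos DB reports TooManyRequests (429).
            .Handle<CosmosException>(ex => ex.StatusCode == HttpStatusCode.TooManyRequests)
            // Back off with an increasing delay: 1x, 2x, 3x ... the initial duration.
            .WaitAndRetryAsync(
                maxRetryCount,
                attempt => TimeSpan.FromMilliseconds(initialRetryDurationMs * attempt));
}
```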

vipinsorot commented 1 year ago

@joelhulen Yup, I have already converted my collection configuration from manual to autoscale RU/s. Can you please share an example of the retry mechanism?

vipinsorot commented 1 year ago

My migration file:

{
  "Source": "cosmos-nosql",
  "Sink": "cosmos-nosql",
  "SourceSettings": {
    "ConnectionString": "AccountEndpoint=*",
    "Database": "test",
    "Container": "Invoice",
    "PartitionKeyValue": "/data/dealClaimId",
    "Query": "SELECT FROM c"
  },
  "SinkSettings": {
    "ConnectionString": "AccountEndpoint=*",
    "Database": "test",
    "Container": "Invoice",
    "BatchSize": 100,
    "MaxRetryCount": 5,
    "RecreateContainer": false,
    "ConnectionMode": "Gateway",
    "CreatedContainerMaxThroughput": 1000,
    "UseAutoscaleForCreatedContainer": true,
    "InitialRetryDurationMs": 200,
    "WriteMode": "InsertStream",
    "IsServerlessAccount": false,
    "PartitionKeyPath": "/data/dealClaimId"
  },
  "Operations": []
}

It's still failing with a similar message.

joelhulen commented 1 year ago

@vipinsorot, the retry mechanism is already implemented. You just need to configure your MaxRetryCount setting in the config for the Cosmos DB extension, as documented here: https://github.com/AzureCosmosDB/data-migration-desktop-tool/tree/main/Extensions/Cosmos#sink

Try increasing that value to something higher, like 20. As for scale, simply switching from manual to auto-scale RU/s isn't necessarily enough to overcome rate limiting in high-volume loads. You might consider increasing the max RU/s in your auto-scale settings to a much higher number only while executing the tool.

joelhulen commented 1 year ago

I just saw your last message. Increase MaxRetryCount under SinkSettings to something like 20, in addition to my notes on a potential auto-scale increase.
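For illustration, the relevant fragment of the sink configuration would look something like the sketch below; the values shown (20 retries, 200 ms initial delay) are examples, not maintainer recommendations:

```json
{
  "SinkSettings": {
    "MaxRetryCount": 20,
    "InitialRetryDurationMs": 200
  }
}
```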

vipinsorot commented 1 year ago

@joelhulen The job is exiting with the message below (screenshot attached).

vipinsorot commented 1 year ago

Updated migration JSON:

{
  "Source": "cosmos-nosql",
  "Sink": "cosmos-nosql",
  "SourceSettings": {
    "ConnectionString": "AccountEndpoint=",
    "Database": "INTEGRATION",
    "Container": "Invoice",
    "PartitionKeyValue": "/data/dealClaimId"
  },
  "SinkSettings": {
    "ConnectionString": "AccountEndpoint=",
    "Database": "INTEGRATION",
    "Container": "Invoice",
    "BatchSize": 100,
    "MaxRetryCount": 40,
    "RecreateContainer": false,
    "ConnectionMode": "Gateway",
    "CreatedContainerMaxThroughput": 1000,
    "UseAutoscaleForCreatedContainer": true,
    "InitialRetryDurationMs": 200,
    "WriteMode": "InsertStream",
    "IsServerlessAccount": false,
    "PartitionKeyPath": "/data/dealClaimId"
  },
  "Operations": [
    {
      "SourceSettings": {
        "Query": "SELECT * FROM c"
      },
      "SinkSettings": {
        "Container": "Invoice"
      }
    }
  ]
}

joelhulen commented 1 year ago

Based on the message, I assume that no data was transferred, correct? If not, did you change any other settings besides MaxRetryCount?

vipinsorot commented 1 year ago

Indeed, no data has been copied over. To tackle this, I've increased the maximum Request Units per second (RU/s) to 10,000 for the designated collection.
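(For reference, and purely as an illustration with placeholder resource names: an autoscale container's max RU/s can be raised from the Azure CLI along these lines before running the tool, then dialed back down afterwards.)

```bash
# Placeholder resource names; adjust for your own account, database, and container.
# For an autoscale container, raise the maximum RU/s:
az cosmosdb sql container throughput update \
  --resource-group my-rg \
  --account-name my-cosmos-account \
  --database-name INTEGRATION \
  --name Invoice \
  --max-throughput 10000
```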

joelhulen commented 1 year ago

Are you able to run the application in debug mode to dig into why the data isn't copying over? From the log outputs you shared, it looks like there were no errors per se, just that no data copied over. This could happen if the tool is unable to access the source data. I wonder if the rate-limiting (429) errors were coming from the source Cosmos DB database and not the destination one? Can you try scaling the source up before running the tool and see what happens?

vipinsorot commented 1 year ago

Yup, I have already increased the RU/s to 10k for both source and sink.

vipinsorot commented 1 year ago

The call at https://github.com/AzureCosmosDB/data-migration-desktop-tool/blob/f45805454bf824b163ee166f4982ac2994560447/Extensions/Cosmos/Cosmos.DataTransfer.CosmosExtension/CosmosDataSourceExtension.cs#L36 is setting feedIterator.HasMoreResults to false, which has led to fetching a total of 0 records.
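For context, a typical Cosmos SDK read loop follows the pattern sketched below (illustrative only, not the tool's exact CosmosDataSourceExtension code); if HasMoreResults goes false without any pages returning items, the query simply matched no documents under the supplied settings:

```csharp
using System.Collections.Generic;
using Microsoft.Azure.Cosmos;

public static class SourceReadSketch
{
    // Sketch of the standard FeedIterator pattern for draining a query;
    // the container and query string are assumed to be configured elsewhere.
    public static async IAsyncEnumerable<dynamic> ReadAllAsync(Container container, string query)
    {
        using FeedIterator<dynamic> feedIterator = container.GetItemQueryIterator<dynamic>(query);

        // HasMoreResults stays true until the query has been fully drained;
        // if it is false immediately and no page returns items, the query
        // matched zero documents with the supplied settings.
        while (feedIterator.HasMoreResults)
        {
            FeedResponse<dynamic> page = await feedIterator.ReadNextAsync();
            foreach (var item in page)
            {
                yield return item;
            }
        }
    }
}
```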

vipinsorot commented 1 year ago

@joelhulen It worked after refactoring the code (screenshot attached).