Azure / azure-cosmos-dotnet-v3

.NET SDK for Azure Cosmos DB for the core SQL API
MIT License
739 stars · 493 forks

Is there any way of knowing programmatically the number of physical partitions or the throughput in each partition? #1214

Closed AntonioJDios closed 4 years ago

AntonioJDios commented 4 years ago

I am trying to understand how to get the best performance. However, I noticed that each time our collection grows we can get new physical partitions. This means the throughput for each partition will decrease, because we have more physical partitions. Correct me if I'm wrong. :)

So, in order to see if I can do something programmatically, I would like to be able to know the number of partitions, or the throughput assigned to each partition. I know how to do it using the Azure portal...

thanks.

AntonioJDios commented 4 years ago

In the previous SDK I saw we had something like getPartitionKeyRange... do we have something similar in v3? Thanks.

j82w commented 4 years ago

v3 doesn't have partition key ranges. v2 exposed partition key ranges for some performance scenarios, but it caused a lot of issues. Most users didn't properly handle splits, which caused a lot of exceptions in users' production environments.

@kirankumarkolli any suggestion for v3?

AntonioJDios commented 4 years ago

Ok, I'm designing some code to adjust the number of parallel tasks based on the throughput of each partition and on statistics, so for me it would be great if I could get this throughput :)

j82w commented 4 years ago

We do realize there is a need to parallelize work. There is a PR (https://github.com/Azure/azure-cosmos-dotnet-v3/pull/1210) that enables the ability to do it for the change feed.

@kirankumarkolli is there any way to get the RUs per partition?

AntonioJDios commented 4 years ago

Thanks, as I can see in this PR

For monitoring purposes, it is also now possible to obtain which are the Partition Key Ranges that a particular FeedToken represents, through the Container.GetPartitionKeyRangesAsync method, that takes a FeedToken as parameter:

```csharp
FeedTokenIterator iterator = container.GetChangeFeedStreamIterator(someInitialFeedToken);
while (iterator.HasMoreResults)
{
    // Stream iterator returns a response with status code
    using (ResponseMessage response = await iterator.ReadNextAsync())
    {
        if (response.IsSuccessStatusCode)
        {
            // Consume response.Content stream
            IEnumerable partitionKeyRanges = await container.GetPartitionKeyRangesAsync(iterator.FeedToken);
        }
    }
}
```

So I would need a FeedToken to get the ranges for that FeedToken. My main problem is when inserting. I would like to know, before I insert, the throughput assigned to each partition. Is there something in the PR that I am missing?

thanks.


ealsur commented 4 years ago

RU/s should be equal in all physical partitions. Reference: https://docs.microsoft.com/en-us/azure/cosmos-db/set-throughput

The throughput provisioned for a container is evenly distributed among its physical partitions, and assuming a good partition key that distributes the logical partitions evenly among the physical partitions, the throughput is also distributed evenly across all the logical partitions of the container. You cannot selectively specify the throughput for logical partitions. Because one or more logical partitions of a container are hosted by a physical partition, the physical partitions belong exclusively to the container and support the throughput provisioned on the container.
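Given the even-distribution rule quoted above, the per-partition budget is just the container throughput divided by the physical partition count. A minimal arithmetic sketch (not an SDK call; as the thread discusses, the partition count itself is not exposed by the v3 SDK):

```python
def per_partition_throughput(container_rus: float, physical_partitions: int) -> float:
    """Approximate RU/s available to each physical partition, assuming
    Cosmos DB spreads container throughput evenly across partitions."""
    if physical_partitions < 1:
        raise ValueError("need at least one physical partition")
    return container_rus / physical_partitions

# e.g. a 10,000 RU/s container spread over 5 physical partitions
# leaves roughly 2,000 RU/s per partition.
```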

AntonioJDios commented 4 years ago

Hi @ealsur, yes, I know that. And this is exactly the number I want. I'm doing some experiments which look promising, but I currently have to feed my algorithm manually with the throughput assigned to each physical partition, taken from the Azure portal. I'd like to be able to obtain that number automatically, either by dividing the total throughput by the number of physical partitions, or directly. But I don't know how to get either the assigned throughput or the number of partitions.

thanks.


abhijitpai commented 4 years ago

@AntonioJDios Physical partitions are an implementation detail and their count can change at arbitrary points in time; do not depend on this in your code. If you tell us what you are trying to achieve, we can provide some alternatives.

AntonioJDios commented 4 years ago

I want to calculate the maximum number of tasks that can execute inserts in a second. Given the throughput allocated to a partition (I'm only inserting into a single partition at any moment), the time a single upsert takes, and the RU cost of that operation, I want to calculate the number of tasks I can run in parallel in that second.
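The sizing described here reduces to simple arithmetic: a task doing back-to-back upserts consumes roughly `ru_per_upsert / upsert_seconds` RU per second, so the partition's RU/s budget caps the degree of parallelism. A hypothetical helper (not part of the SDK) sketching that calculation:

```python
import math

def max_parallel_tasks(partition_rus: float, ru_per_upsert: float, upsert_seconds: float) -> int:
    """How many sequential-upsert tasks fit in one partition's RU/s budget.

    Each task performs one upsert after another, so it consumes about
    ru_per_upsert / upsert_seconds RU per second on its own.
    """
    ru_per_second_per_task = ru_per_upsert / upsert_seconds
    return math.floor(partition_rus / ru_per_second_per_task)

# e.g. 2,000 RU/s per partition, 10 RU per upsert, 20 ms per upsert:
# each task burns ~500 RU/s, so at most 4 tasks run without throttling.
```

In practice the RU cost and latency vary per document, so a real implementation would measure these from recent responses rather than use constants.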

abhijitpai commented 4 years ago

If your scenario is the same as what you mentioned in https://github.com/Azure/azure-cosmos-dotnet-v3/issues/1209, "The complete situation is we receive a message from a service bus and we generate this set of documents. Each message will be saved into a different partition key."

Maybe your goal is to ensure a time bound between the time a message is generated (approximate this as when the message is received from the service bus) and the time it is available for consumption (i.e. saved to Cosmos DB). You need to make sure the rate of processing of messages can match the rate of production, else the lag can increase continuously.

If you get up to m messages per second, and each message generates one item that takes up to k RUs to insert, then assuming the item partition key is chosen as per https://docs.microsoft.com/en-us/azure/cosmos-db/partitioning-overview#choose-partitionkey, provision the container for m × k × 1.2 RU/s (taking a 20% buffer as an example, since the distribution will not be exactly uniform). Also keep the retry policy configured on the SDK as is; that will allow a few retries to happen as a fallback for unexpected spikes in messages.

In terms of processing the request: if you are using a QueueClient as per https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-dotnet-get-started-with-queues, just create the item from the message and do await container.CreateItemAsync() on it within the message handler callback. Ensure you have a singleton client in your app as per https://docs.microsoft.com/en-us/azure/cosmos-db/performance-tips#sdk-usage.

Now, if the machines doing these requests are powerful in terms of CPU, memory and network, and each machine is writing tens of thousands of documents per second to a container that has hundreds of thousands of RUs, then initializing the client using https://docs.microsoft.com/en-us/dotnet/api/microsoft.azure.cosmos.fluent.cosmosclientbuilder.withbulkexecution?view=azure-dotnet can help improve throughput; everything else can remain the same as above.
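The provisioning rule of thumb above (m × k × 1.2) is plain arithmetic; a minimal sketch with the 20% buffer kept as a parameter, since the suggested buffer is only an example:

```python
def required_rus(messages_per_second: float, ru_per_insert: float, buffer: float = 0.2) -> float:
    """RU/s to provision so inserts keep up with message production,
    with a safety buffer for non-uniform partition key distribution."""
    return messages_per_second * ru_per_insert * (1 + buffer)

# e.g. 500 messages/s at 10 RU per insert, plus the 20% buffer -> 6,000 RU/s.
```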

AntonioJDios commented 4 years ago

Ok, thank you for all your help. Finally we have decided to increase the throughput while we have messages, and when the queue is empty we will decrease the throughput. Also, we'll try to design our system so that we can do a bulk operation with items that belong to different partition keys. I think that should help us, shouldn't it?

Just one question regarding the cost. The price that we'll pay will be for the throughput during each complete hour in which we have, at some point, set the big throughput, right? For instance, if I set the throughput to 10000 at 8:13 AM and it finishes at 9:23, would I pay two hours at that amount? Correct me if I am wrong.

thanks guys!


abhijitpai commented 4 years ago

@AntonioJDios Please look at https://docs.microsoft.com/en-us/azure/cosmos-db/understand-your-bill#billing-rate-when-throughput-on-a-container-or-database-scales-updown on the billing question.
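Per the billing document linked above, each clock hour is billed at the highest RU/s provisioned at any point during that hour, which matches the 8:13-to-9:23 guess in the question. A small sketch of that rule (my reading of the doc, not an official formula; same-day scale-up and scale-down assumed):

```python
def billed_peak_hours(up_hour: int, up_minute: int, down_hour: int, down_minute: int) -> int:
    """Clock hours billed at the peak rate when throughput is raised at
    up_hour:up_minute and lowered at down_hour:down_minute the same day.

    Billing is per hour, at the highest RU/s set during that hour, so any
    partial hour touched by the peak rate is billed fully at the peak rate.
    """
    # The hour containing the scale-down still saw the peak rate, unless
    # the scale-down happened exactly on the hour boundary.
    last_peak_hour = down_hour if down_minute > 0 else down_hour - 1
    return last_peak_hour - up_hour + 1

# Raising throughput at 8:13 and lowering it at 9:23 touches the
# 8:00-9:00 and 9:00-10:00 billing hours -> 2 hours at the peak rate.
```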

You can also look at using https://docs.microsoft.com/en-us/azure/cosmos-db/provision-throughput-autopilot if it meets your requirements.