microsoft / botframework-sdk

Bot Framework provides the most comprehensive experience for building conversation applications.
MIT License

[MCS-DE-GO] Optimize usage of CosmosDB #5997

Open dr-dolittle opened 4 years ago

dr-dolittle commented 4 years ago

When using CosmosDB as the storage layer, different types of documents are created per bot (e.g. users or conversations). Would it be possible to optimize the way they are stored for more efficient post-processing?

Here are some suggestions:

- Rethink the partition strategy: currently there is one document per partition, which seems inefficient both in how partitions are used and on the consumption side, since partitions are a great way to optimize CosmosDB usage.
- Rethink the naming of documents: the names are not predictable, so point reads are not possible. Being able to use point reads would greatly improve cost and performance.
- Add an explicit property that describes the type of the document (e.g. "user" or "conversation"); currently it has to be extracted from the id, which is highly inefficient.

Thanks

@goergenj @benbrown @sgellock

mdrichardson commented 4 years ago

I've got some thoughts specific to your points at the bottom. However, Cosmos partitioning is commonly misunderstood, so here's a quick primer to get everybody on the same page:

Key Term Clarification

For Bot Framework purposes, “container” is basically synonymous with “collection”. A container can store a single collection, graph, or table. Bot Framework uses a Cosmos container that stores a collection.

“Document” refers to an item stored in Cosmos as a whole, and not the “document” property of a state object.

Partition Keys will be referred to as PK and Partition Values as PV, to more clearly distinguish from other “keys” and “values”. If we have a document:

{ 
  "id": 1, 
  "data": "xyz", 
  "myKey": "myValue" 
} 

“myKey” is the PK and “myValue” is the PV. In Cosmos Data Explorer, it would look like:

In the right-hand data explorer document viewer:

{
  "id": 1,
  "data": "xyz",
  "myKey": "myValue"
}

and in the left-hand document selector:

id    /myKey
1     myValue
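The PK/PV relationship can be sketched in plain Python. This is a hypothetical helper, not part of any Cosmos SDK: given a container whose partition key path is `/myKey`, a document's PV is simply the value of that property.

```python
def partition_value(document: dict, partition_key_path: str):
    """Resolve a document's PV from the container's PK path (e.g. "/myKey")."""
    value = document
    # Cosmos PK paths start with "/" and may be nested (e.g. "/a/b"),
    # so walk each path segment down into the document.
    for segment in partition_key_path.lstrip("/").split("/"):
        value = value[segment]
    return value

doc = {"id": 1, "data": "xyz", "myKey": "myValue"}
print(partition_value(doc, "/myKey"))  # myValue
```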

Cosmos Partitioning with Partition Keys

Here’s a basic layout of a Cosmos DB:

(image: Cosmos DB layout diagram)

A database can have multiple containers, but generally, Bot Framework users will only use one container per bot.

Each container has multiple physical partitions. The benefit of multiple physical partitions is that the data is split up so that when querying, you have a smaller subset of data to look over and don’t have to search the entire container. CosmosDB dynamically adjusts the number of physical partitions per container based on throughput and storage needs. The user has absolutely no control over this, and it all happens on the backend.

Logical Partitions consist of sets of documents with the same PV; there is one Logical Partition for every unique PV. Every document with a PV of “myValue” always resides in the same Logical Partition, and because a Logical Partition cannot be spread across multiple Physical Partitions, every document with the same PV resides on the same Physical Partition.

Cosmos DB automatically distributes data across Physical Partitions based on PVs, so setting the right PK is important to ensure that documents get distributed such that each Physical Partition uses roughly the same throughput and storage.

In addition to making sure a Physical Partition doesn’t become a hotspot due to read/writes/storage, the right PK ensures that queries happen across Physical Partitions as rarely as possible. It’s important to think of PKs as providing load balancing benefits and cross-partition avoidance, whereas partitioning, in general, is what provides indexing and throughput benefits.

Note: The number of Logical Partitions (which are created by unique PVs) does not affect performance. Only the balance across Physical Partitions does.
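The routing described above can be illustrated with a toy hash function. This is a stand-in, not Cosmos's actual partitioning algorithm: the point is only that hashing the PV deterministically picks a physical partition, so every document sharing a PV lands on the same one.

```python
import hashlib

def physical_partition(pv: str, partition_count: int) -> int:
    """Stand-in for Cosmos's hash routing: map a PV to a physical partition."""
    digest = hashlib.sha256(pv.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count

# Nine documents spread across three PVs ("user0", "user1", "user2").
docs = [{"id": i, "myKey": f"user{i % 3}"} for i in range(9)]

placement: dict = {}
for doc in docs:
    placement.setdefault(physical_partition(doc["myKey"], 4), []).append(doc["id"])

# Every document with PV "user0" (ids 0, 3, 6) ends up in the same bucket,
# mirroring how a Logical Partition stays on one Physical Partition.
```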

Default Partition Keys and the Bot Framework

The Bot Framework never writes more than one document at a time, and it currently doesn’t query more than one document at a time either. Therefore, any single storage operation that the Bot Framework makes is not cross-partition (with the current Bot Framework Storage implementation).

So, the only PK that really matters to the Bot Framework, currently, is one that distributes the documents evenly—we don’t have to worry about cross-partition queries because we don’t query more than one document at a time.

The default PK of "id" currently makes the most sense, since we always know it and there is essentially one document per “id”, so documents distribute very evenly.

I know that, conceptually, it doesn’t seem to make sense. But again, we only query one document at a time, so there is no reason to lump documents together under the same PV. Using anything else just introduces additional code (“id” already exists on every document and doesn’t need to be modified) or lumps documents into Logical Partitions that may create hotspots.
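The evenness claim is easy to check with the same kind of stand-in hash (again, not Cosmos's real algorithm): with "id" as the PK, each unique id forms its own single-document Logical Partition, and many unique ids spread nearly uniformly across the physical partitions.

```python
import hashlib
from collections import Counter

def physical_partition(pv: str, partition_count: int = 4) -> int:
    """Stand-in for Cosmos's hash routing, not its real algorithm."""
    digest = hashlib.sha256(pv.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count

# 1000 state documents, each with a unique id used as its PV,
# i.e. 1000 one-document Logical Partitions.
ids = [f"bot/conversations/conv{i}" for i in range(1000)]
load = Counter(physical_partition(doc_id) for doc_id in ids)

# Each of the 4 physical partitions receives roughly 250 documents:
# no single partition becomes a read/write or storage hotspot.
```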



> Rethink partition strategy - currently there is one document per partition. This seems to be inefficient on the one side regarding the usage of partitions but also on the consumption side, as partitions are a great way to optimize CosmosDB usage (or not).

As explained above, since we only query one document at a time, one document per partition is optimal for current Bot Framework State Storage implementations.

I could possibly see some benefit to partitioning by user and when any query is made for the user, we pull all of their conversations (or most recent) in the same query. I'm not sure how that would work on the Bot State side...maybe instead of calling UserState.SaveChanges() and ConversationState.SaveChanges(), we allow a BotState.SaveChanges() that acts on both at the same time or something.
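A purely hypothetical sketch of that idea, with none of these names being real Bot Framework APIs: a combined save writes user and conversation state together under a single PV (the user id), so both documents share one Logical Partition and could later be fetched with a single-partition query.

```python
def save_bot_state(store: dict, user_id: str,
                   user_state: dict, conversation_state: dict) -> None:
    """Hypothetical combined BotState.SaveChanges(): write both state
    documents under one partition value (the user id)."""
    partition = store.setdefault(user_id, {})
    partition["user"] = user_state
    partition["conversation"] = conversation_state

db: dict = {}
save_bot_state(db, "user42", {"name": "Ada"}, {"turnCount": 3})
# Both documents now live under the "user42" partition value.
```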

> Rethink naming of documents - the names of the documents are not predictable and therefore point reads are not possible. Being able to use point reads would greatly improve costs and performance.

> Add an explicit property that describes the type of the document (e.g. "user" or "conversation"). Currently it is necessary to extract that from the id, which is highly inefficient.

Agreed on both. This would have backwards-compat issues that would need to be worked around.
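The point-read benefit can be shown with a toy in-memory container (illustrative only; `docs_examined` is a crude stand-in for RU charge, and this class is not the azure-cosmos SDK): a point read addresses one document directly by (PV, id), while an unpredictable name forces a query that scans many documents.

```python
class ToyContainer:
    """Toy stand-in for a Cosmos container to contrast point reads vs. queries."""

    def __init__(self):
        self._items = {}          # (pv, id) -> document
        self.docs_examined = 0    # crude stand-in for RU cost

    def upsert(self, doc: dict, pv: str) -> None:
        self._items[(pv, doc["id"])] = doc

    def read_item(self, item_id: str, partition_key: str) -> dict:
        self.docs_examined += 1   # point read: a single keyed lookup
        return self._items[(partition_key, item_id)]

    def query(self, predicate):
        self.docs_examined += len(self._items)  # scan: touches every document
        return [d for d in self._items.values() if predicate(d)]

store = ToyContainer()
for i in range(100):
    store.upsert({"id": f"doc{i}", "type": "conversation"}, pv=f"doc{i}")

store.read_item("doc7", "doc7")            # cheap: 1 document examined
store.query(lambda d: d["id"] == "doc7")   # expensive: 100 documents examined
```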

dr-dolittle commented 4 years ago

@mdrichardson thanks for the elaboration.

From my perspective there are different scenarios. On one side, those where the Bot itself interacts with CosmosDB; there it may be sufficient, or even the best approach, to go with one document per logical partition. On the other side (and this is my current focus), there may be consumers besides the Bot that want to do something with the data stored in CosmosDB (clean-up, analytics and so on). For these scenarios the current approach (partitioning, id naming, lack of a classifying property) causes issues with performance and cost.

As you agreed, the naming and the classification property are something that could be valuable in both scenarios. Do you plan to do something in that regard?

For the partitioning part it really depends on the usage scenarios and the priority of each. First and foremost there is the Bot, and the Bot working well (in terms of performance, latency, cost…), so we can probably agree that this has the highest priority. The question then is whether there are relevant scenarios, from the Bot's perspective, that would require adjustments (given the goal of optimal resource usage), as you already pointed out. Then there are the other scenarios like clean-up, analytics and so on that definitely need a different approach.

What I can think of is leveraging something like the Change Feed to create different collections with different partitioning (and by doing so also use a synthetic partition key instead of an already existing property). This way different scenarios could be covered (on the cost of overhead).
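That projection idea can be sketched as follows. The Change Feed consumer is simulated here as a plain function, and all names are illustrative rather than SDK APIs: each changed state document is copied into a second, analytics-oriented collection and stamped with a synthetic PV combining existing properties (assumed here to be `userId` and `docType`).

```python
def synthetic_pv(doc: dict) -> str:
    """Build a synthetic partition value from two existing properties."""
    return f"{doc['userId']}-{doc['docType']}"

def project_for_analytics(changed_docs, analytics_collection: dict) -> None:
    """Simulated Change Feed consumer: copy each changed document into an
    analytics collection, grouped by its synthetic PV."""
    for doc in changed_docs:
        enriched = {**doc, "pk": synthetic_pv(doc)}
        analytics_collection.setdefault(enriched["pk"], []).append(enriched)

analytics: dict = {}
feed = [
    {"id": "user1/conv1", "userId": "user1", "docType": "conversation"},
    {"id": "user1", "userId": "user1", "docType": "user"},
]
project_for_analytics(feed, analytics)
# analytics now groups documents under "user1-conversation", "user1-user", ...
```

The trade-off mentioned above applies: the second collection doubles storage and adds processing overhead, in exchange for a partitioning scheme that fits the analytics or clean-up access pattern.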

Another interesting approach for doing analytics is using the integration with Azure Synapse.

What do you think?

mdrichardson commented 4 years ago

@dr-dolittle I should have probably prefaced my first reply with a statement that I'm not really in a position to make decisions on your recommendations. I've just done a lot of research on Cosmos partitioning and wanted to make sure that anybody who is involved in decisions related to this is on the same page. As far as Cosmos goes, partitioning is the only topic I know deeply enough to be comfortable commenting on.

That being said, I think id makes the most sense as a default partitionKey. For developers who want to enact clean-up, analytics, etc. scenarios, we do offer the ability to set a custom partitionKey (at least in the SDK; I'm not sure about Composer). Although, even with that ability, it isn't particularly easy to set it to anything other than id for Bot State documents.

@cwhitten I should clarify that my comments on this shouldn't be taken as me assigning myself this issue. Just providing additional context.

sgellock commented 4 years ago

Thanks @mdrichardson

@carlosscastro @johnataylor can you weigh in please

cwhitten commented 3 years ago

ping @johnataylor @carlosscastro @sgellock

sgellock commented 3 years ago

tagged it to R12 so we can ensure it gets addressed

carlosscastro commented 3 years ago

Moving to re-assess priority in R13.

compulim commented 3 years ago

R13 planning should start next week. DRI, please look at this one next week.

carlosscastro commented 3 years ago

This item didn't make it during R13 because of the short release and //BUILD focus on other user stories. we'll re-prioritize in R14 planning.

carlosscastro commented 3 years ago

@johnataylor FYI in case you want to re-prioritize. As things stand right now, we cannot resource this in R14, so I'll move it to the next milestone.