Open Mortana89 opened 5 years ago
This best practice must be new. It definately wasn't the recommendation before. I've personally seen scenarios where setting this to -1 can cause some serious performance issues.. I am willing to change that but if you change the pageSize to -1 but I will also need to benchmark the behavior. Can you give me a hand by branching Cosmonaut and testing it with the -1 pageSize?
I've done some tests. It's difficult to say and surely depends upon scenario:
MaxItemCount | 100 | Items | 1665 | |
---|---|---|---|---|
Call | Time (ms) | Items | RU | |
1 | 331 | 100 | 14,37 | |
2 | 260 | 100 | 14,78 | |
3 | 529 | 100 | 12,61 | |
4 | 270 | 100 | 13,85 | |
5 | 211 | 100 | 12,08 | |
6 | 182 | 100 | 12,13 | |
7 | 182 | 100 | 12,05 | |
8 | 250 | 100 | 11,93 | |
9 | 197 | 100 | 11,76 | |
10 | 459 | 100 | 12,29 | |
11 | 259 | 100 | 12,47 | |
12 | 192 | 100 | 12,46 | |
13 | 314 | 100 | 12,18 | |
14 | 207 | 100 | 12,52 | |
15 | 257 | 100 | 12,48 | |
16 | 257 | 100 | 12,26 | |
17 | 324 | 65 | 8,94 | |
4681 | 1665 | 211,16 | ||
Spread out RU (Total RU / time in seconds) | 45,11002 | |||
MaxItemCount | -1 | |||
1 | 694 | 1665 | 167,48 |
The advantage of using fixed bucket size is of course the spread in RU consumption. When using the maxitemcount of -1, you'll get bigger RU spikes which could also cause performance issues if the RU spike is higher than the available RU on the collection, which causes throttling etc etc...
An advantage of using the second approach, is the network call time adds up when a user would request many records in a listpage. But in our case it would take more than 300 items before it gets interesting to collect more data at once.
PS; those are local development timings ;)
Thanks for this @Mortana89.
When using the maxitemcount of -1, you'll get bigger RU spikes which could also cause performance issues if the RU spike is lower than the available RU on the collection, which causes throttling etc etc
This is the reason why I disabled that in the first place. I could remove the -1 limitation to allow for this usecase but I am not sure if that's the right approach to this. I will re-wire the page extensions to use the added Skip and Take support in the SDK so I think that this whole issue will probably go away.
I will put this on hold until I settle on the approach.
Currently Cosmonaut does not allow fetching data with pagination without defining an initial page size (as first request). This is against a recommendation we received from Cosmos DB:
You are using the query page size of 100 for queries for your Azure Cosmos DB container. We recommend using a page size of -1 for faster scans
Upon following the following article: https://docs.microsoft.com/nl-nl/azure/cosmos-db/sql-api-query-metrics#max-item-count
It's clear that it's better to specify -1 as our max_item_count for the first fetch, so Cosmos DB can determine what's the max response set. Further down the road we can then use continuation token and follow normal path.
However, the SDK currently does not support specifying -1 due to the following lines: https://github.com/Elfocrash/Cosmonaut/blob/2619335d2a94271158ee52dcaef55bce002978b6/src/Cosmonaut/Extensions/PaginationExtensions.cs#L26
https://github.com/Elfocrash/Cosmonaut/blob/2619335d2a94271158ee52dcaef55bce002978b6/src/Cosmonaut/Extensions/PaginationExtensions.cs#L45
Can these be changed to also support -1 as page size, as such cosmos DB can determine the max size?