Azure / azure-cosmos-dotnet-v3

.NET SDK for Azure Cosmos DB for the core SQL API
MIT License

Small ResponseContinuationTokenLimitInKb leads to query execution failing with a 400 status code. #3791

Closed blankor1 closed 1 year ago

blankor1 commented 1 year ago

Describe the bug Reading through this doc: https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure.documents.client.feedoptions.responsecontinuationtokenlimitinkb?view=azure-dotnet#remarks

It says this ResponseContinuationTokenLimitInKb can be >= 0. So I set it to 0, and the request always fails with an exception: "Errors":["The continuation token limit specified is not large enough to serialize the required attributes into the continuation token. Please provide a higher limit."]

This behavior kind of makes sense, since a token can't be compressed down to nothing. But maybe the documentation could point this out to avoid being misleading.

But my question is: will this exception also happen when I set this value to 1 but the actual token string length is above 10,000 characters? I heard the maximum size can grow to 16 KB. I can't repro a 10,000-character token locally, but such tokens do exist in our PROD env. Is it safe to set this limit to 1, or a higher value like 4, and expect no failures for this token-size reason?

And by the way, from your perspective, is it safe to pass this continuationToken directly (without mapping or encryption) to end users? For example, could they tell from the token string that we are using Cosmos DB as our backend database, and do something harmful with it?

Looking forward to hearing from you. Thanks!

To Reproduce Execute a query with QueryRequestOptions.ResponseContinuationTokenLimitInKb = 0;
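A minimal repro sketch (assumes an existing `Container` instance named `container`; the query text is illustrative):

```csharp
using Microsoft.Azure.Cosmos;

// Sketch: 0 is accepted by the SDK, but the service rejects it with the 400 above.
QueryRequestOptions options = new QueryRequestOptions
{
    ResponseContinuationTokenLimitInKb = 0
};

using FeedIterator<dynamic> iterator = container.GetItemQueryIterator<dynamic>(
    "SELECT * FROM c",
    requestOptions: options);

while (iterator.HasMoreResults)
{
    // Throws CosmosException with StatusCode 400 when the limit is too small.
    FeedResponse<dynamic> page = await iterator.ReadNextAsync();
}
```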

Expected behavior The query succeeds.

Actual behavior Fails with a 400 exception: "Errors":["The continuation token limit specified is not large enough to serialize the required attributes into the continuation token. Please provide a higher limit."]

Environment summary SDK Version: tested on 3.31.0 and the latest version; same behavior. OS Version: Windows 11

ealsur commented 1 year ago

@blankor1 thanks for the feedback, @jcocchi / @neildsh can we make the documentation around this property a bit more clear?

Regarding the expectations though, @blankor1, could you explain why you expect a query with 0 KB of allowed continuation to succeed? How would the query communicate the continuation for the next pages?

blankor1 commented 1 year ago

@ealsur Thank you! It's just that the documentation says this value can be greater than or equal to 0, and I wanted to test whether I can make the limit smaller than 1 KB. If 0 will always fail the query execution, maybe we can consider excluding zero from the doc description.

I have this question because we have very large continuation tokens in our PROD env, and we now want to apply this size limit in our library: we pass this value to the frontend for pagination, and a token that is too long may fail the request. So there are two things I really want to make sure of:

1. Will this length-too-small exception also happen when I set this value to 1 but the actual token string length is above 10,000 characters or more?
2. Is it safe to pass this continuationToken directly (without mapping or encryption) to end users? For example, could they tell from the token string that we are using Cosmos DB as our backend database, and do something harmful with it?

Do you have any suggestions or best practices for how we should use continuation tokens and this ResponseContinuationTokenLimitInKb?
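As an aside on the second question: one common pattern (not an SDK feature, just a hedged sketch assuming .NET 6+ and a server-managed key) is to encrypt the raw token server-side before returning it to clients, so it stays opaque:

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: AES-encrypts the raw Cosmos continuation token so end
// users only ever see an opaque string, and decrypts it on the way back in.
public static class OpaqueToken
{
    // In practice, load this from a secret store rather than generating it here.
    private static readonly byte[] Key = RandomNumberGenerator.GetBytes(32);

    public static string Protect(string continuationToken)
    {
        using Aes aes = Aes.Create();
        aes.Key = Key;
        byte[] cipher = aes.EncryptCbc(Encoding.UTF8.GetBytes(continuationToken), aes.IV);
        // Prepend the IV so Unprotect can recover it.
        return Convert.ToBase64String(aes.IV.Concat(cipher).ToArray());
    }

    public static string Unprotect(string opaque)
    {
        byte[] blob = Convert.FromBase64String(opaque);
        using Aes aes = Aes.Create();
        aes.Key = Key;
        // First 16 bytes are the IV, the rest is ciphertext.
        return Encoding.UTF8.GetString(aes.DecryptCbc(blob[16..], blob[..16]));
    }
}
```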

ealsur commented 1 year ago

Thanks for providing context. I will leave the recommendations to the other folks on the thread. My only comment is regarding this:

I want to test if I can make this limit smaller than 1Kb

Since the property is named ResponseContinuationTokenLimitInKb, you won't be able to make the limit smaller than 1 KB: assuming 0 is an invalid value, there is no integer between 0 and 1 you can use (the property is an int).

neildsh commented 1 year ago

Hi @blankor1, the continuation token is used to serialize data that is then used for resuming the query. If you reduce the size of the continuation token, then all the work we did to calculate this data (such as the filter posting set from the index, etc.) will have to be redone on each subsequent round trip. This will make your queries slower and more expensive. In general, we do not recommend setting this property; just rely on the default instead.

blankor1 commented 1 year ago

@neildsh Thanks for your explanation. Let me clarify: for us, some extra cost and a performance penalty are acceptable. But it would be unacceptable if the query fails because the token is too long for the (non-zero) limit we impose. So we just want to make sure that can't happen.

neildsh commented 1 year ago

My apologies for the late response @blankor1. This is correct. Smaller values should not cause failures, but we really would recommend against setting small values here. I would recommend running tests with workloads that are representative of your production environment, and logging the size of the continuation tokens generated. You can use that to calibrate the value of this setting.

For very expensive queries (for example ORDER BY, where the backend has to scan a large number of index pages), small values of the continuation token size can lead to your query hitting edge cases where you are spending a lot of time on extra round trips.
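The calibration approach suggested above could look roughly like this (a sketch, assuming an existing `Container` named `container` and a representative query of your own):

```csharp
using System;
using System.Text;
using Microsoft.Azure.Cosmos;

// Sketch: page through a representative query, log each continuation token's
// size, then pick a ResponseContinuationTokenLimitInKb above the observed max.
int maxTokenBytes = 0;

using FeedIterator<dynamic> iterator = container.GetItemQueryIterator<dynamic>(
    "SELECT * FROM c ORDER BY c._ts",
    requestOptions: new QueryRequestOptions { MaxItemCount = 100 });

while (iterator.HasMoreResults)
{
    FeedResponse<dynamic> page = await iterator.ReadNextAsync();
    if (page.ContinuationToken != null)
    {
        int size = Encoding.UTF8.GetByteCount(page.ContinuationToken);
        maxTokenBytes = Math.Max(maxTokenBytes, size);
        Console.WriteLine($"Continuation token: {size} bytes");
    }
}

Console.WriteLine($"Largest observed token: {maxTokenBytes} bytes");
```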