Azure / azure-cosmos-dotnet-v2

Contains samples and utilities relating to the Azure Cosmos DB .NET SDK
MIT License
579 stars 839 forks source link

Azure CosmosDB upgrade causes “Distict query requires a matching order by in order to return a continuation token” error #674

Open beachard opened 5 years ago

beachard commented 5 years ago

We are running a production app on Azure CosmosDB for at least a year and this morning on Febuary 6th 2019 arround 10 AM (Eastern time -5) our application started to return HTTP 500 errors for no reason.

After exhaustive investigation, we found out that the error is coming from the Azure CosmosDB SQL .NET API ver 2.2.2 Library returning the following error :

Distict query requires a matching order by in order to return a continuation token

No i did not do a typo with the word "Distict" instead of "Distinct"; Microsoft did ! It is a cut and paste from the error.

Here is our query :

 SELECT DISTINCT VALUE c.city_sm 
 FROM c  
 WHERE c.partition = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'
 AND c.dataview_id = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'   
 AND c.dataview_ver = 260   
 AND c.type = 'city_sm'  
 AND c.culture = 'fr'

On the .NET side we are using the following Client.FeedOptions and code :

Client.FeedOptions.MaxItemCount = 100

' Query the list of items
oQuery = BuildSearchSqlQuerySpec(Of ItemType)(Account, DataView, SearchParams)
oDDBResult = DDClient.CreateDocumentQuery(Of ItemType)(
Client.UriFactory.CreateDocumentCollectionUri(Core.Config.PublishDB.Database, Core.Config.PublishDB.Collection), 
                    oQuery, 
                    oOptions).AsDocumentQuery()

oResponse = oDDBResult.ExecuteNextAsync(Of ItemType).Result

And after the query we try to get the Continuation Token like this :

If (oDDBResult.HasMoreResults) Then
    oResult.next_token = oResponse.ResponseContinuation
End If

This is when the error occure and it seams to be returned by the SQL Library, not the AZURE server since i do not have any error on Azure.

I tried adding ORDER BY c.city_sm to my query to fix the issue. I had no more errors but the query was returning no records probably because i did not have a range index on the c.city_sm field. This field is a subtree of many other properties so i guess it cannot be ordered by ?? So to be quick and fix the production app down issue, i simply removed the DISTINCT keyword and it worked.

How can they change something on the AZURE production server that breaks existing code ? I don't know how to reach out Microsoft and i am wonderring if anybody else experienced issues on that Febuary 6th date or any other date if they deploy new version in a time range ?

beachard commented 5 years ago

I did more investigation with Fiddler to find out what happens under the hood and i found out that the error is throwned by the .NET Client (v2.2.1 and v2.2.2) but probably because of a server behavior change since the same code use to work for 6 month with the same client library version and one day it stopped working without us doing anything.

Here are my findings :

When i fire the ExecuteNextAsync function, the call is made to CosmosDB and i am getting all the necessary data from the collection in the response and response HEADERS are even containing the x-ms-continuation header with a value. This means that my query worked and the server returned a continuation token.

When i debug my code and step into it until i try to read the ResponseContinuation property of the result variable; this is when the exception "Distict query requires a matching order by in order to return a continuation token" is throwned and no calls are made to the CosmosDB server. So it is clear it is an internal error in the library.

I noticed that when calling the ExecuteNextAsync method for the first time i see in Fiddler that 2 calls are made to the CosmosDB server where the only differemnce is that the second call contains a x-ms-continuation: header with a value while the first one is empty. But both call return the complete JSON data with all my 100 items and the same continuation token :

Call #1 Request : image Response : image

Call #2 Request : image Response : image

There is got to be a change in the server behavior that caused the .NET Client ver 2.2.1 to start throwing those errors this week after six month running without this problem. The error says that a DISTINCT query must contain an ORDER by clause matching the SELECT but the query without the ORDER BY is working perfectly in the Azure Data Explorer and according the the response i see in Fiddler, it is also working fine !

For those who are wondering; yes i tried adding the ORDER BY but since c.linsting_sm is a complex object i get no error but i have 0 items in my result. And if i try to use ORDER BY c.listing_sm.id for example, i get back the same error because the ORDER do not match the SELECT.

Thanks for looking into this !

tompostler commented 5 years ago

Just noticed this morning that this is affecting me as well on the Microsoft.Azure.DocumentDB package version 2.1.3.

kaiza commented 5 years ago

This is also affecting us on both versions 2.1.3 and 2.2.2 of the package.

kirankumarkolli commented 5 years ago

@bchong95 can you please look into this?

vyarymovych commented 5 years ago

Are there any updates on it? This is very very strange! I have existing project that works great with version="2.1.3". I created new project, installed package of the same version and got the same error when running identical query against the same database!

bchong95 commented 5 years ago

Cosmos DB recently introduced DISTINCT to the query grammar. Latest versions of the .net SDK have the right logic to consolidate query results for distinct queries. In the first phase of the rollout of the feature, older versions of the SDK inadvertently allowed DISTINCT queries in gateway mode. While it is safe to rely on the older SDK behavior for some queries (e.g. when the results fit within a page), it may result in unexpected behavior when the distinct values need to be summarized across query pages. In order to clarify this behavior and prevent accidental misuse of this feature. This explains why you saw a change in behavior without any code change.

As for the error message we are unable to serve a DISTINCT query without a matching order through continuation tokens due to the amount of state that needs to be serialized. In the future we are allowing objects and arrays to be emitted in the ORDER BY, so this should unblock your use case.

Sincere apologies for the inconvenience.

tompostler commented 5 years ago

@bchong95, is there any kind of estimate for this?

In the future we are allowing objects and arrays to be emitted in the ORDER BY, so this should unblock your use case.

I'd like to know if I should revisit this often with anticipation, or forget about it until I find my //TODO linking back here 😉

swangers commented 4 years ago

Same issue my query is super simple "Select Distinct c.memberId from c" this fails saying Distinct queries require order by clause with continuation token. This "Select Distinct c.memberId order by c.memberId" also fails. Using .net Core 3.1 with new Cosmos SDK 3.8.

aliquid commented 4 years ago

In the future we are allowing objects and arrays to be emitted in the ORDER BY, so this should unblock your use case.

@bchong95 - any estimate on this?

Nikolay-Uvarov commented 4 years ago

We have the same issue in the powerapps. We are using COSMOS DB Query documents V3(V2) action in the power automate(flow).

cfindlayvcs commented 3 years ago

@bchong95 why does the property ContinuationToken throw an exception?

Why not just return null?

bchong95 commented 3 years ago

Because null is the same as "no more results". We need to keep the signals different.

On Wed, Dec 2, 2020, 2:35 PM cfindlayvcs notifications@github.com wrote:

@bchong95 https://github.com/bchong95 why does the property ContinuationToken throw an exception?

Why not just return null?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Azure/azure-cosmos-dotnet-v2/issues/674#issuecomment-737537426, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABBNUNA63HRIHZ7FKFW3NUTSS26MXANCNFSM4GVRDUAA .

mentorfloat commented 2 years ago

Any updates on this? Whenever I add DISTINCT in the query, it'll still throw out an error.

SELECT DISTINCT LEFT(c.Start, INDEX_OF(c.Start, " ")) AS Date FROM c
[api] [2022-08-28T02:17:27.773Z] System.Private.CoreLib: Exception while executing function: GetDatesOnly. Microsoft.Azure.WebJobs.Host: Exception binding parameter 'data'. Microsoft.Azure.DocumentDB.Core: Distict query requires a matching order by in order to return a continuation token.
[api] [2022-08-28T02:17:27.773Z] If you would like to serve this query through continuation tokens, then please rewrite the query in the form 'SELECT DISTINCT VALUE c.blah FROM c ORDER BY c.blah' and please make sure that there is a range index on 'c.blah'.