Azure / azure-cosmos-dotnet-v2

Contains samples and utilities relating to the Azure Cosmos DB .NET SDK
MIT License
577 stars 837 forks source link

Query with expression breaks long number precision #293

Open gongdo opened 7 years ago

gongdo commented 7 years ago

I'm using Microsoft.Azure.DocumentDB.Core 1.3.2 as a .NET Core client.

I have a document in the DocumentDb Collection like this:

{
  "id": "1",
  "Int32": -169086359,
  "Int64": 636341866924868914
}

And I want to get the document as this type:

public class Foo
{
  public string Id { get; set; }
  public int Int32 { get; set; }
  public long Int64 { get; set; }
}

When I query documents without sql expression, it returns the document as I expected. However, when I query documents with sql expression - both sqlExpression and Linq query, it breaks long number's precision. I tried like this:

var client = GetMyClient();
var documentId = "1";
var uri = UriFactory.CreateDocumentCollectionUri("myDatabase", "myCollection");
           var r1 = client
                .CreateDocumentQuery<Foo>(uri)
                .AsEnumerable()
                .FirstOrDefault();
            var r2 = (await client
                    .CreateDocumentQuery(uri, new FeedOptions
                    {
                        EnableCrossPartitionQuery = true
                    })
                    .AsDocumentQuery()
                    .ExecuteNextAsync<Foo>())
                    .FirstOrDefault();
            var r3 = (await client
                    .CreateDocumentQuery(uri, new FeedOptions
                    {
                        EnableCrossPartitionQuery = true
                    })
                    .Where(d => d.Id == documentId)
                    .AsDocumentQuery()
                    .ExecuteNextAsync<Foo>())
                    .FirstOrDefault();
            var r4 = (await client
                    .CreateDocumentQuery(
                        uri, 
                        $"SELECT * FROM s WHERE s.id = '{documentId}'",
                        new FeedOptions
                        {
                            EnableCrossPartitionQuery = true
                        })
                    .AsDocumentQuery()
                    .ExecuteNextAsync<Foo>())
                    .FirstOrDefault();
            var r5 = (await client
                    .CreateDocumentQuery(
                        uri,
                        $"SELECT * FROM s",
                        new FeedOptions
                        {
                            EnableCrossPartitionQuery = true
                        })
                    .AsDocumentQuery()
                    .ExecuteNextAsync<Foo>())
                    .FirstOrDefault();

And results are:

//r1: 636341866924868914    // plain query, sync
//r2: 636341866924868914    // plain query, async, allow cross-partition
//r3: 636341866924868864    // Linq Where expression, async, allow cross-partition
                   //^^^ watch here!
//r4: 636341866924868864    // sqlExpression with WHERE claue, async, allow cross-partition
//r5: 636341866924868864    // sqlExpression without WHERE claue, async, allow cross-partition

The query processor seems have a glitch for dealing with long number conversion, only when query with any Linq expression or direct sql expression. sync, async and cross-partition option are not cause of this issue.

Even worse thing is, if I receive the long number property with object type, then the client treats the long number as double type.

public class Foo
{
  public string Id { get; set; }
  public int Int32 { get; set; }
  public object Int64 { get; set; }    // <-- sometimes I need to do this, e.g.) Dictionary<string, object>
}

// and do the same code:

//r1: 636341866924868914
//r2: 636341866924868914
//r3: 6.363418669248E+17
//r3: 6.363418669248E+17
//r3: 6.363418669248E+17

Notice, there's no issue when query without any query expressions. That's why I believe this is a bug in the client processing.

It looks like https://github.com/Azure/azure-documentdb-dotnet/issues/151 issue, but I'm not sure why that is happened. And there's no workaround here.

chriswill commented 7 years ago

Is there any status on this issue, or #288?

I'm not sure if you're perceiving this issue in the same way that I do, but doesn't this issue mean that DocumentDb (or at least this client) is fundamentally broken until this is fixed? I have data that I can't store in DocumentDb, because when I go to retrieve it I get a long value for an ID that has no meaning to any other record. The data may be "correct" in the cloud store, but by the time it's retrieved via query that's not the case. The data is meaningless.

If this was my product, my reaction would be "OMFG" and I would be doing whatever I could to resolve it. Is there some workaround I'm not seeing, other than "don't store long values"?

kirankumarkolli commented 7 years ago

@arramac could you please response to this thread.

arramac commented 7 years ago

@chriswill, sorry about the trouble you've been having. The DocumentDB type system supports only IEEE754 numbers (which are floating point). main reason we haven't introduced Int64 is that there is no high-fidelity way to support it with the Json/JavaScript query type system (details here https://docs.microsoft.com/en-us/azure/cosmos-db/documentdb-sql-query-reference)

However, we understand that richer types like Int64 (and DateTime, Guid, etc.) are necessary for a wide range of applications. We support these types in other access APIs of Cosmos DB, and are planning to introduce them in the DocumentDB API as an extended-JSON wire format and serializer/de-serializer.

Meanwhile, please note the lower precision is only for query processing, not in storage. For query processing, as a workaround you can store the numbers as strings or as hex strings (for range comparison).

If you have more questions, please reach out to us at AskCosmosDB@microsoft.com, and we can discuss over a call.