Azure / Azurite

A lightweight server clone of Azure Storage that simulates most of the commands supported by it with minimal dependencies
MIT License

[Table Storage] Azurite filters not matching production filters for null values #2342

Open pizerg opened 8 months ago

pizerg commented 8 months ago

Which service(blob, file, queue, table) does this issue concern?

Table

Where do you get Azurite? (npm, DockerHub, NuGet, Visual Studio Code Extension)

Visual Studio Professional 2022 (v 17.8.4)

What problem was encountered?

When querying on properties that may hold null values, the same filter returns different results on Azurite and on real Azure Storage.

The following filter excludes null values in Azure Storage but not in Azurite: [PROPERTY_NAME] lt ''

The following filter excludes null values in Azurite but not in Azure Storage: [PROPERTY_NAME] gt ''

Steps to reproduce the issue?

Create a table, add two entities with a few properties, leave one property as NULL in one of the entities, and execute a query on that property with the filters described above.

blueww commented 8 months ago

@pizerg

Thanks for raising this issue!

Could you please provide detailed repro steps, ideally with an Azurite debug log? For example, how do you create an entity with null properties?

pizerg commented 8 months ago

@blueww

You can use the following entity (C#). Set Foo to null in some of the entities you insert for the test, then query with Foo lt '' and Foo gt '' to see the different results from Azurite and Azure Table Storage:

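using System;
using Azure;
using Azure.Data.Tables;

public class TestEntity : ITableEntity
{
    // Members required by ITableEntity
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
    public ETag ETag { get; set; }
    public DateTimeOffset? Timestamp { get; set; }

    // Nullable custom properties; leave Foo as null on some entities to reproduce the issue
    public double? Foo { get; set; }
    public double? Bar { get; set; }
}

blueww commented 8 months ago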

@pizerg

I tried to repro your issue, but I couldn't. In my test, both Azurite and the production Azure Storage server filter out null values with foo gt ''. And foo lt '' does not work on either Azurite or the production Azure Storage server: the result is always empty with this filter.

The following is the network trace of my test against the real production Azure Storage server. You can see that foo gt '' works, which differs from your description.

List all table entities. You can see that 2 of them have foo as null:

GET https://accountname.table.core.windows.net/testtable?$top=20&$select=RowKey%2Cfoo%2Cfoo2%2Cuserid%2CPartitionKey%2CTimestamp HTTP/1.1
Accept-Charset: UTF-8
MaxDataServiceVersion: 3.0;NetFx
Accept: application/json; odata=minimalmetadata
DataServiceVersion: 3.0;
x-ms-client-request-id: 3efc1636-2d7b-47ea-87b4-f7fd43035809
User-Agent: Azure-Cosmos-Table/1.0.8 (.NET CLR 4.0.30319.42000; Win32NT 10.0.22631.0)
x-ms-version: 2017-07-29
x-ms-date: Thu, 18 Jan 2024 08:00:25 GMT
Authorization: SharedKey accountname:[hidden]
Host: accountname.table.core.windows.net

HTTP/1.1 200 OK
Cache-Control: no-cache
Transfer-Encoding: chunked
Content-Type: application/json;odata=minimalmetadata;streaming=true;charset=utf-8
Server: Windows-Azure-Table/1.0 Microsoft-HTTPAPI/2.0
x-ms-request-id: f1da65bd-4002-0068-1be4-49bc52000000
x-ms-version: 2017-07-29
X-Content-Type-Options: nosniff
Date: Thu, 18 Jan 2024 08:00:59 GMT

2C4
{"odata.metadata":"https://accountname.table.core.windows.net/$metadata#testtable&$select=RowKey,foo,foo2,userid,PartitionKey,Timestamp","value":[{"odata.etag":"W/\"datetime'2024-01-18T07%3A42%3A04.5721829Z'\"","PartitionKey":"partition1","RowKey":"1","Timestamp":"2024-01-18T07:42:04.5721829Z","foo":null,"foo2":null,"userid":null},{"odata.etag":"W/\"datetime'2024-01-18T07%3A41%3A15.9849281Z'\"","PartitionKey":"partition1","RowKey":"2","Timestamp":"2024-01-18T07:41:15.9849281Z","foo":"TestVal","foo2":null,"userid":null},{"odata.etag":"W/\"datetime'2024-01-18T07%3A49%3A52.3712301Z'\"","PartitionKey":"partition1","RowKey":"3","Timestamp":"2024-01-18T07:49:52.3712301Z","foo":null,"foo2":null,"userid":null}]}
0

List with filter foo gt ''. Only the entity whose foo is not null is returned:


------------------------------------------------------------------
GET https://accountname.table.core.windows.net/testtable?$filter=foo%20gt%20%27%27&$top=20&$select=RowKey%2Cfoo%2Cfoo2%2Cuserid%2CPartitionKey%2CTimestamp HTTP/1.1
Accept-Charset: UTF-8
MaxDataServiceVersion: 3.0;NetFx
Accept: application/json; odata=minimalmetadata
DataServiceVersion: 3.0;
x-ms-client-request-id: d750ac7d-1086-4dfe-9d35-4bbae9c79d60
User-Agent: Azure-Cosmos-Table/1.0.8 (.NET CLR 4.0.30319.42000; Win32NT 10.0.22631.0)
x-ms-version: 2017-07-29
x-ms-date: Thu, 18 Jan 2024 08:00:29 GMT
Authorization: SharedKey accountname:[hidden]
Host: accountname.table.core.windows.net

HTTP/1.1 200 OK
Cache-Control: no-cache
Transfer-Encoding: chunked
Content-Type: application/json;odata=minimalmetadata;streaming=true;charset=utf-8
Server: Windows-Azure-Table/1.0 Microsoft-HTTPAPI/2.0
x-ms-request-id: f1da6644-4002-0068-0ae4-49bc52000000
x-ms-version: 2017-07-29
X-Content-Type-Options: nosniff
Date: Thu, 18 Jan 2024 08:01:03 GMT

14E
{"odata.metadata":"https://accountname.table.core.windows.net/$metadata#testtable&$select=RowKey,foo,foo2,userid,PartitionKey,Timestamp","value":[{"odata.etag":"W/\"datetime'2024-01-18T07%3A41%3A15.9849281Z'\"","PartitionKey":"partition1","RowKey":"2","Timestamp":"2024-01-18T07:41:15.9849281Z","foo":"TestVal","foo2":null,"userid":null}]}
0

List with filter foo lt ''. No entity is returned:


------------------------------------------------------------------
GET https://accountname.table.core.windows.net/testtable?$filter=foo%20lt%20%27%27&$top=20&$select=RowKey%2Cfoo%2Cfoo2%2Cuserid%2CPartitionKey%2CTimestamp HTTP/1.1
Accept-Charset: UTF-8
MaxDataServiceVersion: 3.0;NetFx
Accept: application/json; odata=minimalmetadata
DataServiceVersion: 3.0;
x-ms-client-request-id: b9d7bf88-45a4-49d0-9cb0-e3cc9c579940
User-Agent: Azure-Cosmos-Table/1.0.8 (.NET CLR 4.0.30319.42000; Win32NT 10.0.22631.0)
x-ms-version: 2017-07-29
x-ms-date: Thu, 18 Jan 2024 08:01:51 GMT
Authorization: SharedKey accountname:[hidden]
Host: accountname.table.core.windows.net

HTTP/1.1 200 OK
Cache-Control: no-cache
Transfer-Encoding: chunked
Content-Type: application/json;odata=minimalmetadata;streaming=true;charset=utf-8
Server: Windows-Azure-Table/1.0 Microsoft-HTTPAPI/2.0
x-ms-request-id: a2e45885-b002-0053-75e4-49f9f6000000
x-ms-version: 2017-07-29
X-Content-Type-Options: nosniff
Date: Thu, 18 Jan 2024 08:02:26 GMT

8F
{"odata.metadata":"https://accountname.table.core.windows.net/$metadata#testtable&$select=RowKey,foo,foo2,userid,PartitionKey,Timestamp","value":[]}
0

pizerg commented 8 months ago

@blueww

Thanks for your tests. It seems that with String fields it works as expected on both ends (Azurite and production Azure). However, if you use Double or DateTime, the issue arises as described in my initial message, with different results in Azurite and production Azure.

blueww commented 8 months ago

@pizerg

I can repro this issue with Double or DateTime properties.

The reason appears to be that in JS (Azurite is based on JS), a non-null Double or DateTime value is treated as greater than ''. On the Azure server, judging from the query results, a non-null Double or DateTime value appears to be treated as less than ''.
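To illustrate the JS side of this, here is a minimal sketch of plain JavaScript relational comparison against an empty string. It is not Azurite's actual code: the helper names `looseGreaterThan` and `looseLessThan` are hypothetical, and representing a DateTime as an ISO string is an assumption made for the example.

```typescript
// Hypothetical helpers that apply JavaScript's built-in relational operators.
// This is an illustration only, NOT the code in Azurite's query nodes.
function looseGreaterThan(left: any, right: any): boolean {
  return left > right;
}

function looseLessThan(left: any, right: any): boolean {
  return left < right;
}

// Sample property values. Storing the DateTime as an ISO string is an assumption.
const samples: Array<[string, any]> = [
  ["double", 1.5],
  ["datetime string", "2024-01-18T07:42:04.5721829Z"],
  ["null", null],
];

for (const [label, value] of samples) {
  // number vs '': the empty string is converted to 0, so 1.5 > 0 is true
  // string vs '': any non-empty string sorts after the empty string
  // null vs '':   both sides convert to 0, so neither > nor < holds
  console.log(
    `${label}: gt '' = ${looseGreaterThan(value, "")}, lt '' = ${looseLessThan(value, "")}`
  );
}
```

Under these rules a positive Double and a non-empty string both compare as greater than '', while null matches neither operator. A negative Double would compare as less than '', because the empty string is converted to 0.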

I have tried foo ne ''. It appears to filter out null values and return only non-null values on both the Azure server and Azurite. I'm not sure whether it meets your requirement.
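As a rough sketch, the workaround filter could be built and attached to a query URL as follows. The helper names `nonNullFilter` and `buildQueryUrl` are hypothetical, and the table URL is the placeholder from the trace above; a real client would normally go through an SDK and add authentication headers.

```typescript
// Hypothetical helper: builds the `ne ''` workaround filter for a property.
function nonNullFilter(propertyName: string): string {
  return `${propertyName} ne ''`;
}

// Hypothetical helper: appends an OData $filter to a table URL.
function buildQueryUrl(tableUrl: string, filter: string): string {
  // encodeURIComponent escapes spaces as %20 but leaves single quotes as-is
  return `${tableUrl}?$filter=${encodeURIComponent(filter)}`;
}

const url = buildQueryUrl(
  "https://accountname.table.core.windows.net/testtable",
  nonNullFilter("foo")
);
console.log(url);
```

The single quotes stay unescaped here, whereas the trace above shows them as %27; both forms decode to the same filter text.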

If we want Azurite to behave the same as Table storage for this specific query, we need to change the comparison code in the following places:

https://github.com/Azure/Azurite/blob/d544d16f910e490fdd9db5565459df701895308f/src/table/persistence/QueryInterpreter/QueryNodes/GreaterThanNode.ts#L29
https://github.com/Azure/Azurite/blob/d544d16f910e490fdd9db5565459df701895308f/src/table/persistence/QueryInterpreter/QueryNodes/LessThanNode.ts#L29

We would need to add specific code for different data types and values to simulate the Azure server behavior, instead of just comparing the values and returning the result. That would make the code harder to maintain and carries regression risk. So if the workaround unblocks you, we might not treat this fix as high priority. Thanks for your understanding!

pizerg commented 8 months ago

@blueww

I've updated my queries to filter by ne '' and now I can get consistent results on both platforms, so for my use case this workaround is enough, thanks!

blueww commented 7 months ago

@pizerg

Thanks for your confirmation! Good to know the workaround works for you.