Azure / azure-cosmos-table-dotnet

.NET SDK for Azure Cosmos Table API

Performance degradation compared to Microsoft.WindowsAzure.Storage when accessing Azure Table storage #43

Open maxal1917 opened 4 years ago

maxal1917 commented 4 years ago

We upgraded to the newer version of the SDK to access our Azure Table storage.

We observed a performance degradation in our application after that. We even created test applications with identical usage patterns to isolate it, and we still see the performance hit.

We are using .NET Framework code, reading data from an Azure table.

Old client: Microsoft.WindowsAzure.Storage - 9.3.2

New client: Microsoft.Azure.Cosmos.Table - 1.0.6

Here is one of the sample tests we tried to run:

public async Task ComparisionTest1()
{
    var partitionKey = CompanyId.ToString();

    {
        // Microsoft.Azure.Cosmos.Table
        var storageAccount = Microsoft.Azure.Cosmos.Table.CloudStorageAccount.Parse(ConnectionString);
        var tableClient = Microsoft.Azure.Cosmos.Table.CloudStorageAccountExtensions.CreateCloudTableClient(storageAccount);
        var tableRef = tableClient.GetTableReference("UserStatuses");
        var query = new Microsoft.Azure.Cosmos.Table.TableQuery<Microsoft.Azure.Cosmos.Table.TableEntity>()
                            .Where(Microsoft.Azure.Cosmos.Table.TableQuery.GenerateFilterCondition("PartitionKey", "eq", partitionKey));
        var result = new List<Microsoft.Azure.Cosmos.Table.TableEntity>(20000);

        var stopwatch = Stopwatch.StartNew();
        var tableQuerySegment = await tableRef.ExecuteQuerySegmentedAsync(query, null);
        result.AddRange(tableQuerySegment.Results);
        while (tableQuerySegment.ContinuationToken != null)
        {
            tableQuerySegment = await tableRef.ExecuteQuerySegmentedAsync(query, tableQuerySegment.ContinuationToken);
            result.AddRange(tableQuerySegment.Results);
        }

        stopwatch.Stop();
        Trace.WriteLine($"Cosmos table client. Elapsed: {stopwatch.Elapsed}");
    }

    {
        // Microsoft.WindowsAzure.Storage
        var storageAccount = Microsoft.WindowsAzure.Storage.CloudStorageAccount.Parse(ConnectionString);
        var tableClient = storageAccount.CreateCloudTableClient();
        var tableRef = tableClient.GetTableReference("UserStatuses");
        var query = new Microsoft.WindowsAzure.Storage.Table.TableQuery<Microsoft.WindowsAzure.Storage.Table.TableEntity>()
                            .Where(Microsoft.WindowsAzure.Storage.Table.TableQuery.GenerateFilterCondition("PartitionKey", "eq", partitionKey));
        var result = new List<Microsoft.WindowsAzure.Storage.Table.TableEntity>(20000);

        var stopwatch = Stopwatch.StartNew();
        var tableQuerySegment = await tableRef.ExecuteQuerySegmentedAsync(query, null);
        result.AddRange(tableQuerySegment.Results);
        while (tableQuerySegment.ContinuationToken != null)
        {
            tableQuerySegment = await tableRef.ExecuteQuerySegmentedAsync(query, tableQuerySegment.ContinuationToken);
            result.AddRange(tableQuerySegment.Results);
        }

        stopwatch.Stop();
        Trace.WriteLine($"Old table client. Elapsed: {stopwatch.Elapsed}");
    }
}

When the test is run in an Azure environment, here is the result:

image

Any thoughts or advice about it?

PaulCheng commented 4 years ago

The performance issue will be resolved in Table SDK 1.0.7, as verified with large entities. On 1.0.6, the workaround is to disable Table SDK tracing by adding a diagnostics section to app.config, if it's a .NET Framework app. It will still be slower than the Storage SDK, but much better than without the workaround, depending on the usage.

PaulCheng commented 4 years ago

In the sources section of system.diagnostics, add: <source name="Microsoft.Azure.Cosmos.Table" switchName="ClientSwitch" switchType="System.Diagnostics.SourceSwitch" />

In the switches section, add: <add name="ClientSwitch" value="Off" />
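For anyone unsure where those two fragments go, here is a minimal sketch of a complete app.config, assuming the standard system.diagnostics layout for a .NET Framework application. Only the source name, switch name, switch type, and "Off" value come from the comment above; the surrounding elements are the usual .NET Framework configuration structure, and any existing sections in your own file should be kept alongside them.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.diagnostics>
    <sources>
      <!-- Bind the Table SDK trace source to a named switch -->
      <source name="Microsoft.Azure.Cosmos.Table"
              switchName="ClientSwitch"
              switchType="System.Diagnostics.SourceSwitch" />
    </sources>
    <switches>
      <!-- "Off" disables all trace output from that source -->
      <add name="ClientSwitch" value="Off" />
    </switches>
  </system.diagnostics>
</configuration>
```

This applies to .NET Framework apps that read app.config, as noted above. Whether the same setting takes effect in other hosts is not confirmed in this thread.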

ghost commented 4 years ago

We also hit this performance issue in version 1.0.6 after updating from 1.0.1 in our Azure Function apps. We saw performance 3 to 10 times slower in release builds and up to 70 times slower in debug. The degradation was also non-linear: the more entities were read, the slower it got. The 70-times-slower case was a query returning 1700 entities. Had the slowdown not been that extreme, we probably would not have noticed it as readily. Even when we did, it took us some time to narrow the issue down.

Given the potential to significantly affect customer applications, I think more attention needs to be drawn to the need to use 1.0.7 or the system.diagnostics workaround. For example, there is no blog post about this, the release notes don't even mention 1.0.6, and there is no warning on the NuGet package.