Azure / azure-cosmos-table-dotnet

.NET SDK for Azure Cosmos Table API
14 stars 6 forks source link

Memory leaks using client per request strategy #10

Closed rislanov closed 4 years ago

rislanov commented 5 years ago

Hello!

We've tried to create a new instance of CloudTableClient per each request to storage (AddTransient in IoC container, for example) and it caused a memory leak.

So, after only a 500 requests to storage, we have a disproportional memory consuption:

CloudStorageAccount storageAccount = CloudStorageAccount.Parse(connectionString);
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();

CloudTable table = tableClient.GetTableReference(TableName);
await table.CreateIfNotExistsAsync();

var entity = new TableEntity(
    partitionKey: "default",
    rowKey: Guid.NewGuid().ToString()
);

var insertResult = await table.ExecuteAsync(
    TableOperation.Insert(entity)
);

Parallel.For(0, 500, (i) =>
{
    var _table = storageAccount.CreateCloudTableClient().GetTableReference(TableName);
    //var _table = tableClient.GetTableReference(TableName);

    var result = _table.ExecuteAsync(
        TableOperation.Retrieve(entity.PartitionKey, entity.RowKey)
    ).GetAwaiter().GetResult();
});

client_per_request_leaks

But if we will use a single instance of CloudTableClient, everything will be ok:

single_client_ok

The reason of this problem is that CloudTableClient doesn't call Dispose() for internal instance of Microsoft.Azure.Documents.Client.DocumentClient and there is no way to do it outside:

CloudTableClient_source

So it looks like the only one way to work with CloudTableClient is to use it as a singleton. But, in this case, it would be great to know exactly, that CloudTableClient is thread-safe :) Could you please confirm this?

donghexu commented 5 years ago

Hi @rislanov, Thanks for providing us such detailed info! It is always recommended to reuse CloudTableClient. CloudTableClient is thread-safe. Here is some tips: https://azure.microsoft.com/en-us/blog/performance-tips-for-azure-documentdb-part-1-2/. It is for DocumentClient and some of it are applied to CloudTableClient as well.

rislanov commented 5 years ago

Anyway, if we will use old SDK and CloudTableClient for Azure Table Storage, there will be no leaks with "client per request strategy":

old_client_ok

So it seems to me, it looks like a bug :)

donghexu commented 5 years ago

Hi @rislanov. Thanks again. So let's split the problem into 2 parts:

  1. In general, regarding CloudTableClient use pattern, we always recommend reusing CloudTableClient per domain, especially for Cosmos DB path. It can gain efficient connection management and performance. We don't recommend creating CloudTableClient for each new request, as it can create much overhead.
  2. As to the memory leak only, we will look into it in details. The info and repo steps you provided will be very helpful, thanks for that. :)
chrishawl commented 4 years ago

With regards to the guidance on reusing cloud table client across requests, how would you advise managing that in a scenario where you have to connect to multiple different storage accounts and the connection configuration changes per request? This a scenario we face using table storage.

sakash279 commented 4 years ago

@VIPHercules If you have multiple accounts, then you would need that many CloudTableClient instances. Each cloudtableinstance accepts a connection string/storage URI. However, within an account the need to create a cloud table client per request does not seem feasible, and will lead to memory leaks and other overhead issues described here.

sakash279 commented 4 years ago

closing as all questions have been answered on the issue. Please re-open if needed.

dzmitry-lahoda commented 3 years ago

Is still holds to create singleton per domain for the latest Cosmos Table API client?