Azure / azure-storage-net

Microsoft Azure Storage Libraries for .NET
Apache License 2.0
446 stars 370 forks source link

BlobWriteStream.Dispose() causes thread starvation under load #754

Open DaRosenberg opened 6 years ago

DaRosenberg commented 6 years ago

Which service(blob, file, queue, table) does this issue concern?

Blob

Which version of the SDK was used?

9.3.0

Which platform are you using? (ex: .NET Core 2.1)

.NET Core 2.1

What problem was encountered?

I've been troubleshooting a scalability issue with our ASP.NET Core Web API, which makes heavy use of blob storage for underlying storage.

We observed:

Long story short, through profiling I have finally pinpointed the issue to be here:

https://github.com/Azure/azure-storage-net/blob/38425e715e1bcdb4cab344bcb9b448c08bf8af5c/Lib/WindowsRuntime/Blob/BlobWriteStream.cs#L200-L220

The problem is the CommitAsync().Wait(). Since the dispose pattern in .NET is not async, this stream implementation of course has to block on the commit operation, and the commit is necessary to ensure that the written blocks are committed to blob storage. This blocks the thread for the duration of that commit operation, which under load leads to thread starvation.

Have you found a mitigation/solution?

One solution would be for our service to do await stream.CommitAsync(); before disposing the stream, but unfortunately BlobWriteStream is marked as internal so this type is not visible to our code. The only solution I can think of is to make BlobWriteStream public, allowing client code to asynchronously commit the stream before disposing it.

sfmskywalker commented 6 years ago

The proposed IAsyncDisposable feature would be a perfect fix for this. https://github.com/dotnet/csharplang/issues/43

zezha-msft commented 6 years ago

Hi @DaRosenberg, thanks for reaching out!

I've logged this issue for further investigation.

DaRosenberg commented 6 years ago

@zezha-msft I have now completed testing to confirm that this did indeed account for a significant bottleneck, and that changing it to commit asynchronously nearly doubled the scalability of our service.

After that, other bottlenecks came into play that we proceeded to address. Through a combination of a few changes we made in our fork, we were able to achieve perfect scalability. I will be creating separate issues shortly for the other changes we had to make.

davidfowl commented 6 years ago

Usually, the way you work around async dispose is to call FlushAsync (assuming FlushAsync also causes the commit in this case).

DaRosenberg commented 6 years ago

@davidfowl Unfortunately FlushAsync() does not commit the blocks, so that’s not an option in this case. Nor should it I think. They are semantically very different operations - flushing can be performed repeatedly while writing, whereas committing is a very final operation after which no more writes can occur.

davidfowl commented 6 years ago

I see CommitAsYnc is exposed directly. Is it an option to call it? Or is some other component writing that you have no control over?

DaRosenberg commented 6 years ago

@davidfowl The class is internal. Making it public is what I suggested in my OP - this would allow us to CommitAsync() before disposing. This is how we fixed the scalability issue in our fork, but obviously we would rather not be on a forked code base. :)

davidfowl commented 6 years ago

Ah hah! Yes that's bad 😄

asorrin-msft commented 6 years ago

I think we have a solution for this. CloudBlockBlob.OpenWriteAsync() should return a CloudBlobStream, which has a public CommitAsync() method - does that not work? If it does, the issue becomes one of discoverability :) It would probably be good to have OpenWriteAsync() explicitly call out this issue in the comments. I'm not sure how much that will help, though - people are probably fairly used to writing

using (Stream stream = await blob.OpenWriteAsync()) { //... }

it's entirely plausible to me that the comment would be missed. Any suggestions?

DaRosenberg commented 6 years ago

@asorrin-msft You are right - that should work without any code change actually, because BlobWriteStreaminherits from CloudBlobStream. So from client code we should be able to do something like:

using (var stream = await blob.OpenWriteAsync())
{
  // Do some writing to the stream here...

  if (stream is CloudBlobStream blobStream)
    await blobStream.CommitAsync();
}

The issue of discoverability is something I've also been thinking about. An XML doc comment on ´OpenWriteAsync()` will not go very far I'm afraid.

Might one consider taking this so far as to actually throw from Dispose() if the stream has not been committed, thus always forcing the pattern of committing (asynchronously) before disposing? At least that way, it will be a fail-fast and obvious thing very early in the dev lifecycle, instead of a very elusive and hard-to-diagnose scalability problem 18 months down the road when the application is already in production...

klesta490 commented 5 years ago

Any plans to solve this with DisposeAsync?

justinSelf commented 5 years ago

@DaRosenberg

A word of caution with the using blocks around the CloudBlobStream. If there's an exception before you commit, you will end up committing partial writes or an empty stream. This would then overwrite your exist blob with either incomplete or 0 bytes of data.

Ask me how I know :(

DaRosenberg commented 5 years ago

@justinSelf I'm a bit perplexed as to how that could happen. Are you saying commits happen even before we even call CloudBlobStream.CommitAsync()?

justinSelf commented 5 years ago

@DaRosenberg Here's a code sample to show what I'm talking about. If you open a CloudBlobStream inside of a using, and do anything that could cause an exception, you risk losing data if the blob already has data in it.

 class Program
    {
        static async Task Main(string[] args)
        {
            CloudStorageAccount storageAccount = CloudStorageAccount.Parse("connection string");

            var blobClient = storageAccount.CreateCloudBlobClient();
            var blobContainer = blobClient.GetContainerReference("test-container");
            await blobContainer.CreateIfNotExistsAsync();

            var blob = blobContainer.GetBlockBlobReference("test-blob");

            await blob.UploadTextAsync("test");

            //inspect blob - should say "test"
            Console.ReadKey();

            using (var stream = await blob.OpenWriteAsync())
            {
                throw new Exception();
            }

            //inspect blob - will be empty
        }
    }

I'm only resurrecting this thread because of searching for this issue.

justinSelf commented 5 years ago

@DaRosenberg Also, to directly answer your question, Dispose, as you know, will also call Commit. So, if there's an exception inside the using block, when the stream gets disposed, it will also commit.

DaRosenberg commented 5 years ago

@justinSelf Duh, of course, stupid of me. The CommitAsync() happening as part of Dispose() was the whole reason why I opened this issue in the first place. 🤣