Closed wsugarman closed 1 year ago
I'm good with these changes.
Just so that I fully understand the implications, does this new method of compressing and decompressing remove the need for
SimpleBufferManager
or are we just removing it because it might be causing more problems than it solves? I guess the answer to that depends on whatblob.CopyToAsync
is doing internally? I don't mind either way, but curious to know if this changes the memory usage profile in any way.
I mostly removed it to simplify the logic, but it does change the allocations a bit. However, I think we're pretty well off at the end of the day. For example in CompressAndUploadAsBytesAsync
, previously we would allocate an array (as part of MemoryStream
) to store the entire compressed message blob while using a smaller pooled buffer to incrementally pass data from the source Stream
to the GZipStream
. However, now we stream the compressed bytes from the source Stream
directly to the GZipStream
that is wrapping a network stream (technically a BlockBlobWriteStream
). Internally. Inside of the MemoryStream.CopyToAsync
, we're really just calling GzipStream.WriteAsync
for the entire MemoryStream
buffer.
Likewise for DownloadAndDecompressAsBytesAsync
, previously we would allocate a large array via MemoryStream
to hold the contents of the blob and incrementally copy decompressed bytes from the GZipStream
wrapping the blob's stream to the destination MemoryStream
using a small pooled buffer. Then that array would be converted into a UTF-8 string
. However, now we skip the middle-man and directly use StreamReader.ReadToEndAsync
to write the bytes into a StringBuilder
and in turn a string
. Internally, this method does something similar to the logic before by incrementally filling up its own buffer (whose size can be specified in the StreamReader
ctor) and writing the bytes to the StringBuilder
. While StringBuilder
looks to be able to quickly create the string with an efficient memmove
, it still is copying the bytes.
So, in actuality, DownloadAndDecompressAsBytesAsync
is probably less memory efficient as it no longer pools the intermediate buffer -- although it may be slightly faster. But we do have big memory efficiency gains in CompressAndUploadAsBytesAsync
by streaming the entire operation.
If we really wanted to keep some of the pooling, I would recommend using Microsoft.IO.RecyclableMemoryStream
to replace the SimpleBufferManager
.
This PR merges with the
main
branch some changes that have been added since the last time. It also removes the existingSimpleBufferManager
and replaces it with a more standard call pattern.See these commits for more details around what is actually changed.