Open zlepper opened 2 years ago
Author: | zlepper |
---|---|
Assignees: | - |
Labels: | `api-suggestion`, `area-System.IO` |
Milestone: | - |
I like this idea, ASP.NET actually has a bunch of streams implemented like this (logging, the file buffering read stream you mentioned). The scenario you mentioned where the request body is being uploaded to blob storage and hashed using 2 algorithms is tricky because the TeeReader supports a read and write stream. I'm envisioning what that code would look like with this change:
    app.MapPost("/blob", async (Stream body) =>
    {
        Stream blobStream = GetTheBlobStream();
        var hashStream = new DoubleHashStream();
        // TeeStream is the proposed type: reading from it reads from body
        // and echoes every chunk into hashStream.
        Stream teeStream = new TeeStream(body, hashStream);
        await teeStream.CopyToAsync(blobStream);
        hashStream.Complete();
        var sha1 = hashStream.Sha1Hash;
        var md5 = hashStream.MD5Hash;
        return new { sha1, md5 };
    });

    class DoubleHashStream : Stream
    {
        // etc
        private readonly HashAlgorithm[] _hashes = { MD5.Create(), SHA1.Create() };
        // ...
        public override ValueTask WriteAsync(ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken)
        {
            // TransformBlock needs a byte[]; a real implementation would avoid this copy.
            byte[] chunk = buffer.ToArray();
            foreach (var hasher in _hashes)
                hasher.TransformBlock(chunk, 0, chunk.Length, null, 0);
            return ValueTask.CompletedTask;
        }

        public void Complete()
        {
            // Finalize each hash; the results are then available via hasher.Hash.
            foreach (var hasher in _hashes)
                hasher.TransformFinalBlock(Array.Empty<byte>(), 0, 0);
        }
    }
> However "flipping" a stream around is possible with System.IO.Pipelines, even though that is somewhat painful (or at least it was last I did it)
There should be a simpler way to get a duplex stream from a Pipe.
I'm not a fan of the name TeeReader or TeeStream though.
@stephentoub and @adamsitnik is this part of your stream improvements list?
> There should be a simpler way to get a duplex stream from a Pipe. @stephentoub and @adamsitnik is this part of your stream improvements list?
Yup
Background and motivation
A common use case I encounter when dealing with uploaded files is passing them on to something like Azure Blob Storage while also hashing them for checksumming. Another related use case is hashing a file with multiple hash algorithms without reading it from disk more than once.
Right now, I have two options, as far as I know:
How we do multiple file hashing right now
```csharp
using var md5 = MD5.Create();
using var sha1 = SHA1.Create();
var hashes = new List // ...
```
API Proposal
Adapted from Go's `TeeReader` (https://pkg.go.dev/io#TeeReader) combined with C#'s `BinaryReader`.
I'm not entirely sure about the `leaveOpen` flag, but thought it might help with consistency with signatures like the constructors of `BinaryReader`.

In theory this could very easily be expanded to also support writing to multiple streams at the same time, such as the `EchoStream` from here: https://www.codeproject.com/Articles/3922/EchoStream-An-Echo-Tee-Stream-for-NET. However, I feel that would require more guarantees from the supplied streams than just `readStream.CanRead == true && writeStream.CanWrite == true`.
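As a rough illustration of the shape described above, here is a minimal synchronous sketch. Everything in it is an assumption rather than the proposed API surface: the class name `TeeStream` (borrowed from the discussion), the parameter names, the `leaveOpen` default, and the choice to feed the hash through a `CryptoStream` over `Stream.Null`. Async overrides are left out for brevity.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

byte[] payload = Encoding.UTF8.GetBytes("example payload");

using var sha256 = SHA256.Create();
using var source = new MemoryStream(payload);
using var destination = new MemoryStream();
// Writing into this CryptoStream feeds the hash; the bytes themselves are discarded.
using var hashSink = new CryptoStream(Stream.Null, sha256, CryptoStreamMode.Write);

using (var tee = new TeeStream(source, hashSink, leaveOpen: true))
{
    // One pass over the source: copied to destination and hashed on the way.
    tee.CopyTo(destination);
}
hashSink.FlushFinalBlock();

bool sameBytes = destination.ToArray().AsSpan().SequenceEqual(payload);
bool sameHash = sha256.Hash.AsSpan().SequenceEqual(SHA256.HashData(payload));
Console.WriteLine($"copied intact: {sameBytes}, hash matches: {sameHash}");

// Hypothetical sketch of the proposed type; not an existing .NET API.
sealed class TeeStream : Stream
{
    private readonly Stream _readStream;
    private readonly Stream _writeStream;
    private readonly bool _leaveOpen;

    public TeeStream(Stream readStream, Stream writeStream, bool leaveOpen = false)
    {
        if (!readStream.CanRead) throw new ArgumentException("Stream must be readable.", nameof(readStream));
        if (!writeStream.CanWrite) throw new ArgumentException("Stream must be writable.", nameof(writeStream));
        _readStream = readStream;
        _writeStream = writeStream;
        _leaveOpen = leaveOpen;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        // Read from the source, then echo exactly the bytes read to the write stream.
        int bytesRead = _readStream.Read(buffer, offset, count);
        if (bytesRead > 0)
            _writeStream.Write(buffer, offset, bytesRead);
        return bytesRead;
    }

    public override void Flush() => _writeStream.Flush();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing && !_leaveOpen)
        {
            _readStream.Dispose();
            _writeStream.Dispose();
        }
        base.Dispose(disposing);
    }
}
```

Because the tee only echoes what the caller has just read, the write stream sees exactly the bytes that were consumed, in order. That is why checking `CanRead` and `CanWrite` is enough for the single-writer case.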
API Usage
Another example, if you need to provide a readable stream to something and need to "flip" it, is using `Pipe`:
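A minimal sketch of that flip, assuming the standard `System.IO.Pipelines` package: the write side and read side of one `Pipe` are exposed as two `Stream`s, so code that can only write ends up feeding code that can only read. The payload and variable names are illustrative only.

```csharp
using System;
using System.IO;
using System.IO.Pipelines;
using System.Text;
using System.Threading.Tasks;

var pipe = new Pipe();

// Producer: code that only knows how to write to a Stream.
Task producer = Task.Run(async () =>
{
    Stream writeSide = pipe.Writer.AsStream();
    try
    {
        byte[] payload = Encoding.UTF8.GetBytes("hello from the write side");
        await writeSide.WriteAsync(payload);
    }
    finally
    {
        // Completing the writer signals end-of-stream to the read side.
        await pipe.Writer.CompleteAsync();
    }
});

// Consumer: code that needs a readable Stream.
using Stream readSide = pipe.Reader.AsStream();
using var reader = new StreamReader(readSide, Encoding.UTF8);
string text = await reader.ReadToEndAsync();
await producer;

Console.WriteLine(text);
```

The producer and consumer have to run concurrently, since the pipe applies backpressure once its buffer fills. This is the task juggling that makes the approach more awkward than a plain tee.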
Alternative Designs
Microknights has a project called `SplitStream`, which achieves somewhat the same effect: https://github.com/microknights/SplitStream

However, this requires juggling parallel tasks rather than just a couple of streams. The benefit of that solution is that it gives you multiple usable read streams, rather than requiring you to pass a write stream. "Flipping" a stream around is also possible with `System.IO.Pipelines`, even though that is somewhat painful (or at least it was the last time I did it), but that problem is outside the scope of this API suggestion.
Risks
No risk for existing projects, as this is an entirely new class.