Open DoCode opened 6 years ago
+1 for something like this.
It would be nice to use an action, similar to what's already happening with the ForEach methods on IUnifiedData, something like this...
using (var outputStream = new MemoryStream())
{
hash = _hash.ComputeHash(inputStream, outputStream.Write);
}
I'm considering that it might make sense to do something like:
IHashValue ComputeHash(Stream inputStream, Stream outputStream, CancellationToken cancellationToken);
Task<IHashValue> ComputeHashAsync(Stream inputStream, Stream outputStream, CancellationToken cancellationToken);
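For illustration, consuming the proposed stream-in/stream-out overload might look like the following. This is a sketch only: the overload does not exist yet, and the `hashFunction` instance is a hypothetical placeholder for whatever hash function is in use (e.g. xxHash).

```csharp
using System.IO;
using System.Threading;

// Sketch against the PROPOSED (not yet existing) overload above.
// `hashFunction` is assumed to have been created elsewhere.
using (var inputStream = File.OpenRead("data.bin"))
using (var outputStream = File.Create("data.copy.bin"))
{
    // Hypothetically hashes inputStream while writing the same bytes
    // through to outputStream, without re-buffering the whole stream.
    IHashValue hash = hashFunction.ComputeHash(
        inputStream, outputStream, CancellationToken.None);
}
```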
Yep, perfect. I tried out the action method with the xxHash function that I've been using and managed to make it work, but having input/output streams would make more sense.
I had a look through your WIP work; any reason why you didn't add an output stream to the byte-array methods?
I made a fork and implemented it on those methods to do some testing. I can make a PR if you're interested. https://github.com/netclectic/Data.HashFunction/commit/c16e7794d719a55c804a1f3369299043f59c2253
I recognize it's been over a year, but I'm now taking another look at this.
The use cases: you have a stream of some unknown (and possibly large) size, for instance from the network or file system. With that stream you want to a) calculate the hash value of the data and b) do something else with the same chunks of data, all without reading more than necessary into memory or re-buffering the data. That "something else" might be additional processing of each chunk, or streaming the data on to some other endpoint.
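The "read + hash + forward" pattern above can already be sketched with the BCL's incremental `HashAlgorithm` API (SHA-256 here stands in for any hash; the 80 KB chunk size is an arbitrary choice):

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class StreamHashing
{
    // Hash a stream while forwarding the same bytes to another stream,
    // one chunk at a time; only `buffer.Length` bytes are ever held in memory.
    public static byte[] HashAndCopy(Stream input, Stream destination)
    {
        var buffer = new byte[81920];
        using (var hasher = SHA256.Create())
        {
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                hasher.TransformBlock(buffer, 0, read, null, 0);
                destination.Write(buffer, 0, read); // forward the same chunk
            }
            hasher.TransformFinalBlock(Array.Empty<byte>(), 0, 0);
            return hasher.Hash;
        }
    }
}
```

The point of the issue is that Data.HashFunction's non-cryptographic hashes have no equivalent incremental API, so this pattern can't be expressed with them today.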
A year later, I'm not sure I actually like my idea of having input/output streams. I think that from a usability standpoint it is awkward and error-prone: streams do not behave strictly like pipes or buffers; they have only a single read/write head, so having something simultaneously reading and writing to a stream doesn't make sense.
In the input/output streams case, we solve for the "Write + calculate hash value" use case, but we do not effectively solve for the "Read + calculate hash value" use case.
I think a better path would be to have underlying support for the type of TransformBlock / FinalizeBlock API which can be used by end consumers, while maintaining our current ComputeHash functionality as well.
Since I will be doing #46 as well as a v3.0, I plan on punting this change to that milestone and making this change dependent on that issue.
Provide support for stream block processing, like a default .NET HashAlgorithm:
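For reference, this is what block processing looks like with the built-in .NET `HashAlgorithm`: chunks are fed in via `TransformBlock`, and the hash is finalized with `TransformFinalBlock`.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

byte[] data = Encoding.UTF8.GetBytes("hello world");

using (var sha256 = SHA256.Create())
{
    // Process the data in two chunks to simulate streaming.
    sha256.TransformBlock(data, 0, 5, null, 0);           // "hello"
    sha256.TransformFinalBlock(data, 5, data.Length - 5); // " world"
    byte[] incremental = sha256.Hash;

    // Same result as hashing the whole buffer at once.
    using (var oneShot = SHA256.Create())
    {
        byte[] direct = oneShot.ComputeHash(data);
        Console.WriteLine(Convert.ToBase64String(incremental) ==
                          Convert.ToBase64String(direct)); // True
    }
}
```

A TransformBlock/FinalizeBlock-style API on Data.HashFunction's types would let consumers use its non-cryptographic hashes the same way.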