robinrodricks / FluentStorage

A polycloud .NET cloud storage abstraction layer. Provides Blob storage (AWS S3, GCP, FTP, SFTP, Azure Blob/File/Event Hub/Data Lake) and Messaging (AWS SQS, Azure Queue/ServiceBus). Supports .NET 5+ and .NET Standard 2.0+. Pure C#.
MIT License

MD5 hash calls are failing with dotnet 8, linux environment and multiple threads #72

Open jonasof opened 2 months ago

jonasof commented 2 months ago

Hi, I'm using the in-memory version for unit tests in a Linux container with .NET 8, and I'm getting this exception while writing files:

System.Security.Cryptography.CryptographicException : Concurrent operations from multiple threads on this type are not supported.
Stack Trace:
   at System.Security.Cryptography.ConcurrencyBlock.Enter(ConcurrencyBlock& block)
   at System.Security.Cryptography.HashProviderDispenser.EvpHashProvider.AppendHashData(ReadOnlySpan`1 data)
   at System.Security.Cryptography.HashProvider.AppendHashData(Byte[] data, Int32 offset, Int32 count)
   at System.Security.Cryptography.HashAlgorithm.ComputeHash(Byte[] buffer)
   at FluentStorage.Utils.Extensions.ByteArrayExtensions.MD5(Byte[] bytes)
   at FluentStorage.Blobs.InMemoryBlobStorage.Write(String fullPath, Stream sourceStream)
   at FluentStorage.Blobs.InMemoryBlobStorage.WriteAsync(String fullPath, Stream sourceStream, Boolean append, CancellationToken cancellationToken)

To simplify the reproduction I made this code:

using FluentStorage;
using FluentStorage.Utils.Extensions;

var memoryStorage = StorageFactory.Blobs.InMemory();

var firstFile = new string('*', 500000).ToMemoryStream();
var secondFile = new string('x', 500000).ToMemoryStream();

var task1 = Task.Run(() => memoryStorage.WriteAsync("/firstFile", firstFile));
var task2 = Task.Run(() => memoryStorage.WriteAsync("/secondFile", secondFile));

await task1;
await task2;

From reading https://github.com/dotnet/runtime/issues/93205 and in particular https://github.com/dotnet/runtime/issues/93205#issuecomment-2226282536, it seems that reusing a single crypto object across threads is not safe as of .NET 8. In this case, the shared object is the static MD5 instance behind the ByteArrayExtensions.MD5 extension that FluentStorage.Blobs.InMemoryBlobStorage.Write calls:

private static readonly Crypto.MD5 _md5 = Crypto.MD5.Create();

public static byte[]? MD5(this byte[]? bytes) {
    if (bytes == null)
        return null;

    return _md5.ComputeHash(bytes);
}

Some alternatives I see to solve the issue are:

1 - Instantiate the crypto object on every call, like Crypto.SHA256.Create().ComputeHash(bytes) instead of _md5.ComputeHash(bytes). I've tried it and it solved the issue; however, I'm not sure about the performance cost.
2 - Use Crypto.MD5.HashData(bytes). It's only available on newer .NET versions (.NET 5 and later, as far as I can tell), and I haven't tested it.
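A minimal sketch of how the two options could be combined, assuming the library keeps multi-targeting .NET Standard 2.0 and newer runtimes. `ThreadSafeHashing` and `Md5` are hypothetical names used only for illustration, not FluentStorage's actual code, and the `NET5_0_OR_GREATER` guard rests on the assumption that `MD5.HashData(byte[])` is available from .NET 5 onward:

```csharp
#nullable enable
using System.Security.Cryptography;

// Hypothetical helper for illustration; not part of FluentStorage.
public static class ThreadSafeHashing
{
    public static byte[]? Md5(byte[]? bytes)
    {
        if (bytes == null)
            return null;

#if NET5_0_OR_GREATER
        // Option 2: the static one-shot API, no instance shared between threads.
        return MD5.HashData(bytes);
#else
        // Option 1: create a fresh instance per call and dispose it afterwards,
        // so no hash state is ever shared between threads.
        using (var md5 = MD5.Create())
        {
            return md5.ComputeHash(bytes);
        }
#endif
    }
}
```

Either branch avoids the shared static field. The per-call branch pays an allocation on each hash, which is probably negligible for an in-memory test store but worth measuring if it ends up on a hot path.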

Thank you for the support.

ronlv4 commented 2 months ago

Hi @jonasof, I had the same problem with SHA256; option 2 that you mentioned solved it for me.

ansongoldade commented 2 months ago

I was able to resolve this issue by using the static HashData method rather than calling ComputeHash on an instance. The static methods are guaranteed to be thread-safe.

return SHA256.HashData(bytes);
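For anyone who wants to check this locally, here is a rough sketch that calls the static method from many threads at once and compares each result with a single-threaded reference hash. `ParallelHashDemo` and `AllHashesMatch` are made-up names for this example, and it assumes .NET 5 or later:

```csharp
using System;
using System.Security.Cryptography;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class; the names are illustrative only.
public static class ParallelHashDemo
{
    // Hashes the same payload concurrently and reports whether every
    // result equals the hash computed once up front on a single thread.
    public static bool AllHashesMatch(byte[] payload, int iterations)
    {
        byte[] expected = SHA256.HashData(payload);
        int mismatches = 0;

        Parallel.For(0, iterations, _ =>
        {
            // Each call is independent; no hash object is shared.
            byte[] actual = SHA256.HashData(payload);
            if (!actual.AsSpan().SequenceEqual(expected))
                Interlocked.Increment(ref mismatches);
        });

        return mismatches == 0;
    }
}
```

Calling `ParallelHashDemo.AllHashesMatch(new byte[500000], 64)` exercises roughly the same payload size as the reproduction above. Swapping the inner call for a shared `SHA256` instance would be one way to see the original exception again.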