OrchardCMS / OrchardCore

Orchard Core is an open-source modular and multi-tenant application framework built with ASP.NET Core, and a content management system (CMS) built on top of that framework.
https://orchardcore.net
BSD 3-Clause "New" or "Revised" License

Azure Blob Storage issue with 0 byte files #8715

Closed Skrypt closed 4 months ago

Skrypt commented 3 years ago

When using Azure Blob Storage with the Media feature, any network issue that prevents a file from being fully downloaded from blob storage to the ms-cache folder can cause a stack overflow.

You can repro the issue by displaying a list of content items with a pager. Spam the pager links and you will end up with Azure Blob Storage requests that got cancelled.

If a file's download never completes, a 0 byte file is left in the ms-cache folder. If you are using ImageSharp to alter this image, it will log that the image cannot be processed because it has no header: it can't determine the file type in order to "resize" the image...

2021-03-01 22:00:07.8175|Default|00-cdafad81e240e349a63a1987f1e18cfd-8c343f3d632f9841-00||SixLabors.ImageSharp.Web.Middleware.ImageSharpMiddleware|ERROR|The image 'https://localhost:5001/media/mediafields/RealEstate/468g43r6mza0kyqmwces3967c3/029298494d49151ea5e9a690757e7081.jpg?width=600&token=LOFq1GbxbVu9kGg0iS7cevy9rIE0UBqSZGlgj0o6PD8%3D' could not be processed SixLabors.ImageSharp.UnknownImageFormatException: Image cannot be loaded. Available decoders:
 - BMP : BmpDecoder
 - GIF : GifDecoder
 - PNG : PngDecoder
 - TGA : TgaDecoder
 - JPEG : JpegDecoder

   at SixLabors.ImageSharp.Image.Load[TPixel](Configuration configuration, Stream stream, IImageFormat& format)
   at SixLabors.ImageSharp.Web.FormattedImage.Load(Configuration configuration, Stream source)
   at SixLabors.ImageSharp.Web.Middleware.ImageSharpMiddleware.<>c__DisplayClass18_0.<<ProcessRequestAsync>b__1>d.MoveNext()

/cc @deanmarcussen

JimBobSquarePants commented 3 years ago

I'm getting this issue since you guys updated ImageSharp

Without relevant version information I cannot help you. Also, please do not tag me in repositories I do not own.

Skrypt commented 3 years ago

I merged latest dev branch of OC on my website project. So I'm using the latest version of ImageSharp.

image

Skrypt commented 3 years ago

Oh sorry for the tagging. I think you and Dean merged a PR that fixed the LRU cache recently so I thought I should warn you both about this....

version 1.0.2

Skrypt commented 3 years ago

I'll revert to 1.0.1 ...

Skrypt commented 3 years ago

Wow this guy did a thumbsup too 😄 Lesson learned, debug this by myself and advise after.

Skrypt commented 3 years ago

Here is the cached folder from the actual web server. (running 1.0.1)

image

Here is the local cached folder with some images which seems corrupted. (running 1.0.2)

image

These are the images that return the errors above.

JimBobSquarePants commented 3 years ago

Can you share the properties of one of the corrupted files?

Scratch that silly question.

Skrypt commented 3 years ago

I kept the files if you need one but definitely reverting to SixLabors.ImageSharp.Web 1.0.1 fixed the issue after I cleared the cache. Maybe the issue is in OC too because this is our own local cache. Need more testing.

Actually not that silly since the size of that file is 0 bytes.

JimBobSquarePants commented 3 years ago

> Maybe the issue is in OC too because this is our own local cache.

Your own cache implementation? This could be a critical factor.

Skrypt commented 3 years ago

I'm moving back to 1.0.2 to find that file in the ImageSharp cache. It's not my own cache, it's the OC media cache. The only thing that changed recently is the SixLabors.ImageSharp.Web assembly, which is why my first guess went there.

Skrypt commented 3 years ago

Ok, found the issue. It's not ImageSharp related. I literally made my PC crash by adding the Azure Blob Storage DNS name to my hosts file, pointing it to 127.0.0.1.

So it's a network and/or local storage issue that corrupted those files in the ms-cache folder. ImageSharp then logged that it could not process these files.

We need to have something in OC to prevent this from happening.

JimBobSquarePants commented 3 years ago

StackOverflow is a surprise. I normally just get a timeout if I'm testing my Azure provider locally having forgotten to turn on the local emulator.

Skrypt commented 3 years ago

I'm pretty sure we have something going on in OC because I remember talking about this with @deanmarcussen a while ago. He probably already knows.

I'd be surprised if the issue were my SSD. It's just hard to repro a temporary network failure.

Thanks for your help by the way, and sorry for the tagging 😉

deanmarcussen commented 3 years ago

I'm not sure I totally understand what you did here @Skrypt

I think I have previously seen behaviour from the blob storage client where you can induce a stack overflow by pointing it at a localhost address.

i.e. a connection string of UseDevelopmentStorage=true will crash an Azure web app (if memory serves)

If this is the equivalent of what you've done, then the thing to do would be to repro the issue without Orchard Core, and report it on the Azure Blob Storage repo.

I'm not sure what we could do regarding it, as it seems a network related / blob storage client issue

Skrypt commented 3 years ago

Well, I had the stack overflow issue all day yesterday. The thing is that these 0 byte files were being sent to ImageSharp for resizing, and my PC was literally freezing while I was on Skype with @jtkech. So it's not just Azure Blob Storage. I kept the files to be able to repro, but I should never have had a 0 byte file saved on my drive in the first place.

To repro the network issue I decided to point the DNS to a local IP address so that it fails to reach the storage. When I start the app in VS Code, it properly fails with an exception page. I will try to reproduce the PC crash today. Basically, what I did was change the DNS while the app was already running (debugging with VS Code).

I'm also using the DataProtection keys from Azure, so it's quite complicated to repro the initial issue. How could I repro getting a 0 byte file in the ms-cache folder?

Skrypt commented 3 years ago

I kept the 0 byte files to test whether we could add a validation step before trying to process these files with ImageSharp, though ImageSharp seems to fail properly and log them. The other thing is that if my PC crashed or froze, it happened either while it was still trying to read these files or because of a network or SSD issue. But there is definitely an infinite loop happening.
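For illustration only, the validation idea mentioned above could look roughly like the following. This is a language-neutral sketch in Python, not Orchard Core or ImageSharp code, and the helper name `is_usable_cache_entry` is hypothetical. It assumes that an empty cached file should simply be treated as a cache miss.

```python
import os


def is_usable_cache_entry(path: str) -> bool:
    """Return True only if the cached file exists and holds at least one byte.

    Hypothetical helper: a 0 byte file is treated the same as a missing file,
    so the caller would fetch the media from the remote store again instead of
    handing an empty stream to the image processor.
    """
    try:
        return os.path.getsize(path) > 0
    except OSError:
        # Missing file, permission problem, etc.: treat as not cached.
        return False
```

A check like this only guards against empty files. A download that was cut off after some bytes were written would still pass it.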

Skrypt commented 3 years ago

As a workaround, I'm now using a background task that periodically empties these cache folders so they get refreshed.
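A narrower variant of that cleanup idea, shown only as a sketch: instead of emptying the whole cache, remove just the 0 byte files. This is Python for illustration, not the actual background task, and `purge_empty_files` is a hypothetical name.

```python
import os


def purge_empty_files(cache_root: str) -> list[str]:
    """Delete every 0 byte file under cache_root and return the removed paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) == 0:
                    os.remove(path)
                    removed.append(path)
            except OSError:
                # File vanished or is locked by another process: skip it.
                continue
    return removed
```

One caveat with this approach is that a file whose download has just started may also be empty for a moment, so a real task would probably need an age check or some coordination with the writer.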

Skrypt commented 3 years ago

I've been able to repro the 0 byte file issue.

2021-03-03 16:54:44.4920|Default|00-45b58ec25b8ebb48a5059c0022eb0317-30d4f61589ab2b42-00||OrchardCore.Media.Core.DefaultMediaFileStoreCacheFileProvider|ERROR|Error saving file E:\Repositories\affaires-extra-sc\src\OrchardCore.Cms.Web\wwwroot\ms-cache\Default\machinerie-agricole/accessoire/site-web/4wfzccrve8c9r3q8cn94pw6ddt.png System.Threading.Tasks.TaskCanceledException: The operation was canceled.
 ---> System.IO.IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request..
 ---> System.Net.Sockets.SocketException (995): The I/O operation has been aborted because of either a thread exit or an application request.
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token)
   at System.Net.Security.SslStream.FillBufferAsync[TIOAdapter](TIOAdapter adapter, Int32 numBytesRequired)
   at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer)
   at System.Net.Http.HttpConnection.ReadAsync(Memory`1 destination)
   at System.Net.Http.HttpConnection.ContentLengthReadStream.ReadAsync(Memory`1 buffer, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.ContentLengthReadStream.ReadAsync(Memory`1 buffer, CancellationToken cancellationToken)
   at Azure.Core.Pipeline.ReadTimeoutStream.ReadAsync(Byte[] buffer, Int32 offset, Int32 count, CancellationToken cancellationToken)
   at Azure.Core.Pipeline.RetriableStream.RetriableStreamImpl.ReadAsync(Byte[] buffer, Int32 offset, Int32 count, CancellationToken cancellationToken)
   at Azure.Core.Pipeline.RetriableStream.RetriableStreamImpl.RetryAsync(Exception exception, Boolean async, CancellationToken cancellationToken)
   at Azure.Core.Pipeline.RetriableStream.RetriableStreamImpl.ReadAsync(Byte[] buffer, Int32 offset, Int32 count, CancellationToken cancellationToken)
   at System.IO.Stream.CopyToAsyncInternal(Stream destination, Int32 bufferSize, CancellationToken cancellationToken)
   at OrchardCore.Media.Core.DefaultMediaFileStoreCacheFileProvider.SetCacheAsync(Stream stream, IFileStoreEntry fileStoreEntry, CancellationToken cancellationToken) in E:\Repositories\affaires-extra-sc\src\OrchardCore\OrchardCore.Media.Core\DefaultMediaFileStoreCacheFileProvider.cs:line 69

While testing this background task I spammed the pager of my website to see what would happen if these files were queued up to be resized by ImageSharp. No issue there. The problem is that the request gets cancelled and therefore leaves a 0 byte file on the drive.

https://github.com/OrchardCMS/OrchardCore/blob/44d034e0a4c16464748d8461ca9bcc9d307b6ede/src/OrchardCore/OrchardCore.Media.Core/DefaultMediaFileStoreCacheFileProvider.cs#L59-L89
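One common way to avoid leaving a partial or empty file behind when a copy is interrupted is to write to a temporary file and move it into place only after the copy has finished. The sketch below shows that pattern in Python purely as an illustration. It is not the code at the link above and not a proposed patch, and `save_to_cache` is a hypothetical name.

```python
import os
import shutil
import tempfile


def save_to_cache(source, final_path: str, chunk_size: int = 64 * 1024) -> None:
    """Copy a readable binary stream to final_path without exposing partial files.

    The data goes to a temporary file in the same directory first. Only a
    complete copy is renamed to final_path; on any error or cancellation the
    temporary file is deleted and the exception is re-raised.
    """
    directory = os.path.dirname(final_path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as temp_file:
            shutil.copyfileobj(source, temp_file, chunk_size)
        # Rename within the same directory, so readers see either no file
        # or the complete file.
        os.replace(temp_path, final_path)
    except BaseException:
        try:
            os.remove(temp_path)
        except OSError:
            pass
        raise
```

With this shape, a cancelled request would leave nothing at the final cache path, so a later request would simply try the download again.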

It's an annoying issue that fills up my logs. So, having fun debugging this today 😉

One thing I noticed that would have made my life easier is being able to purge the is-cache folder from ImageSharp without running into locked files. I made it work without logging any errors today, though it will still log issues when I try to remove a folder that just had a file added to it while waiting for a lock to release. So purging the cache through the ImageSharp API would have been easier. Maybe not for Jim, but for me :wink:

Skrypt commented 3 years ago

Thinking again about the stack overflow I got yesterday: it happened again today. It occurred while I was hitting the issue from the previous post, which leads me to think that the cancelled Azure Blob Storage requests queue up and then trigger the stack overflow. I'm not sure yet, though. I think the stack overflow is collateral damage, maybe from an issue with the Azure Blob Storage client, as Dean said yesterday (leaving notes to myself).

Piedone commented 5 months ago

Since you say this may be an issue with the client library, which we have kept up to date since then, perhaps this is fixed now? Do you still experience this?

github-actions[bot] commented 4 months ago

It seems that this issue didn't really move for quite a while despite us asking the author for further feedback. Is this something you'd like to revisit any time soon or should we close? Please reply.

github-actions[bot] commented 4 months ago

Closing this issue because it didn't receive further feedback from the author for very long. If you think this is still relevant, feel free to reopen it with the requested details.