umbraco-community / UmbracoFileSystemProviders.Azure

:cloud: An Azure Blob Storage IFileSystem provider for Umbraco
96 stars 67 forks source link

Stream blob back to client rather than read into memory. #156

Closed darrenferguson closed 4 years ago

darrenferguson commented 4 years ago

Was getting out of memory exceptions on large files - shouldn't need to read the whole blob into memory before returning.

darrenferguson commented 4 years ago

@Jeavon marc said you may have an issue with this and large files? Strange because we are running it in prod with some quite big files... what have you encountered?

bruno-casari-shout commented 4 years ago

@darrenferguson @stevetemple Has this issue been addressed? It looks like the change has been reverted on version 1.1.1?

We are getting OutOfMemory exceptions as well when large files (MP4 videos) are being retrieved from Azure Blob Storage (on version 1.0.2). I am assuming it may be due to the issue explained on this thread - all MP4 file being loaded to MemoryStream before writing response back to the client?

Jeavon commented 4 years ago

@Jeavon marc said you may have an issue with this and large files? Strange because we are running it in prod with some quite big files... what have you encountered?

Yes, we had to revert this change as OpenRead() will only read the first 4mb of a blob - https://stackoverflow.com/questions/6911728/cloudblob-openread-is-not-reading-all-data

You can see interesting things like this: image (3)

darrenferguson commented 4 years ago

Yes, we had to revert this change as OpenRead() will only read the first 4mb of a blob - https://stackoverflow.com/questions/6911728/cloudblob-openread-is-not-reading-all-data

4MB is the default block size of Microsoft.WindowsAzure.Storage.Blob.CloudBlob - there is a property you can set called - StreamMinimumReadSizeInBytes.

So I guess the issue lies in how Umbraco manipulates the stream.

I ended up working around the issue by rewriting requests to large media files to something like this: https://gist.github.com/darrenferguson/efda4d68d535c9e4a67963bb847c042e

As you can see it still returns a Stream using OpenRead() but the FileStreamResult class handles the reading of the stream in chunks and returning it to the client correctly.

I'll have a dig into Umbraco and see if I can work out what it is doing. I guess it makes sense that it worked for me in some cases, as different parts of Umbraco may manipulate the stream returned from the file system provider in different ways.

Jeavon commented 4 years ago

@darrenferguson ah I see, I think then that the VPP code would need to be updated to utilise the FileStreamResult

JimBobSquarePants commented 4 years ago

I wouldn't think you should use a FileStreamResult as part of the VPP, rather I would look at how media itself is served to the response in Umbraco. FileStreamResult in itself does nothing of interest, it's the executor that does the stream copying by pooling an interim buffer and reading/writing to and from that buffer.

This is how the copy occurs in ASP.NET Core, I'm sure it'll be very similar in classic. https://source.dot.net/#Microsoft.AspNetCore.Http.Extensions/StreamCopyOperationInternal.cs,43a14722ac80264f

So blockBlob.OpenRead() without the copy is correct.