laravel / framework

The Laravel Framework.
https://laravel.com
MIT License
32.7k stars 11.06k forks source link

`Storage::download()` crashes due to memory fault, and is too implementation-specific #36249

Closed uberbrady closed 3 years ago

uberbrady commented 3 years ago

Description:

Some of our customers have private files that are not on any 'public' disk. Before anyone is allowed access to them, they need to be logged-in, and pass the appropriate gates. At the end of that process we do the Flysystem-recommended return Storage::download($filename);. If the file is sufficiently large, this results in an out-of-memory error when using local storage.

If I replace that with: return response()->download(Storage::path($filename));, the file can be downloaded.

However, that's going to be a problem as well - because we're going to be moving to S3 at some point, and that command won't work with non-local files. For those, we'd probably want to return a redirect to a signed S3 URL (our Private S3 configuration does not have public access, but files that are signed for a sufficiently brief duration should be fine)

What we're going to end up doing is something like:

switch(config('filesystems.default')) {
    case 'local':
    return response()->download(Storage::path($filename));

    case 's3':
    return redirect()->whatever->s3SignedUrl(Storage::path($filename));

    default:
    return Storage::download($filename);
}

Which seems to go against the idea of a design of a filesystem abstraction.

For our 'public' disk it's probably less of a problem, as we can just return a redirect to the ->url() and be done with it, regardless of which driver we're using. But for symmetry, it'd be nice to say Storage('public')::download($filename) and have it just redirect you to the ->url().

(edit I yanked a paragraph here, it muddied the issue)

Steps To Reproduce:

Make a 1GB file using: dd if=/dev/zero of=bigtest bs=1m count=1024 and try to download it. (or pick a larger file if your PHP's RAM max is higher than that). Try to download it using Storage::download(). Receive out-of-memory error in storage/logs/laravel.log and the browser download should abort.

taylorotwell commented 3 years ago

There isn't really an actionable "bug" for us to address here. You can either not allow files that are so large to be served by your web server or you could use S3 during local development as well.

Of course, a PR to stream-line anything you have in mind can be evaluated if you choose to write one.

VeryStrongFingers commented 3 years ago

@uberbrady I created a PR to address the issue you describe, but there appears to be a memory leak in Flysystem's v1.* Local Adapter (that derives from PHP internal). Specifically here https://github.com/thephpleague/flysystem/blob/1bf07fc4389ea8199d0029173e94b68731c5a332/src/Adapter/Local.php#L323 which overall appears to be an upstream PHP issue

References

so whilst the changes in my PR are still valid and part of the resolution, it's only step 1 out of 2. You'll need to wait for those fixes to merged & watch out for the future PHP version(s) containing that bugfix.

taylorotwell commented 3 years ago

OK. Since I'm not sure what actionable steps we need to take at the moment I'm just going to close this issue so it's off of my plate until we have something concrete to do.

uberbrady commented 3 years ago

@VeryStrongFingers - Regarding the PHP bugs - if you do file_get_contents() on a big file, you're going to get a big result. I don't think that's what I'm talking about. There's literally no possible solution for that. A buffer of 'x' size must be allocated to handle a file of 'x' size, if you're loading up the entire file. It's fair to say that any kind of 'magic' file detection should probably take a small-ish enough prefix of the file to determine its type (e.g. if you have to read through 3 GB of file to determine that the 3GB file is of 'y' type, then the 'magic' algorithm is wrong). That's a completely fair argument to make with PHP's "magic" algorithm, but that's not what I'm talking about here. For the 'magic' algorithm, it could be argued that some kind of streaming solution would work better to help determine file types, but, again, that's not the problem here I don't believe. Please do correct me if I'm wrong.

This is, as I would argue, an issue where the Storage facade is naively downloading the entire file, and then trying to return it to the end-user, when there are some streaming API's we can use instead.

I have a few ideas for some fixes. Certainly, the one that I like the best so far is one where, in each Flysystem adaptor, it has to handle Storage::download() in a way that makes sense for that storage system. Examples (from the top of my head) are:

If you don't like that idea, another option would be to expose a streaming endpoint for the Storage adaptor - Storage::stream($file) - that gives you a PHP file handle. I don't know whether or not AWS-SDK's S3 implementation provides enough of an implementation for us to do that, but if it did that would allow you to stream arbitrarily large files back through Laravel using the response()->download() codepath - and I suppose we could make Storage::download() just secretly hit the stream endpoint, if it exists.

I'm happy to at least provide a proof-of-concept PR in a bare Laravel project to show my thinking. It's fine if it doesn't get accepted, I just want to be sure it will at least be considered before I spend the time.