dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.19k stars 9.93k forks source link

Enable a mode in which FormFiles are backed by files with the uploaded file name #20602

Closed drauch closed 3 years ago

drauch commented 4 years ago

We receive an IFormFile in some of our actions and we need the file on disk to have the original name. At the moment we need to copy the whole file like so:

var path = Path.Combine(tempDirectory, Guid.NewGuid().ToString());
var filePath = Path.Combine(path, Path.GetFileName(formFile.FileName));
Directory.CreateDirectory(path);

using(var stream = formFile.OpenReadStream())
using(var file = File.Create(path))
    await stream.CopyToAsync(file);

We have to do this because there is no way to get the underlying path from IFormFile and move the form file to have the correct file name OR to instruct FormFile to immediately use the original file name.

We haven't found a way to do this without copying the whole file again (which is a performance problem).

We're using ASP.NET Core 2.1.x

Best regards, D.R.

davidfowl commented 4 years ago

I'm a bit confused. When you say you need the original path, do you mean the file name? Or the full path on the source machine?

var path = Path.Combine(tempDirectory, Guid.NewGuid().ToString()); var filePath = Path.Combine(path, Path.GetFileName(formFile.FileName)); Directory.CreateDirectory(path);

I assume you mean the path because formFile is an IFormFile and the FileName is exposed there. Why do you need the original path? Can you give an example of the upload form and what you expect the user to put into that form and what you expect to get on the server side as a result (a specific example would be very useful). Also this file name comes from the client so it can be anything...

We have to do this because there is no way to get the underlying path from IFormFile and move the form file to have the correct file name OR to instruct FormFile to immediately use the original file name.

Where is the file ultimately going?

drauch commented 4 years ago

No, I don't need the path on the source machine. I need the temporary file, the one ASP.NET Core creates on the disk, to have the same name as the file originally had on the client. Yes, the client can send us bogus data, but then it's his own fault that the request will ultimately fail.

The file is going into Windows Authenticode signing which embeds the "original file name" for some file types into the signature. This original file name must match the real file name, otherwise the signature will be deemed untrustworthy. Unfortunately, you cannot pass this original file name to the Authenticode APIs, they assume the file name the file currently has. Which is why we need to have the correct file name on disk on the server.

Now we can either rename the existing ASP.NET Core temporary file (which we can't, because we don't get the Path from IFormFile) OR we need to instruct ASP.NET Core to use the file name the client provided.

(Of course those file names could always be a.exe, and in order to not overwrite each other we need to put them into different directories - which is the reason why we use Guid.NewGuid() as a directory).

davidfowl commented 4 years ago

I see. So you need the file to be on disk and you need it to have the "right" file name and if ASP.NET Core creates a temporary file, you'd like to avoid copying it to another file with the right file name because that's wasteful (even worse because you need to copy it back in user mode instead of file a File.Copy). Did I get that right?

Assuming I did, here are a few pieces of info:

The above thresholds can be controlled by configuring FormOptions (https://docs.microsoft.com/en-us/dotnet/api/microsoft.aspnetcore.http.features.formoptions?view=aspnetcore-3.1). By default the entire body is not buffered (buffer body is false), but each file section is buffered by default and you can control when it goes to disk by setting FormOptions.MemoryBufferThreshold.

That said, there's no way to get the underlying file name of the temp file that is buffered to disk. I'd recommend using the lower level APIs in order to read the multipartbody directly (linked in the article below) and use that to save to disk with the right name.

PS: Also see https://docs.microsoft.com/en-us/aspnet/core/mvc/models/file-uploads?view=aspnetcore-2.1 for security considerations and more docs about file upload in general.

Tratcher commented 4 years ago

If your goal is to skip a file copy and get the name you want then @davidfowl is right, you should switch to the streaming upload pattern: https://docs.microsoft.com/en-us/aspnet/core/mvc/models/file-uploads?view=aspnetcore-2.1#upload-large-files-with-streaming-1

Tratcher commented 4 years ago

(which is a performance problem)

Have you actually measured the impact of this extra copy? How large are the files and how long does that extra copy take now?

drauch commented 4 years ago

I see. So you need the file to be on disk and you need it to have the "right" file name and if ASP.NET Core creates a temporary file, you'd like to avoid copying it to another file with the right file name because that's wasteful (even worse because you need to copy it back in user mode instead of file a File.Copy). Did I get that right?

Yes, exactly right.

I'd recommend using the lower level APIs in order to read the multipartbody directly you should switch to the streaming upload pattern

you should switch to the streaming upload pattern

This would be unfortunate for us. We have lots of form data too, model binding should still work. Can we use the "normal" way with model binding for all the other parameters and go for the streaming solution only for the form file? Maybe we can adjust model binding somehow with a decorator? Or a configuration hook of some sort?

Also, just to make sure, the docs say smth about JavaScript, but we are talking about a non-browser client. But this seems to work for arbitrary clients - you also don't think that a browser with JS support is somehow necessary for our scenario?

Have you actually measured the impact of this extra copy? How large are the files and how long does that extra copy take now?

The impact is a few seconds for each request. The files are about 100-500 MB. But we're planning to increase the upper limit to a GB.

Best regards, D.R.

Tratcher commented 4 years ago

Have you actually measured the impact of this extra copy? How large are the files and how long does that extra copy take now?

The impact is a few seconds for each request. The files are about 100-500 MB. But we're planning to increase the upper limit to a GB.

That's pretty good considering. What's the total request time for these?

This would be unfortunate for us. We have lots of form data too, model binding should still work. Can we use the "normal" way with model binding for all the other parameters and go for the streaming solution only for the form file? Maybe we can adjust model binding somehow with a decorator? Or a configuration hook of some sort?

Not with the documented pattern. The problem is that form fields and files can be intermixed in the request, you have to read the whole body to know that you've found all of the form fields.

To replace only how the buffering happens you'd likely need a middleware or filter that replaced the IFormFeature. Something like this: https://github.com/dotnet/aspnetcore/blob/c565386a3ed135560bc2e9017aa54a950b4e35dd/src/Mvc/Mvc.Core/src/Filters/RequestFormLimitsFilter.cs#L39-L51

davidfowl commented 4 years ago

Also, just to make sure, the docs say smth about JavaScript, but we are talking about a non-browser client. But this seems to work for arbitrary clients - you also don't think that a browser with JS support is somehow necessary for our scenario?

No javascript doesn't matter here.

To replace only how the buffering happens you'd likely need a middleware or filter that replaced the IFormFeature. Something like this:

But you would also need to re-implement the file reading since you can't really turn off the implicit buffering that happens without turning off the memory threshold (which would result in more virtual memory usage 😄 ). Memory efficient file uploads really should be using the streaming pattern regardless.

It might make sense to write a custom model binder for IFormFile that uses the low level APIs to read files from the client and saves them to disk using a name that makes sense for your scenario.

Tratcher commented 4 years ago

But you would also need to re-implement the file reading since you can't really turn off the implicit buffering that happens without turning off the memory threshold

You can copy and tweak FormFeature. Mainly this section: https://github.com/dotnet/aspnetcore/blob/7946b91a65ce0ff717bdcf528f6342edc5bfd92d/src/Http/Http/src/Features/FormFeature.cs#L175-L193

It might make sense to write a custom model binder for IFormFile that uses the low level APIs to read files from the client and saves them to disk using a name that makes sense for your scenario.

Not needed.

analogrelay commented 4 years ago

Triage notes: Moving to discussions for now. We don't plan to change the existing feature to support this at this time given the relatively low impact of the extra copy and the high cost of exposing these internals. Happy to keep discussing how an alternate FormFeature could be used here, but it's not something we plan on changing in the product at this time.

drauch commented 4 years ago

@davidfowl & @Tratcher Thank you for your contributions. It looks like this is no "low-hanging fruit" and needs a bit of effort on our side, which is why we've decided to not follow through with implementing this in the next few weeks.

When it is scheduled on our side, I'll come back and update this thread with more information about whether your advice has helped me to achieve my goal and what kind of adjustments have been necessary in addition.

@anurse: OK, I was hoping this is a framework-thingy, but I can also see why you would go for more important issues before implementing this.

Happy easter! D.R.

ghost commented 3 years ago

Thank you for contacting us. Due to a lack of activity on this discussion issue we're closing it in an effort to keep our backlog clean. If you believe there is a concern related to the ASP.NET Core framework, which hasn't been addressed yet, please file a new issue.

This issue will be locked after 30 more days of inactivity. If you still wish to discuss this subject after then, please create a new issue!