defnull / multipart

Multipart parser for Python 3
Other
120 stars 33 forks source link

Use NamedTemporaryFile for large file uploads #22

Closed cjwatson closed 3 years ago

cjwatson commented 3 years ago

I'm considering porting zope.publisher from cgi.FieldStorage to multipart (see https://github.com/zopefoundation/zope.publisher/issues/39). This seems reasonably practical, except that zope.publisher currently relies on subclassing cgi.FieldStorage with a make_file method that returns a NamedTemporaryFile rather than a TemporaryFile. There doesn't seem a particular reason why multipart couldn't just use a NamedTemporaryFile instead.

zope.publisher still supports Python 2, so for Python 2 compatibility it'll be necessary to monkey-patch this (unless you're willing to maintain a 0.1.x branch), but in the long term it would be good to be able to drop the monkey-patch.

defnull commented 3 years ago

I'm confused why zope.publisher needs a NamedTemporaryFile. Copying the file based on its name while it is still open is not cross-platform (Windows won't allow that). Closing it deletes it. NamedTemporaryFile only makes sense if the delete parameter is false. Or am I missing something?

cjwatson commented 3 years ago

It's quite possible that nobody has ever cared about making it work on Windows; not sure.

The commit that introduced this was https://github.com/zopefoundation/zope.publisher/commit/fc906385da2faee49f9d7a1fe41303acba648cec; I was mainly just trying to keep tests passing. Perhaps @ctheune can elaborate?

ctheune commented 3 years ago

plop

Uah. What? Ok, my first reaction was: this has to be 15 years old code. And it almost is. It's 14 years old!

So. Looking at that code doesn't tell me much. I think you might be right that we didn't care (enough) about windows. We did have tests and had a buildbot running on windows but apparently the test doesn't try copying. I also don't remember enough about windows WRT copying opened files, my memory only tells me that windows doesn't like deleting open files. I would have expected that copying/reading would be fine ...

defnull commented 3 years ago

Python docs say: "Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later). " I assumed that for copying, the file needs to be opened a second time.

Anyway, the point is: I think NamedTemporaryFile has no real benefit as long as the delete parameter is true (default). The downside of NamedTemporaryFile is that files are not properly removed if the python process crashes or is killed, because the close handler is not called, leaving garbage around. TemporaryFile prevents that (at least in POSIX environments) as it immediately unlinks the file. That is why I'd like to avoid hard-coding NamedTemporaryFile if possible.

We could add a hook to change temp file creation, though. That would also help people that want to use a non-standard temp file location. How about that?

cjwatson commented 3 years ago

Yep, a hook sounds entirely reasonable. Thanks.

cjwatson commented 3 years ago

I'm going to close and withdraw this request, since for my zope.publisher changes we decided just to drop the use of NamedTemporaryFile based on essentially the same arguments used in this thread.