Closed DeepDiver1975 closed 8 months ago
Thanks Thomas, appreciated. However, as it is we can't merge it: the (expected) change in the writefile()
signature must be reflected in all storage interfaces, in particular in the xroot one that we still use in production. And in addition, we'd need Reva to support a stream by chunking it in multiple HTTP PUT requests, right?
But before going further, I challenged in #134 the usefulness of such patch. Unless proven wrong, we won't gain anything from it.
However, as it is we can't merge it: the (expected) change in the
writefile()
signature must be reflected in all storage interfaces,
This will for sure follow as soon as we agree on an approach ;-)
This code change will help without any other change - simply because it eliminates to copy the data from the incoming request body to RAM before writing it into the outgoing request.
flask.request.get_data
is reading all bytes from the underlying stream to the variable rv
- this is where the data is copied unnecessarily.
Just to make sure we are on the same page - writing a file is a POST and as per definition the full file will be transmitted to the server before (in contrary to PUT). Whenever I am referring to " streaming" in this context the meaning is "stream data in the code".
Hope this helps to clarify. THX
This will for sure follow as soon as we agree on an approach ;-)
OK sure - see below!
Just to make sure we are on the same page - writing a file is a POST and as per definition the full file will be transmitted to the server before (in contrary to PUT). Whenever I am referring to " streaming" in this context the meaning is "stream data in the code".
Right, we are on the same page then, the request.get_data()
buffer contains the full file by the time we start processing the POST request. Now, whether the request.stream
file-like object avoids a memory copy, I don't actually know - I assumed that the buffer (my content
parameter) would be passed as an implicit reference to the underlying call, not as a copy.
So I'd say before agreeing on an approach and defining the signature for writefile()
, could this hypothesis be tested? This is a question to @wkloucek ;-) could you compare the memory consumption on large files with and without this PR?
Superseded by #141
refs #134
@glpatcern please have a look - thx