s3tools / s3cmd

Official s3cmd repo -- Command line tool for managing S3 compatible storage services (including Amazon S3 and CloudFront).
https://s3tools.org/s3cmd
GNU General Public License v2.0

put zero-length file from stdin fails #465

Open · mdomsch opened this issue 9 years ago

mdomsch commented 9 years ago

$ cat /dev/null | ./s3cmd --debug put - s3://s3cmd-test.domsch.com/devnull

[trimmed]

DEBUG: Canonical Request:
POST
/devnull
uploadId=IBQwin.yGECrwLOGh8fkzHbtN.y0bdn.CdhOUhpjWtyo3UDA7F5BjsJYQuwW7tSj3HHr6xJX5WqAoCKbC3.Zog--
content-length:51
host:s3cmd-test.domsch.com.s3.amazonaws.com
x-amz-content-sha256:6e842b84ed8c34f3a66be9ed68a3e53471da531989f7efd4dbaf7bc54b0e1f31
x-amz-date:20150121T223832Z

content-length;host;x-amz-content-sha256;x-amz-date

6e842b84ed8c34f3a66be9ed68a3e53471da531989f7efd4dbaf7bc54b0e1f31

DEBUG: signature-v4 headers: {'x-amz-content-sha256': '6e842b84ed8c34f3a66be9ed68a3e53471da531989f7efd4dbaf7bc54b0e1f31', 'content-length': 51, 'Authorization': 'AWS4-HMAC-SHA256 Credential=REMOVED/20150121/us-east-1/s3/aws4_request,SignedHeaders=content-length;host;x-amz-content-sha256;x-amz-date,Signature=529ab2de89e0f22f1960872104570026c8c9008fa8a68cd6446b10f8a1be2d8d', 'x-amz-date': '20150121T223832Z'}
DEBUG: Processing request, please wait...
DEBUG: get_hostname(s3cmd-test.domsch.com): s3cmd-test.domsch.com.s3.amazonaws.com
DEBUG: ConnMan.get(): re-using connection: https://s3cmd-test.domsch.com.s3.amazonaws.com#2
DEBUG: format_uri(): /devnull?uploadId=IBQwin.yGECrwLOGh8fkzHbtN.y0bdn.CdhOUhpjWtyo3UDA7F5BjsJYQuwW7tSj3HHr6xJX5WqAoCKbC3.Zog--
DEBUG: Sending request method_string='POST', uri='/devnull?uploadId=IBQwin.yGECrwLOGh8fkzHbtN.y0bdn.CdhOUhpjWtyo3UDA7F5BjsJYQuwW7tSj3HHr6xJX5WqAoCKbC3.Zog--', headers={'x-amz-content-sha256': '6e842b84ed8c34f3a66be9ed68a3e53471da531989f7efd4dbaf7bc54b0e1f31', 'content-length': 51, 'Authorization': 'AWS4-HMAC-SHA256 Credential=REMOVED/20150121/us-east-1/s3/aws4_request,SignedHeaders=content-length;host;x-amz-content-sha256;x-amz-date,Signature=529ab2de89e0f22f1960872104570026c8c9008fa8a68cd6446b10f8a1be2d8d', 'x-amz-date': '20150121T223832Z'}, body=(51 bytes)
DEBUG: Response: {'status': 400, 'headers': {'x-amz-id-2': 'NMaNuC3MLTW6QTExYz3OWZaPcKT7eOaZAmW5siDxz1twtbUTQYj0ljJ37wKBJiSl2/ED0DV/9/A=', 'server': 'AmazonS3', 'transfer-encoding': 'chunked', 'connection': 'close', 'x-amz-request-id': 'A9F00CE6D919D8F4', 'date': 'Wed, 21 Jan 2015 22:38:31 GMT', 'content-type': 'application/xml'}, 'reason': 'Bad Request', 'data': '<Error><Code>MalformedXML</Code><Message>The XML you provided was not well-formed or did not validate against our published schema</Message><RequestId>A9F00CE6D919D8F4</RequestId><HostId>NMaNuC3MLTW6QTExYz3OWZaPcKT7eOaZAmW5siDxz1twtbUTQYj0ljJ37wKBJiSl2/ED0DV/9/A=</HostId></Error>'}
DEBUG: ConnMan.put(): connection put back to pool (https://s3cmd-test.domsch.com.s3.amazonaws.com#3)
ERROR: S3 error: The XML you provided was not well-formed or did not validate against our published schema

mdomsch commented 9 years ago

exceedingly low priority...

ledjon commented 9 years ago

I wouldn't say it is a zero priority. /dev/null might not be something that gets uploaded, but there is still the need to upload "zero byte" files from stdin.

mdomsch commented 9 years ago

Uploads from stdin go through our multipart upload code path, because we can't know in advance how long the file from stdin will be (unless we first read it and buffer it in memory or to a tempfile on disk, which defeats the purpose of reading from stdin).
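
A minimal illustration of why the size is unknowable up front (assuming a POSIX-like system, not s3cmd's actual code): a pipe is not seekable and reports no meaningful size, so the only way to learn its length is to consume it.

    import os
    import stat
    import sys

    # A pipe (e.g. `cat file | s3cmd put - ...`) has no length to query;
    # only a regular-file redirect can report its size without being read.
    st = os.fstat(sys.stdin.fileno())
    if stat.S_ISFIFO(st.st_mode):
        print("stdin is a pipe: its length is unknown until it is fully read")
    else:
        print("stdin is a regular file, size known up front:", st.st_size)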

The multipart upload API is designed for large files: every chunk except the last must be 5 MB or larger. We can read a 1-byte file from stdin and upload it as the first (and last) chunk and still be within the API's rules. But we can't send zero chunks and have the API succeed; completing the upload fails with MalformedXML, because no chunks are listed in the completion request.
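
This is consistent with the debug log above: the failing POST carries a 51-byte body, which is exactly the length of an empty CompleteMultipartUpload element containing no Part entries. A rough sketch of the body's shape (the element names come from the S3 multipart API; the helper function itself is hypothetical):

    # Hypothetical helper showing the shape of the CompleteMultipartUpload
    # body. With no parts the element is empty -- 51 bytes, matching the
    # content-length in the log -- and S3 rejects it as MalformedXML.
    def complete_mpu_body(etags):
        parts = "".join(
            "<Part><PartNumber>%d</PartNumber><ETag>%s</ETag></Part>" % (n, e)
            for n, e in enumerate(etags, start=1)
        )
        return "<CompleteMultipartUpload>%s</CompleteMultipartUpload>" % parts

    print(len(complete_mpu_body([])))  # 51: no <Part> entries, invalid per the schema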

We could perhaps send one zero-byte chunk, but then we would have to distinguish between an empty buffer that represents a zero-byte file and an empty buffer that really means "go open the file handle already passed and try reading it". The code goes to great pains right now to use either a non-null buffer or an actual file, but never a null buffer. I'd rather not add that bit of convolution.
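
One possible way out, sketched below (this is not s3cmd's code; put_object and start_multipart are hypothetical stand-ins for the two upload paths): read the first chunk from stdin before initiating the multipart upload, and fall back to an ordinary single PUT when that chunk turns out to be empty.

    import sys

    CHUNK_SIZE = 5 * 1024 * 1024  # S3's minimum size for every part but the last

    def upload_from_stdin(put_object, start_multipart):
        first = sys.stdin.buffer.read(CHUNK_SIZE)
        if not first:
            put_object(b"")          # zero-byte object: plain PUT, no parts needed
        else:
            start_multipart(first)   # the buffered data becomes part #1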


ledjon commented 9 years ago

Understood. I'm just trying to say that if somebody is using s3cmd in a script to copy files (or to read from stdin), there is a solid chance that zero-byte inputs will show up from time to time. I don't personally have this problem, so I agree it is low priority as well.

Marius OLAR commented 8 years ago

I have this problem, so how can I handle this kind of situation? Is there a way to tell the full length of the file before the upload starts, in order to avoid this error?

mdomsch commented 8 years ago

No. To know the length of an open stream file descriptor, you have to read the whole stream (into memory, onto disk, or some such). That's generally not what you want to do...
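
For anyone hitting this in practice, a hedged sketch of the spool-to-disk workaround described above (the bucket and key names are placeholders):

    import shutil
    import subprocess
    import sys
    import tempfile

    # Spool stdin to a temporary file so the length (possibly zero) is known,
    # then upload the file with a regular `s3cmd put` instead of `put -`.
    with tempfile.NamedTemporaryFile() as tmp:
        shutil.copyfileobj(sys.stdin.buffer, tmp)
        tmp.flush()
        subprocess.run(["s3cmd", "put", tmp.name, "s3://example-bucket/key"],
                       check=True)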