wyllys66 / s3multi

S3 Multipart Upload Middleware for Openstack Swift proxy
Apache License 2.0

Error happened during "CompleteMultipartUpload" #2

Closed: mispro closed this issue 10 years ago

mispro commented 10 years ago

Hi,

After I installed "s3multi" with "swift" in OpenStack, starting a multipart upload works, and all data segments are uploaded to the "_segments" container. However, when I call "completeMultipartUpload", it fails with a strange "NoSuchBucket" error in the response XML message, and /var/log/messages contains a message like:

Jan 15 10:19:13 localhost proxy-server ERROR WSGI: code 400, message Bad request syntax ('<?xml version="1.0" encoding="UTF-8" standalone="no"?>1157832f52f26c4d8f533d617fba200302afa936a26cab1526de176445368b2639DELETE /CloudBacko/CloudTest2%252FLargerThan5M%252Fapache-tomcat-6.0.36-windows-x86.zip.7674a0.13c83f3367e.cgz?uploadId=7457a81c11a6d2c3b080698754e3e140 HTTP/1.1') (txn: tx103c97d3662f4776acafe-0052d5f021) (client_ip: 192.168.8.174)

(I just chose a file larger than 5M and split it into 2 segments for my test.)

Any idea?

Thanks a lot!

MP

mispro commented 10 years ago

Hi,

After I revised the HTTP client in my client software to disable "expect continue", there are no more "ERROR WSGI" messages from proxy-server. However, the response XML is still "NoSuchBucket". Would you mind having a look to see whether there is any problem with my "CompleteMultipartUpload" request message?

Thanks, MP

(P.S. the container name is "CloudBacko")

POST /CloudBacko/CloudTest2%252FLargerThan5M%252Fapache-tomcat-6.0.36-windows-x86.zip.7674a0.13c83f3367e.cgz?uploadId=068e6edddab94baa079571857bcde3a3 HTTP/1.1
Date: Thu, 16 Jan 2014 04:01:51 GMT
Content-Type: text/plain
Authorization: AWS ae2495cd4fb64606bf7fa2f534ed1f6b:r+B+ZRhUaGbY9x2vK0P9D3byYAU=
Content-Length: 321
Host: os1.dev.test:8080
Connection: Keep-Alive
User-Agent: JetS3t/0.9.0 (Windows XP/5.1; x86; zh; JVM 1.6.0_34)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>18a7e0cd7ea362625ad8ee3d7dab6bd1f2ff8f81fd0da8a8f3acb6ec0129b4fb23

wyllys66 commented 10 years ago

It's hard to tell. Is the body of the POST empty?

mispro commented 10 years ago

No, the POST is not empty. It has an XML message as its HTTP body, with the "CompleteMultipartUpload" tag.
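For reference, a body of that kind can be generated as in the minimal Python sketch below. The helper name build_complete_body is made up for illustration and is not part of s3multi or JetS3t; the namespace and the two part ETags are the ones from this test.

```python
import xml.etree.ElementTree as ET

# Namespace used by S3 for the CompleteMultipartUpload request document.
S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"


def build_complete_body(parts):
    """Build a CompleteMultipartUpload XML body (hypothetical helper).

    parts is an iterable of (part_number, etag) pairs, in upload order.
    """
    root = ET.Element("CompleteMultipartUpload", {"xmlns": S3_NS})
    for number, etag in parts:
        part = ET.SubElement(root, "Part")
        ET.SubElement(part, "PartNumber").text = str(number)
        ET.SubElement(part, "ETag").text = etag
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


body = build_complete_body([
    (1, "8a7e0cd7ea362625ad8ee3d7dab6bd1f"),
    (2, "ff8f81fd0da8a8f3acb6ec0129b4fb23"),
])
```

The XML declaration is left out here; a client would normally prepend it and send the result as the POST body together with the uploadId query parameter.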

I am sorry that I didn't use "syntax highlighting" for the XML message in my last post. The message should be:

<?xml version="1.0" encoding="UTF-8" standalone="no"?><CompleteMultipartUpload xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Part><PartNumber>1</PartNumber><ETag>8a7e0cd7ea362625ad8ee3d7dab6bd1f</ETag></Part><Part><PartNumber>2</PartNumber><ETag>ff8f81fd0da8a8f3acb6ec0129b4fb23</ETag></Part></CompleteMultipartUpload>

wyllys66 commented 10 years ago

I think perhaps the module is having trouble parsing the container/object pathname - what is the path to the bucket and object you are trying to use?

mispro commented 10 years ago

The bucket name I have been using is "CloudBacko" (as in the POST message, i.e. POST /CloudBacko/CloudTest2 ...), and my test server is "os1.dev.test" (as in the "Host" header, i.e. Host: os1.dev.test:8080).

S3 and OpenStack differ a little here: in S3 the bucket name can be part of the host name and be used to determine the location, so in my case the request per the S3 spec would be

POST /CloudTest2%252F ... ... Host: CloudBacko.os1.dev.test:8080 ...

However, for OpenStack, all buckets are usually in the same location (an internal cloud server), so I simply send it as

POST /CloudBacko/CloudTest2%252F ... ... Host: os1.dev.test:8080 ...

so that there are fewer restrictions on bucket names.
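The two addressing forms can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the key is written with plain "/" separators rather than the "%252F" form shown in the requests above.

```python
from urllib.parse import quote


def path_style(endpoint, bucket, key):
    """Bucket in the request path; Host stays the plain endpoint."""
    return endpoint, "/" + bucket + "/" + quote(key, safe="/")


def virtual_hosted_style(endpoint, bucket, key):
    """Bucket as a prefix of the Host header; path holds only the key."""
    return bucket + "." + endpoint, "/" + quote(key, safe="/")


host_a, path_a = path_style("os1.dev.test:8080", "CloudBacko", "CloudTest2/file.cgz")
host_b, path_b = virtual_hosted_style("os1.dev.test:8080", "CloudBacko", "CloudTest2/file.cgz")
```

With the virtual-hosted form the bucket name has to be usable as part of a DNS name, which is the naming restriction the path form avoids.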

Would this affect the result of "s3multi" with OpenStack/Swift? In my testing environment, uploading objects without "S3 multipart upload" works. Since OpenStack/Swift does not provide a "multipart upload" API that is the same as S3's, I would like to try "s3multi" and see whether it works.

I hope you can give me some hints about the problem.

Thanks in advance! MP

wyllys66 commented 10 years ago

s3multi is intended to bring the S3 multi-upload support to your swift stack in conjunction with the already existing "swift3" middleware. I have tested it and verified that it does work. It seems you have uncovered a bug, though.

Why are you using the "%252F" in your POST pathname instead of just a "/" separator?

mispro commented 10 years ago

Oh yes, this might be the problem I overlooked: my application works with S3, and I didn't look carefully at the "POST" message itself.

After further checking, I found the reason why "%252F" appeared. It comes from the response message of "StartMultiupload".

Below is the XML part from S3:

<?xml version="1.0" encoding="UTF-8"?><InitiateMultipartUploadResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>cloud-test-bucket</Bucket><Key>CloudTest2/LargerThan5M/apache-tomcat-6.0.36-windows-x86.zip.7674a0.13c83f3367e.cgz</Key><UploadId>71939da3-52ea-4268-8f0e-5015ea2f5d8e</UploadId></InitiateMultipartUploadResult>

while, below is the XML part from OpenStack + S3multi:

<?xml version="1.0" encoding="UTF-8"?>
<InitiateMultipartUploadResult xmlns="http://doc.s3.amazonaws.com/2006-03-01/">
<Bucket>CloudBacko</Bucket>
<Key>CloudTest2%2FLargerThan5M%2Fapache-tomcat-6.0.36-windows-x86.zip.7674a0.13c83f3367e.cgz</Key>
<UploadId>caa91e4f476a832f59e3d444120e17b5</UploadId>
</InitiateMultipartUploadResult>

The difference is in the "Key" tag. The response from S3 has decoded the "%2F" from the preceding "POST" request, while the one from OpenStack has not. As I am using a Java client library to make the request, I think the library takes the value of the "Key" tag and uses it in the later "CompleteMultipartUpload" request. The request URL is thus encoded again, which turns "%2F" into "%252F".

Can this be fixed in "S3multi"? It would be perfectly compatible with S3 if this could be done.
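The double encoding can be reproduced in a few lines of Python. This is only a sketch of the effect using the standard library's percent-encoding functions; it is not the code path of the Java client library.

```python
from urllib.parse import quote, unquote

key = "CloudTest2/LargerThan5M/apache-tomcat-6.0.36-windows-x86.zip.7674a0.13c83f3367e.cgz"

# First encoding: "/" becomes "%2F" (the form seen in the OpenStack response).
encoded_once = quote(key, safe="")

# Encoding the already-encoded value again: "%" becomes "%25",
# so "%2F" turns into "%252F" (the form seen in the POST URL).
encoded_twice = quote(encoded_once, safe="")

# Decoding the response value first restores the original key.
decoded = unquote(encoded_once)
```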

Thanks for your help!

wyllys66 commented 10 years ago

Question - in your original request HTTP headers issued by the Java client library, was the "encoding-type" header included?

The s3multi module uses XML-Encoding for the values in the "Key" and "Bucket" fields so that there is no ambiguity in the XML parsing. Either the client library or the client application itself should XML-decode the value before using it. That would result in the proper URL Path for the POST at the end.

wyllys66 commented 10 years ago

I merged several changes this morning, including a fix for the multipart upload completion. I'm not sure whether it will fix your issue, since yours seems a little different and has to do with the XML encoding of the object name.

Try pulling the latest from the master and report back here with the results.

mispro commented 10 years ago

For the object name (key) in the URL, UTF-8 encoding is usually used, and the Java client library I am using does the same.

It is definitely OK to encode the XML message with XML encoding. However, the problem is that the values from the response message are trusted and reused in the later "POST" call, where they are UTF-8 encoded once more.

Sure, I will report my test results once they are ready. However, my testing environment is busy with other tests, so I cannot upgrade to the latest "s3multi" today. Sorry to keep you waiting.

Again, thank you very much for your quick response.

wyllys66 commented 10 years ago

The problem I see is that the object "key" value in the response that the client receives is XML-encoded, but the client is not XML-decoding the value before putting it into the POST. There it gets URL-encoded, which adds the extra escape sequences to the URL. The client (or client library) must XML-decode the values before using them again in the POST.
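A client-side workaround along those lines might look like the Python sketch below. It assumes the encoding in question is the percent-encoding visible in the "Key" value above ("%2F"); the function names are hypothetical, and the response document is the one quoted earlier in this thread.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote

# Initiate response as returned by OpenStack + s3multi (from this thread).
RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<InitiateMultipartUploadResult xmlns="http://doc.s3.amazonaws.com/2006-03-01/">
<Bucket>CloudBacko</Bucket>
<Key>CloudTest2%2FLargerThan5M%2Fapache-tomcat-6.0.36-windows-x86.zip.7674a0.13c83f3367e.cgz</Key>
<UploadId>caa91e4f476a832f59e3d444120e17b5</UploadId>
</InitiateMultipartUploadResult>"""


def local_fields(xml_bytes):
    """Map child tag names (namespace stripped) to their text content."""
    root = ET.fromstring(xml_bytes)
    return {child.tag.rsplit("}", 1)[-1]: (child.text or "") for child in root}


def completion_path(xml_bytes):
    """Build the request path for the completing POST (hypothetical helper)."""
    fields = local_fields(xml_bytes)
    # Assumption: the server percent-encoded the key, so decode it exactly once.
    key = unquote(fields["Key"])
    # Encode once for the URL, keeping "/" as the path separator.
    path = "/" + quote(fields["Bucket"], safe="") + "/" + quote(key, safe="/")
    return path + "?uploadId=" + quote(fields["UploadId"], safe="")


path = completion_path(RESPONSE)
```

Decoding is only safe when the server is known to have encoded the value. A response that already carries the decoded key, as the S3 response quoted above does, should not be decoded again, because a key containing a literal "%" would be damaged.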