jubos / fake-s3

A lightweight server clone of Amazon S3 that simulates most of the commands supported by S3 with minimal dependencies

Combine step fails in multipart upload (from Java) #96

Open juanotto opened 9 years ago

juanotto commented 9 years ago

I'm getting errors when doing multipart uploads from Java using the low-level API.

To rule out a bug in our own code, I made a small project based on Amazon's low-level API example (as described in http://docs.aws.amazon.com/AmazonS3/latest/dev/llJavaUploadFile.html) and only added an option that lets me set the endpoint right after the client is created:

AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
s3Client.setEndpoint("http://fakes3vm:4567");
[...the rest is just the same]

With this code I can upload several files to S3 without problems, but the same code fails when uploading to fakes3.

I declared fakes3vm in /etc/hosts, along with subdomains for the buckets (I tried several things and I'm not sure that last one is actually needed). I'm running fakes3 in an Ubuntu 14 VM in VirtualBox, installed with gem (version 0.2.1).

The requests reach the server, but the last one, which should trigger combining the parts, fails. Here is the output from fakes3:

Juans-MBP.biomatters.com - - [26/Mar/2015:16:40:32 NZDT] "POST /someKeyName?uploads HTTP/1.1" 200 251
- -> /someKeyName?uploads
Juans-MBP.biomatters.com - - [26/Mar/2015:16:40:32 NZDT] "PUT /someKeyName?uploadId=fa0a9a679fd15d411cc62b869791f03f&partNumber=1 HTTP/1.1" 200 0
- -> /someKeyName?uploadId=fa0a9a679fd15d411cc62b869791f03f&partNumber=1

[...parts 2-8 without problem]

Juans-MBP.biomatters.com - - [26/Mar/2015:16:40:36 NZDT] "PUT /someKeyName?uploadId=fa0a9a679fd15d411cc62b869791f03f&partNumber=9 HTTP/1.1" 200 0
- -> /someKeyName?uploadId=fa0a9a679fd15d411cc62b869791f03f&partNumber=9
[2015-03-26 16:40:36] ERROR NoMethodError: undefined method `Error' for #<FakeS3::FileStore:0x00000001bc7e50>
    /var/lib/gems/1.9.1/gems/fakes3-0.2.1/lib/fakes3/file_store.rb:231:in `block in combine_object_parts'
    /var/lib/gems/1.9.1/gems/fakes3-0.2.1/lib/fakes3/file_store.rb:224:in `each'
    /var/lib/gems/1.9.1/gems/fakes3-0.2.1/lib/fakes3/file_store.rb:224:in `combine_object_parts'
    /var/lib/gems/1.9.1/gems/fakes3-0.2.1/lib/fakes3/server.rb:250:in `do_POST'
    /usr/lib/ruby/1.9.1/webrick/httpservlet/abstract.rb:106:in `service'
    /usr/lib/ruby/1.9.1/webrick/httpserver.rb:138:in `service'
    /usr/lib/ruby/1.9.1/webrick/httpserver.rb:94:in `run'
    /usr/lib/ruby/1.9.1/webrick/server.rb:191:in `block in start_thread'

[...that error is repeated 4 times and then]

Juans-MBP.biomatters.com - - [26/Mar/2015:16:40:38 NZDT] "POST /someKeyName?uploadId=fa0a9a679fd15d411cc62b869791f03f HTTP/1.1" 500 375
- -> /someKeyName?uploadId=fa0a9a679fd15d411cc62b869791f03f
[2015-03-26 16:40:39] WARN  Could not determine content-length of response body. Set content-length of the response or set Response#chunked = true
Juans-MBP.biomatters.com - - [26/Mar/2015:16:40:39 NZDT] "DELETE /someKeyName?uploadId=fa0a9a679fd15d411cc62b869791f03f HTTP/1.1" 204 0
- -> /someKeyName?uploadId=fa0a9a679fd15d411cc62b869791f03f

I don't get much information on the Java side with that code, but the exception is caught and the client tries to abort the upload. The abort seems to fail too, since I still see all the parts in the bucket when I list it with s3cmd:

Juans-MBP:~ juanottonello$ s3cmd ls s3://mybucket
2015-03-26 03:40  15728640   s3://mybucket/fa0a9a679fd15d411cc62b869791f03f_someKeyName_part1
2015-03-26 03:40  15728640   s3://mybucket/fa0a9a679fd15d411cc62b869791f03f_someKeyName_part2
2015-03-26 03:40  15728640   s3://mybucket/fa0a9a679fd15d411cc62b869791f03f_someKeyName_part3
2015-03-26 03:40  15728640   s3://mybucket/fa0a9a679fd15d411cc62b869791f03f_someKeyName_part4
2015-03-26 03:40  15728640   s3://mybucket/fa0a9a679fd15d411cc62b869791f03f_someKeyName_part5
2015-03-26 03:40  15728640   s3://mybucket/fa0a9a679fd15d411cc62b869791f03f_someKeyName_part6
2015-03-26 03:40  15728640   s3://mybucket/fa0a9a679fd15d411cc62b869791f03f_someKeyName_part7
2015-03-26 03:40  15728640   s3://mybucket/fa0a9a679fd15d411cc62b869791f03f_someKeyName_part8
2015-03-26 03:40   9736092   s3://mybucket/fa0a9a679fd15d411cc62b869791f03f_someKeyName_part9

The same fakes3 install seems to work well with s3cmd: when I upload a "big file" with it, it appears to use multipart upload:

Juans-MBP:~ juanottonello$ s3cmd put ~/Downloads/Boot2Docker-1.4.1.pkg s3://mybucket
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
/Users/juanottonello/Downloads/Boot2Docker-1.4.1.pkg -> s3://mybucket/Boot2Docker-1.4.1.pkg  [part 1 of 9, 15MB]
 15728640 of 15728640   100% in    1s    12.55 MB/s  done
/Users/juanottonello/Downloads/Boot2Docker-1.4.1.pkg -> s3://mybucket/Boot2Docker-1.4.1.pkg  [part 2 of 9, 15MB]
 15728640 of 15728640   100% in    0s    19.23 MB/s  done
/Users/juanottonello/Downloads/Boot2Docker-1.4.1.pkg -> s3://mybucket/Boot2Docker-1.4.1.pkg  [part 3 of 9, 15MB]
 15728640 of 15728640   100% in    0s    20.87 MB/s  done
/Users/juanottonello/Downloads/Boot2Docker-1.4.1.pkg -> s3://mybucket/Boot2Docker-1.4.1.pkg  [part 4 of 9, 15MB]
 15728640 of 15728640   100% in    0s    22.57 MB/s  done
/Users/juanottonello/Downloads/Boot2Docker-1.4.1.pkg -> s3://mybucket/Boot2Docker-1.4.1.pkg  [part 5 of 9, 15MB]
 15728640 of 15728640   100% in    0s    18.86 MB/s  done
/Users/juanottonello/Downloads/Boot2Docker-1.4.1.pkg -> s3://mybucket/Boot2Docker-1.4.1.pkg  [part 6 of 9, 15MB]
 15728640 of 15728640   100% in    0s    20.49 MB/s  done
/Users/juanottonello/Downloads/Boot2Docker-1.4.1.pkg -> s3://mybucket/Boot2Docker-1.4.1.pkg  [part 7 of 9, 15MB]
 15728640 of 15728640   100% in    0s    24.61 MB/s  done
/Users/juanottonello/Downloads/Boot2Docker-1.4.1.pkg -> s3://mybucket/Boot2Docker-1.4.1.pkg  [part 8 of 9, 15MB]
 15728640 of 15728640   100% in    0s    17.99 MB/s  done
/Users/juanottonello/Downloads/Boot2Docker-1.4.1.pkg -> s3://mybucket/Boot2Docker-1.4.1.pkg  [part 9 of 9, 9MB]
 9736092 of 9736092   100% in    0s    18.04 MB/s  done

I tried changing the part size to 15MB in the Java code, since s3cmd uses that size successfully, but without any luck.

Any ideas on this? I can share my Java project if that makes things easier, but it is just a minimal web interface for the AWS example.
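
For reference, the part sizes in the two listings above follow directly from the file size and a 15 MiB part size. The sketch below is only an illustration: `part_sizes` is a hypothetical helper, not part of fake-s3 or either client, and the total is simply the sum of the sizes shown in the s3cmd listing.

```ruby
# Hypothetical helper: split a byte count into consecutive part sizes,
# with every part full except possibly the last one.
def part_sizes(total_bytes, part_size)
  full, rest = total_bytes.divmod(part_size)
  sizes = Array.new(full, part_size)
  sizes << rest if rest > 0
  sizes
end

part_size = 15 * 1024 * 1024            # 15728640 bytes, as in the listing
total     = 8 * part_size + 9_736_092   # sum of the nine listed part sizes

sizes = part_sizes(total, part_size)
puts sizes.length   # 9
puts sizes.last     # 9736092
```

Both clients end up with the same nine-part layout, which is consistent with the part size not being what makes the combine step fail.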

juanotto commented 9 years ago

By the way, the line that raises the error is https://github.com/jubos/fake-s3/blob/master/lib/fakes3/file_store.rb#L231

It seems the hex digest check fails for some of the parts (or all of them): part[:etag] == etag is false.
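
The NoMethodError in the log is what Ruby reports when code calls `Error "..."` as if it were a method. A minimal sketch of that behaviour follows, assuming the failing line has roughly this shape; `DemoStore` and its methods are hypothetical stand-ins, not the fake-s3 source.

```ruby
# Hypothetical stand-in for a store class; not the fake-s3 source.
class DemoStore
  def check_broken(expected, actual)
    # `Error "..."` is parsed as a call to a method named Error on self.
    # No such method exists, so Ruby raises NoMethodError instead of the
    # intended "invalid chunk" error.
    Error "invalid file chunk" unless expected == actual
    true
  end

  def check_fixed(expected, actual)
    # Raising a real exception class reports the intended failure.
    raise ArgumentError, "invalid file chunk" unless expected == actual
    true
  end
end

# Returns the class of the exception raised by the block, or nil if none.
def error_class_for
  yield
  nil
rescue StandardError => e
  e.class
end

store = DemoStore.new
puts error_class_for { store.check_broken("abc", "\"abc\"") }  # NoMethodError
puts error_class_for { store.check_fixed("abc", "\"abc\"") }   # ArgumentError
```

Either way, the error only surfaces when the comparison fails, so the mismatch between the stored and submitted ETags is the underlying problem.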

juanotto commented 9 years ago

The Java client does not put quotation marks around the ETags in the complete multipart upload request. Other clients do, so the quotes have to be treated as optional and trimmed away.

I have a pull request that solves it and still works with s3cmd, which does use them. The change does not break existing compatibility: for clients that send the quotation marks, stripping them with the regex or trimming them later gives the same result, and clients that do not send them are now supported as well.

The pull request is https://github.com/jubos/fake-s3/pull/98
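
A minimal sketch of treating the quotes as optional might look like the following. `normalize_etag` is a hypothetical helper written for illustration and is not necessarily how the pull request implements the fix.

```ruby
require 'digest'

# Hypothetical helper (not from fake-s3): strip one optional pair of
# surrounding double quotes from an ETag value.
def normalize_etag(raw)
  raw.strip.sub(/\A"(.*)"\z/m, '\1')
end

chunk    = "example part body"
expected = Digest::MD5.hexdigest(chunk)   # digest computed from the stored part

quoted   = "\"#{expected}\""   # style used by clients that send quotes
unquoted = expected            # style described above for the Java client

puts quoted == expected                      # false: a direct comparison rejects the quoted form
puts normalize_etag(quoted) == expected      # true
puts normalize_etag(unquoted) == expected    # true
```

Normalizing the client-supplied value before the comparison lets both styles pass the same check.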