khagler opened 12 years ago
Sorry you're having problems, and I appreciate the report. I've also seen an upload failure, which I assumed was some kind of network problem, so I didn't save the traceback. Can you tell me how this reproduces: every time or intermittently (and if so, in roughly what proportion), and is there any pattern to how much data is uploaded before it fails?
My own single failure prompted me to write automatic upload resume support. This is working and I expect to push it shortly. But despite that, I'd like uploads to work the first time!
It happens every time. I don't know how much data is being uploaded, but it runs for a pretty long time before failing. I'm almost certain that what's happening here is that it starts the upload with the default 4 MB part size, and then 40,000 MB worth of uploading later it tries to upload the next part and Amazon rejects it because the 10,000 part limit has been reached. Exactly how long that takes varies depending on what else I'm doing with my connection at the time (and how many of my neighbors are bittorrenting their favorite TV shows ;-), which accounts for the variable but long time to failure.
I've written a fix that checks the size of the archive to be uploaded and determines the smallest part size that will work if 4 MB is too small. I created a 50 GB dummy file, and found that it did indeed fail to upload as expected without the fix. I'm trying it now with the fix, and it's still running. I'll update when it eventually either finishes or fails.
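For the record, the check is roughly this shape (a sketch of the idea rather than my actual patch; the constants come from Amazon's documented Glacier limits: power-of-two part sizes from 1 MiB to 4 GiB, and at most 10,000 parts per upload):

```python
# Sketch of picking the smallest valid Glacier part size for an archive.
import math

MiB = 1024 * 1024
MAX_PARTS = 10000
DEFAULT_PART_SIZE = 4 * MiB
MAX_PART_SIZE = 4096 * MiB  # 4 GiB, the largest part Glacier allows

def minimum_part_size(archive_size, default=DEFAULT_PART_SIZE):
    """Return the smallest allowed part size that fits archive_size into 10,000 parts."""
    if archive_size <= default * MAX_PARTS:
        return default
    # Round the required per-part size up to the next power-of-two multiple of 1 MiB.
    required = math.ceil(archive_size / float(MAX_PARTS))
    part_size = MiB
    while part_size < required:
        part_size *= 2
    if part_size > MAX_PART_SIZE:
        raise ValueError("archive too large for a single Glacier multipart upload")
    return part_size

print(minimum_part_size(50 * 1024 ** 3) // MiB)  # a 50 GB dummy file needs 8 MiB parts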
Same here.
301 MB - OK
2.6 GB - OK
37 GB - failed
Based on the code, it looks like the problem is that the particular boto.glacier method I'm using doesn't let me pick a part size and chooses 4 MiB arbitrarily. So a suitable fix would be to automatically determine a suitable part size as khagler described, but I think this would need to go into boto rather than glacier-cli.
khagler: is this what you're working on, or shall I?
Yes, basically. I had modified your archive_upload so that it modified vault.DefaultPartSize, but I agree that this really ought to be done in boto, so I'll see about moving it there.
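In rough outline, the interim workaround looks like this at the call site (just a sketch: the region, vault, and file names are placeholders, and the boto 2.x calls shown, connect_to_region, get_vault, and create_archive_from_file, may differ in other versions):

```python
# Sketch: force a larger part size by overriding the vault's default
# before handing the file to boto.
import boto.glacier

MiB = 1024 * 1024

layer2 = boto.glacier.connect_to_region('us-east-1')  # placeholder region
vault = layer2.get_vault('my-backups')                 # placeholder vault name

archive_size = 50 * 1024 ** 3                          # e.g. a 50 GB archive
if archive_size > vault.DefaultPartSize * 10000:
    part_size = MiB
    while part_size * 10000 < archive_size:
        part_size *= 2
    vault.DefaultPartSize = part_size                  # what the interim patch changes

archive_id = vault.create_archive_from_file('/path/to/archive')
```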
I've run a few tests, and while my fix does take care of the original problem, it exposes a new one: It seems to be pretty common for individual part uploads to fail (I've been seeing about a 1% failure rate in FastGlacier, which tells you when it happens). Unfortunately, boto doesn't seem to have a way to note this and retry failed parts.
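Until boto grows that, the behaviour I'd want is roughly this; upload_one_part below is a stand-in for whatever actually sends a single part, not a real boto hook:

```python
# Sketch of the per-part retry behaviour I'd like boto to have.
import time

def upload_with_retries(upload_one_part, part, max_attempts=5, base_delay=1.0):
    """Call upload_one_part(part), retrying with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return upload_one_part(part)
        except Exception as exc:  # e.g. socket errors or 5xx responses
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print("part upload failed (%s), retrying in %ss" % (exc, delay))
            time.sleep(delay)
```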
I'm also getting this. Tried with 4 GB splits and it failed. Tried with 1023 MB splits and it also failed. I can't realistically go smaller. Any hope of a fix for this? The same 1023 MB split uploaded without trouble in "Simple Amazon Glacier Uploader", but I prefer command-line tools.
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
Full error text:
Traceback (most recent call last):
File "/Users/jonathan/glacier/glacier-cli/glacier", line 694, in
I'm running into this issue as well with large files (> 1GB), same error:
Traceback (most recent call last):
File "./glacier.py", line 730, in
This is an issue within boto, not in glacier-cli directly. Could anyone still affected please post the version of boto you're using, and try the latest?
Was seeing the issue with boto 2.5.2 but I just updated to 2.9.6 and the issue persists.
I too am having this problem.
Attempts to upload a very large (288.64 GB) file run for about an hour or two, then fail with the following output:
OS: Mac OS X 10.7.5 Server
Python 2.7.1
I tried uploading the same file using FastGlacier with the part size set to 1 GB. It would upload some of each part before failing with a message about the remote host dropping the connection. After setting the part size to 256 MB, it was able to upload individual parts successfully.
Addendum:
After a bit more investigation, I think I've figured out what might be going on. According to the Amazon documentation, the maximum number of parts in a multipart upload is 10,000. For this (very large) archive to be split evenly into 10,000 parts, each part would have to be about 27.5 MB, or, given the limits on allowable part sizes, 32 MB. It looks like you're using a default part size of 8 MB (which I didn't realize at the time I could change). If I'm right about that, then an 80 GB file would be a (marginally less painful) valid test.
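To show my working (a quick sketch; I'm treating the reported 288.64 GB as decimal gigabytes, and the 8 MB default is my guess from above):

```python
# The arithmetic behind the numbers above: OS-reported sizes are decimal GB,
# part sizes are binary megabytes, matching Glacier's power-of-two rule.
MB = 1024 ** 2
GB = 1000 ** 3

archive = 288.64 * GB
print(archive / 10000 / MB)   # ~27.5: smallest even 10,000-way split, so 32 MB parts are needed
print(archive / (8 * MB))     # ~34,408 parts at an 8 MB default, well past the 10,000 limit
```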