uskudnik / amazon-glacier-cmd-interface

Command line interface for Amazon Glacier
MIT License

Upload fails (?), no (useful) error-message #90

Open SitronNO opened 11 years ago

SitronNO commented 11 years ago

I have used this project for some days and I really like it! However, after uploading several files (between 10MB - 550MB in size) to my Amazon Glacier vault without problems, I tried to upload a file with the size of 1.3GB.

It seems to upload every byte, but as it finished I get an exception, and a not too useful error-message for a normal user:

$ glacier-cmd upload Pictures "/opt/amazonglacier/holding/2008.03.23 - Paasken paa Hoels hytte.tar" --description "2008.03.23 - Paasken paa Hoels hytte"

Traceback (most recent call last):ate 328.39 KB/s, average 340.64 KB/s, eta 11:13:57.
  File "/usr/local/bin/glacier-cmd", line 9, in <module>
    load_entry_point('glacier==0.2dev', 'console_scripts', 'glacier-cmd')()
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/glacier.py", line 751, in main
    args.func(args)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/glacier.py", line 147, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/glacier.py", line 300, in upload
    args.name, args.partsize, args.uploadid, args.resume)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 59, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 194, in glacier_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 59, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 247, in sdb_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 59, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 1050, in upload
    writer.write(part)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/glaciercorecalls.py", line 129, in write
    data)
  File "/usr/local/lib/python2.6/dist-packages/boto/glacier/layer1.py", line 625, in upload_part
    response_headers=response_headers)
  File "/usr/local/lib/python2.6/dist-packages/boto/glacier/layer1.py", line 78, in make_request
    data=data)
  File "/usr/local/lib/python2.6/dist-packages/boto/connection.py", line 910, in make_request
    return self._mexe(http_request, sender, override_num_retries)
  File "/usr/local/lib/python2.6/dist-packages/boto/connection.py", line 872, in _mexe
    raise e
socket.gaierror: [Errno -2] Name or service not known

Please tell me if there is anything I can do to test, verify, or help you in any way!

urandom commented 11 years ago

Are you sure the file has actually been fully uploaded? I almost always get that error now, sometimes when the upload hits 1013MB, and sometimes later. But the upload has never actually completed, which can be seen when you list the multiparts. Unfortunately, I can't even resume the upload process, as it complains that the "Received data does not match uploaded data; please check your uploadid and try again".

offlinehacker commented 11 years ago

This is probably because support for parallel uploads was added. Can you confirm this, @wvmarle? I think we need to make a release of the last version in which everything worked, push it to pip, and publish the documentation to readthedocs.

SitronNO commented 11 years ago

After turning on debugging, I checked the logfile:

Oct 22 17:40:53 DEBUG glacier-cmd Wrote 1016.0 MB of 1.2 GB (ESC[1m85ESC[0m%). Rate 334.69 KB/s, average 343.27 KB/s, eta 17:49:34.

That was the last line, so yes, the upload did not complete. It stopped after 1016MB.

My version is from October 20th.

wvmarle commented 11 years ago

Until just now, parallel uploads were only in my own branch (I notice the pull requests have disappeared - I will have to check the result to see that it's all OK, as I couldn't rebase myself due to conflicts).

At the moment I have absolutely no idea why the data is rejected, other than that it's truly a different file - and I assume that's not the case.

The only thing you could do that may give me a clue is to set logging to debug, clear the current logfile, start your resume job (hoping it has not yet expired on the Glacier side), and then post the entire log of the resume attempt here. I should make it even more verbose than it is now, but it may give a clue anyway.

wvmarle commented 11 years ago

For your first error message: that's basically saying it can't find the server on the other side (usually a DNS issue). It's strange you get this error; that should never happen.

SitronNO commented 11 years ago

I have now cleared the log and started an upload. It failed like it did two days ago. Then I tried to list the jobs to find an uploadid, but I was not able to find one. Then I tried to upload again, with resume.

This is the logfile for the three operations above. I have only removed some user info and a few hundred lines of the counter, and inserted the exact commands I was running ahead of the resulting debug log:

**
** Cmd: glacier-cmd upload Pictures "/opt/amazonglacier/holding/2008.03.23 - Paasken paa Hoels hytte.tar" --description "2008.03.23 - Paasken paa Hoels hytte"
**

Oct 23 22:14:41 DEBUG    glacier-cmd Validating region.
Oct 23 22:14:41 DEBUG    glacier-cmd Region is valid.
Oct 23 22:14:41 DEBUG    glacier-cmd True
Oct 23 22:14:41 DEBUG    glacier-cmd Creating GlacierWrapper instance with
    aws_access_key=A**removed**A,
    aws_secret_key=7**removed**u,
    bookkeeping='True',
    bookkeeping_domain_name=SysRq,
    region=eu-west-1,
    logfile /home/sitron/.glacier-cmd.log,
    loglevel DEBUG,
    logging to stdout False.
Oct 23 22:14:41 DEBUG    glacier-cmd Connecting to Amazon Glacier.
Oct 23 22:14:41 DEBUG    glacier-cmd Connecting to Amazon Glacier with 
   aws_access_key A**removed**A
   aws_secret_key 7**removed**u
   region eu-west-1
Oct 23 22:14:41 DEBUG    glacier-cmd Connecting to Amazon SimpleDB.
Oct 23 22:14:41 DEBUG    glacier-cmd Connecting to Amazon SimpleDB domain SysRq with
    naws_access_key A**removed**A
    naws_secret_key 7**removed**u
Oct 23 22:14:41 DEBUG    glacier-cmd Method: GET
Oct 23 22:14:41 DEBUG    glacier-cmd Path: /
Oct 23 22:14:41 DEBUG    glacier-cmd Data: 
Oct 23 22:14:41 DEBUG    glacier-cmd Headers: {}
Oct 23 22:14:41 DEBUG    glacier-cmd Host: sdb.amazonaws.com
Oct 23 22:14:41 DEBUG    glacier-cmd establishing HTTPS connection: host=sdb.amazonaws.com, kwargs={}
Oct 23 22:14:41 DEBUG    glacier-cmd Token: None
Oct 23 22:14:41 DEBUG    glacier-cmd using _calc_signature_2
Oct 23 22:14:41 DEBUG    glacier-cmd query string: AWSAccessKeyId=A**removed**A&Action=Select&SelectExpression=select%20%2A%20from%20%60SysRq%60%20limit%201&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2012-10-23T20%3A14%3A41Z&Version=2009-04-15
Oct 23 22:14:41 DEBUG    glacier-cmd string_to_sign: GET
sdb.amazonaws.com
/
AWSAccessKeyId=A**removed**A&Action=Select&SelectExpression=select%20%2A%20from%20%60SysRq%60%20limit%201&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2012-10-23T20%3A14%3A41Z&Version=2009-04-15
Oct 23 22:14:41 DEBUG    glacier-cmd len(b64)=44
Oct 23 22:14:41 DEBUG    glacier-cmd base64 encoded digest: 7M4/1JlJdLRI/PEemr8jbQIGS2O7hD+OB9qEnyDufc4=
Oct 23 22:14:41 DEBUG    glacier-cmd query_string: AWSAccessKeyId=A**removed**A&Action=Select&SelectExpression=select%20%2A%20from%20%60SysRq%60%20limit%201&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2012-10-23T20%3A14%3A41Z&Version=2009-04-15 Signature: 7M4/1JlJdLRI/PEemr8jbQIGS2O7hD+OB9qEnyDufc4=
Oct 23 22:14:42 DEBUG    glacier-cmd wrapping ssl socket; CA certificate file=/usr/local/lib/python2.6/dist-packages/boto/cacerts/cacerts.txt
Oct 23 22:14:42 DEBUG    glacier-cmd validating server certificate: hostname=sdb.amazonaws.com, certificate hosts=[u'sdb.amazonaws.com']
Oct 23 22:14:42 DEBUG    glacier-cmd <?xml version="1.0"?>
<SelectResponse xmlns="http://sdb.amazonaws.com/doc/2009-04-15/"><SelectResult><Item><Name>-6Xs_6Z8xC_3BxU17kGOAMhwv_AosIwqRuUUbP-sO3HCXOxdK--j0KRXYkd3pvZqcmQ2RmSIPDUBWfQxvwi5mxQD7qaHYVLoLJh_VZIOKfpiMpS2XqAwSpWlvX1i3zJeo1P8jD1o4g</Name><Attribute><Name>region</Name><Value>eu-west-1</Value></Attribute><Attribute><Name>hash</Name><Value>64852c3141de9ce89fad87ff61ef68f8b4b4c1d7b9c7b2ec9dbf9727318416e1</Value></Attribute><Attribute><Name>description</Name><Value>2007.05.11 - Leiligheten</Value></Attribute><Attribute><Name>archive_id</Name><Value>-6Xs_6Z8xC_3BxU17kGOAMhwv_AosIwqRuUUbP-sO3HCXOxdK--j0KRXYkd3pvZqcmQ2RmSIPDUBWfQxvwi5mxQD7qaHYVLoLJh_VZIOKfpiMpS2XqAwSpWlvX1i3zJeo1P8jD1o4g</Value></Attribute><Attribute><Name>vault</Name><Value>Pictures</Value></Attribute><Attribute><Name>date</Name><Value>2012-10-21 15:18:52+00:00</Value></Attribute><Attribute><Name>size</Name><Value>82173557</Value></Attribute></Item><NextToken>rO0ABXNyACdjb20uYW1hem9uLnNkcy5RdWVyeVByb2Nlc3Nvci5Nb3JlVG9rZW7racXLnINNqwMA
C0kAFGluaXRpYWxDb25qdW5jdEluZGV4WgAOaXNQYWdlQm91bmRhcnlKAAxsYXN0RW50aXR5SURa
AApscnFFbmFibGVkSQAPcXVlcnlDb21wbGV4aXR5SgATcXVlcnlTdHJpbmdDaGVja3N1bUkACnVu
aW9uSW5kZXhaAA11c2VRdWVyeUluZGV4TAANY29uc2lzdGVudExTTnQAEkxqYXZhL2xhbmcvU3Ry
aW5nO0wAEmxhc3RBdHRyaWJ1dGVWYWx1ZXEAfgABTAAJc29ydE9yZGVydAAvTGNvbS9hbWF6b24v
c2RzL1F1ZXJ5UHJvY2Vzc29yL1F1ZXJ5JFNvcnRPcmRlcjt4cAAAAAAAAAAAAAAAAB0AAAAAAAAA
AAAAAAAAAAAAAABwcHB4</NextToken></SelectResult><ResponseMetadata><RequestId>12129d05-2f25-b224-806e-e8e1c8ce88ca</RequestId><BoxUsage>0.0000228616</BoxUsage></ResponseMetadata></SelectResponse>
Oct 23 22:14:42 DEBUG    glacier-cmd Uploading archive.
Oct 23 22:14:42 DEBUG    glacier-cmd Checking whether vault name is valid.
Oct 23 22:14:42 DEBUG    glacier-cmd Vault name is valid.
Oct 23 22:14:42 DEBUG    glacier-cmd True
Oct 23 22:14:42 DEBUG    glacier-cmd Validating region.
Oct 23 22:14:42 DEBUG    glacier-cmd Region is valid.
Oct 23 22:14:42 DEBUG    glacier-cmd True
Oct 23 22:14:42 DEBUG    glacier-cmd Checking whether vault description is valid.
Oct 23 22:14:42 DEBUG    glacier-cmd Vault description is valid.
Oct 23 22:14:42 DEBUG    glacier-cmd True
Oct 23 22:14:42 INFO     glacier-cmd Starting upload of /opt/amazonglacier/holding/2008.03.23 - Paasken paa Hoels hytte.tar to Pictures.
Description: 2008.03.23 - Paasken paa Hoels hytte
Oct 23 22:14:46 DEBUG    glacier-cmd Wrote 1.0 MB of 1.2 GB (ESC[1m0ESC[0m%). Rate 354.03 KB/s, average 354.03 KB/s, eta 23:12:07.
Oct 23 22:14:48 DEBUG    glacier-cmd Wrote 2.0 MB of 1.2 GB (ESC[1m0ESC[0m%). Rate 381.07 KB/s, average 367.06 KB/s, eta 23:10:04.
Oct 23 22:14:51 DEBUG    glacier-cmd Wrote 3.0 MB of 1.2 GB (ESC[1m0ESC[0m%). Rate 331.67 KB/s, average 354.45 KB/s, eta 23:12:02.

**removed**

Oct 23 23:09:22 DEBUG    glacier-cmd Wrote 1014.0 MB of 1.2 GB (ESC[1m85ESC[0m%). Rate 272.47 KB/s, average 316.60 KB/s, eta 23:18:54.
Oct 23 23:09:25 DEBUG    glacier-cmd Wrote 1015.0 MB of 1.2 GB (ESC[1m85ESC[0m%). Rate 379.39 KB/s, average 316.65 KB/s, eta 23:18:53.
Oct 23 23:09:28 DEBUG    glacier-cmd Wrote 1016.0 MB of 1.2 GB (ESC[1m85ESC[0m%). Rate 296.61 KB/s, average 316.63 KB/s, eta 23:18:53.

**
** glacier-cmd listjobs Pictures
**
Oct 23 23:15:59 DEBUG    glacier-cmd Validating region.
Oct 23 23:15:59 DEBUG    glacier-cmd Region is valid.
Oct 23 23:15:59 DEBUG    glacier-cmd True
Oct 23 23:15:59 DEBUG    glacier-cmd Creating GlacierWrapper instance with
    aws_access_key=A**removed**A,
    aws_secret_key=7**removed**u,
    bookkeeping='True',
    bookkeeping_domain_name=SysRq,
    region=eu-west-1,
    logfile /home/sitron/.glacier-cmd.log,
    loglevel DEBUG,
    logging to stdout False.
Oct 23 23:15:59 DEBUG    glacier-cmd Connecting to Amazon Glacier.
Oct 23 23:15:59 DEBUG    glacier-cmd Connecting to Amazon Glacier with 
   aws_access_key A**removed**A
   aws_secret_key 7**removed**u
   region eu-west-1
Oct 23 23:15:59 DEBUG    glacier-cmd Requesting jobs list.
Oct 23 23:15:59 DEBUG    glacier-cmd Checking whether vault name is valid.
Oct 23 23:15:59 DEBUG    glacier-cmd Vault name is valid.
Oct 23 23:15:59 DEBUG    glacier-cmd True
Oct 23 23:15:59 DEBUG    glacier-cmd Method: GET
Oct 23 23:15:59 DEBUG    glacier-cmd Path: /-/vaults/Pictures/jobs
Oct 23 23:15:59 DEBUG    glacier-cmd Data: 
Oct 23 23:15:59 DEBUG    glacier-cmd Headers: {'x-amz-glacier-version': '2012-06-01'}
Oct 23 23:15:59 DEBUG    glacier-cmd Host: glacier.eu-west-1.amazonaws.com
Oct 23 23:15:59 DEBUG    glacier-cmd establishing HTTPS connection: host=glacier.eu-west-1.amazonaws.com, kwargs={}
Oct 23 23:15:59 DEBUG    glacier-cmd Token: None
Oct 23 23:15:59 DEBUG    glacier-cmd CanonicalRequest:
GET
/-/vaults/Pictures/jobs

host:glacier.eu-west-1.amazonaws.com
x-amz-date:20121023T211559Z
x-amz-glacier-version:2012-06-01

host;x-amz-date;x-amz-glacier-version
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
Oct 23 23:15:59 DEBUG    glacier-cmd StringToSign:
AWS4-HMAC-SHA256
20121023T211559Z
20121023/eu-west-1/glacier/aws4_request
f71f51ea1eea72d3bccec13d6bca7de8753c99539de722a656cfdd80c5821a36
Oct 23 23:15:59 DEBUG    glacier-cmd Signature:
b8d7c0a55d1fb5dfdd94b67da0482f9562db394eabe76804edc1993a401d04a6
Oct 23 23:15:59 DEBUG    glacier-cmd wrapping ssl socket; CA certificate file=/usr/local/lib/python2.6/dist-packages/boto/cacerts/cacerts.txt
Oct 23 23:15:59 DEBUG    glacier-cmd validating server certificate: hostname=glacier.eu-west-1.amazonaws.com, certificate hosts=[u'glacier.eu-west-1.amazonaws.com']
Oct 23 23:15:59 DEBUG    glacier-cmd Active jobs list received.
Oct 23 23:15:59 DEBUG    glacier-cmd [{u'Action': u'InventoryRetrieval',
  u'ArchiveId': None,
  u'ArchiveSizeInBytes': None,
  u'Completed': True,
  u'CompletionDate': u'2012-10-23T11:36:51.502Z',
  u'CreationDate': u'2012-10-23T07:36:39.355Z',
  u'InventorySizeInBytes': 9534,
  u'JobDescription': None,
  u'JobId': u'vcpbDgS18mTNoy4WCd1p3nBY_miOZJbCvUJC8-pgwnyIiO9AI5FTnNgH5FLJeKWSaCumskHGKrp7OnG16XM66Cc2DNQ5',
  u'SHA256TreeHash': None,
  u'SNSTopic': None,
  u'StatusCode': u'Succeeded',
  u'StatusMessage': u'Succeeded',
  u'VaultARN': u'arn:aws:glacier:eu-west-1:337797876811:vaults/Pictures'}]
Oct 23 23:15:59 DEBUG    glacier-cmd Connection to Amazon Glacier successful.
Oct 23 23:15:59 DEBUG    glacier-cmd [{u'Action': u'InventoryRetrieval',
  u'ArchiveId': None,
  u'ArchiveSizeInBytes': None,
  u'Completed': True,
  u'CompletionDate': u'2012-10-23T11:36:51.502Z',
  u'CreationDate': u'2012-10-23T07:36:39.355Z',
  u'InventorySizeInBytes': 9534,
  u'JobDescription': None,
  u'JobId': u'vcpbDgS18mTNoy4WCd1p3nBY_miOZJbCvUJC8-pgwnyIiO9AI5FTnNgH5FLJeKWSaCumskHGKrp7OnG16XM66Cc2DNQ5',
  u'SHA256TreeHash': None,
  u'SNSTopic': None,
  u'StatusCode': u'Succeeded',
  u'StatusMessage': u'Succeeded',
  u'VaultARN': u'arn:aws:glacier:eu-west-1:337797876811:vaults/Pictures'}]

***
*** glacier-cmd upload Pictures "/opt/amazonglacier/holding/2008.03.23 - Paasken paa Hoels hytte.tar" --description "2008.03.23 - Paasken paa Hoels hytte" --resume
***
Oct 23 23:23:40 DEBUG    glacier-cmd Validating region.
Oct 23 23:23:40 DEBUG    glacier-cmd Region is valid.
Oct 23 23:23:40 DEBUG    glacier-cmd True
Oct 23 23:23:40 DEBUG    glacier-cmd Creating GlacierWrapper instance with
    aws_access_key=A**removed**A,
    aws_secret_key=7**removed**u,
    bookkeeping='True',
    bookkeeping_domain_name=SysRq,
    region=eu-west-1,
    logfile /home/sitron/.glacier-cmd.log,
    loglevel DEBUG,
    logging to stdout False.
Oct 23 23:23:40 DEBUG    glacier-cmd Connecting to Amazon Glacier.
Oct 23 23:23:40 DEBUG    glacier-cmd Connecting to Amazon Glacier with 
   aws_access_key A**removed**A
   aws_secret_key 7**removed**u
   region eu-west-1
Oct 23 23:23:40 DEBUG    glacier-cmd Connecting to Amazon SimpleDB.
Oct 23 23:23:40 DEBUG    glacier-cmd Connecting to Amazon SimpleDB domain SysRq with
    naws_access_key A**removed**A
    naws_secret_key 7**removed**u
Oct 23 23:23:40 DEBUG    glacier-cmd Method: GET
Oct 23 23:23:40 DEBUG    glacier-cmd Path: /
Oct 23 23:23:40 DEBUG    glacier-cmd Data: 
Oct 23 23:23:40 DEBUG    glacier-cmd Headers: {}
Oct 23 23:23:40 DEBUG    glacier-cmd Host: sdb.amazonaws.com
Oct 23 23:23:40 DEBUG    glacier-cmd establishing HTTPS connection: host=sdb.amazonaws.com, kwargs={}
Oct 23 23:23:40 DEBUG    glacier-cmd Token: None
Oct 23 23:23:40 DEBUG    glacier-cmd using _calc_signature_2
Oct 23 23:23:40 DEBUG    glacier-cmd query string: AWSAccessKeyId=A**removed**A&Action=Select&SelectExpression=select%20%2A%20from%20%60SysRq%60%20limit%201&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2012-10-23T21%3A23%3A40Z&Version=2009-04-15
Oct 23 23:23:40 DEBUG    glacier-cmd string_to_sign: GET
sdb.amazonaws.com
/
AWSAccessKeyId=A**removed**A&Action=Select&SelectExpression=select%20%2A%20from%20%60SysRq%60%20limit%201&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2012-10-23T21%3A23%3A40Z&Version=2009-04-15
Oct 23 23:23:40 DEBUG    glacier-cmd len(b64)=44
Oct 23 23:23:40 DEBUG    glacier-cmd base64 encoded digest: 1EVaJt+rY1Emw+twhJ9kYyrSgqXZgQnF4qrSK8WvccE=
Oct 23 23:23:40 DEBUG    glacier-cmd query_string: AWSAccessKeyId=A**removed**A&Action=Select&SelectExpression=select%20%2A%20from%20%60SysRq%60%20limit%201&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2012-10-23T21%3A23%3A40Z&Version=2009-04-15 Signature: 1EVaJt+rY1Emw+twhJ9kYyrSgqXZgQnF4qrSK8WvccE=
Oct 23 23:23:41 DEBUG    glacier-cmd wrapping ssl socket; CA certificate file=/usr/local/lib/python2.6/dist-packages/boto/cacerts/cacerts.txt
Oct 23 23:23:41 DEBUG    glacier-cmd validating server certificate: hostname=sdb.amazonaws.com, certificate hosts=[u'sdb.amazonaws.com']
Oct 23 23:23:41 DEBUG    glacier-cmd <?xml version="1.0"?>
<SelectResponse xmlns="http://sdb.amazonaws.com/doc/2009-04-15/"><SelectResult><Item><Name>-6Xs_6Z8xC_3BxU17kGOAMhwv_AosIwqRuUUbP-sO3HCXOxdK--j0KRXYkd3pvZqcmQ2RmSIPDUBWfQxvwi5mxQD7qaHYVLoLJh_VZIOKfpiMpS2XqAwSpWlvX1i3zJeo1P8jD1o4g</Name><Attribute><Name>region</Name><Value>eu-west-1</Value></Attribute><Attribute><Name>hash</Name><Value>64852c3141de9ce89fad87ff61ef68f8b4b4c1d7b9c7b2ec9dbf9727318416e1</Value></Attribute><Attribute><Name>description</Name><Value>2007.05.11 - Leiligheten</Value></Attribute><Attribute><Name>archive_id</Name><Value>-6Xs_6Z8xC_3BxU17kGOAMhwv_AosIwqRuUUbP-sO3HCXOxdK--j0KRXYkd3pvZqcmQ2RmSIPDUBWfQxvwi5mxQD7qaHYVLoLJh_VZIOKfpiMpS2XqAwSpWlvX1i3zJeo1P8jD1o4g</Value></Attribute><Attribute><Name>vault</Name><Value>Pictures</Value></Attribute><Attribute><Name>date</Name><Value>2012-10-21 15:18:52+00:00</Value></Attribute><Attribute><Name>size</Name><Value>82173557</Value></Attribute></Item><NextToken>rO0ABXNyACdjb20uYW1hem9uLnNkcy5RdWVyeVByb2Nlc3Nvci5Nb3JlVG9rZW7racXLnINNqwMA
C0kAFGluaXRpYWxDb25qdW5jdEluZGV4WgAOaXNQYWdlQm91bmRhcnlKAAxsYXN0RW50aXR5SURa
AApscnFFbmFibGVkSQAPcXVlcnlDb21wbGV4aXR5SgATcXVlcnlTdHJpbmdDaGVja3N1bUkACnVu
aW9uSW5kZXhaAA11c2VRdWVyeUluZGV4TAANY29uc2lzdGVudExTTnQAEkxqYXZhL2xhbmcvU3Ry
aW5nO0wAEmxhc3RBdHRyaWJ1dGVWYWx1ZXEAfgABTAAJc29ydE9yZGVydAAvTGNvbS9hbWF6b24v
c2RzL1F1ZXJ5UHJvY2Vzc29yL1F1ZXJ5JFNvcnRPcmRlcjt4cAAAAAAAAAAAAAAAAB0AAAAAAAAA
AAAAAAAAAAAAAABwcHB4</NextToken></SelectResult><ResponseMetadata><RequestId>313a6da5-225c-2662-ff2f-655b80b28b0b</RequestId><BoxUsage>0.0000228616</BoxUsage></ResponseMetadata></SelectResponse>
Oct 23 23:23:41 DEBUG    glacier-cmd Uploading archive.
Oct 23 23:23:41 DEBUG    glacier-cmd Checking whether vault name is valid.
Oct 23 23:23:41 DEBUG    glacier-cmd Vault name is valid.
Oct 23 23:23:41 DEBUG    glacier-cmd True
Oct 23 23:23:41 DEBUG    glacier-cmd Validating region.
Oct 23 23:23:41 DEBUG    glacier-cmd Region is valid.
Oct 23 23:23:41 DEBUG    glacier-cmd True
Oct 23 23:23:41 DEBUG    glacier-cmd Checking whether vault description is valid.
Oct 23 23:23:41 DEBUG    glacier-cmd Vault description is valid.
Oct 23 23:23:41 DEBUG    glacier-cmd True
Oct 23 23:23:41 INFO     glacier-cmd Attempting resumption of upload of /opt/amazonglacier/holding/2008.03.23 - Paasken paa Hoels hytte.tar to Pictures.
Oct 23 23:23:44 DEBUG    glacier-cmd Wrote 1.0 MB of 1.2 GB (ESC[1m0ESC[0m%). Rate 388.72 KB/s, average 388.72 KB/s, eta 00:15:58.
Oct 23 23:23:47 DEBUG    glacier-cmd Wrote 2.0 MB of 1.2 GB (ESC[1m0ESC[0m%). Rate 329.28 KB/s, average 356.54 KB/s, eta 00:20:41.
Oct 23 23:23:50 DEBUG    glacier-cmd Wrote 3.0 MB of 1.2 GB (ESC[1m0ESC[0m%). Rate 328.51 KB/s, average 346.68 KB/s, eta 00:22:18.
Oct 23 23:23:53 DEBUG    glacier-cmd Wrote 4.0 MB of 1.2 GB (ESC[1m0ESC[0m%). Rate 309.39 KB/s, average 336.54 KB/s, eta 00:24:04.
Oct 23 23:23:57 DEBUG    glacier-cmd Wrote 5.0 MB of 1.2 GB (ESC[1m0ESC[0m%). Rate 328.10 KB/s, average 334.82 KB/s, eta 00:24:23.

jp-morvan commented 11 years ago

Hi, I have the same problem, and only with files larger than 1GB. I tried reducing them to below this limit and it worked.

PS: thank you for this powerful tool!

urandom commented 11 years ago

I've successfully managed to upload files over 5GB without being cut off by increasing the part size to 64MB.

jp-morvan commented 11 years ago

Good trick! It works for me with an archive over 3.5 GB!

sibblegp commented 11 years ago

I am also getting this error with a 2GB file.

sibblegp commented 11 years ago

The partsize trick worked for me. Also sped up uploading by 6x.

wvmarle commented 11 years ago

Thanks for all the input.

This gaierror is something low-level, and I really do not understand why it's raised. The first 1,000-something times the name resolves just fine, and then suddenly it fails. A quick google search on this error makes me suspect it is related to having a proxy server. Is that possible?

It also seems to be related to a limit on the number of parts sent: from the logs (and related comments) I gather that the crash always happens at part 1016, and that increasing the part size (i.e. decreasing the number of parts) solves the problem.

So the quick fix is to increase the part size (the upload speed issue is seriously improved in my latest code, but larger parts will always give you better performance, simply because you cut down on the time needed to start up a new connection).

Tonight I'll try to reproduce this error myself, see if I can find out anything. I think we should also decrease the maximum number of parts to 1,000 (currently at 10,000 - Glacier's limit) to make it work.
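The part-size workaround can be sketched as follows. This is a minimal illustration, not the code in glacier-cmd: `choose_part_size` is a hypothetical helper, and it relies only on Glacier's documented multipart rules (part sizes are 1 MiB times a power of two, up to 4 GiB, with at most 10,000 parts per upload).

```python
MIB = 1024 * 1024
MIN_PART = 1 * MIB       # smallest part size Glacier accepts
MAX_PART = 4096 * MIB    # largest part size Glacier accepts (4 GiB)


def choose_part_size(archive_size, max_parts=1000):
    """Return the smallest power-of-two part size (in bytes) that keeps
    the upload at or below max_parts parts.

    Hypothetical helper for illustration; not part of glacier-cmd or boto.
    """
    part = MIN_PART
    while part * max_parts < archive_size:
        part *= 2
        if part > MAX_PART:
            raise ValueError("archive too large for %d parts" % max_parts)
    return part


if __name__ == "__main__":
    # A 1150 MiB file no longer fits in 1,000 parts of 1 MiB each,
    # so the part size is doubled.
    print(choose_part_size(1150 * MIB) // MIB, "MiB per part")
```

With a cap of 1,000 parts, a file of about 1.2 GB is sent in 2 MiB parts and therefore stays well below the part count at which the failures in this thread appear.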

To find the upload id for resumption, you have to use $ glacier-cmd --listmultiparts. Glacier jobs are only for inventory retrieval or for archive retrieval.

wvmarle commented 11 years ago

Just doing some testing and the first thing I noticed is that there is a bug in the upload resumption code. For some reason the hash check fails; and I'm not going to fix it in that branch.

Check out my parallel_uploads branch for the latest code - upload resumption works correctly there. Whatever the bug is (probably caused by merging and resolving conflicts the wrong way), it's fixed there. The --resume switch is also implemented, making resumption easier.

https://github.com/wvmarle/amazon-glacier-cmd-interface/tree/parallel_uploads

Currently I have a test upload running; after lunch I will see what happened to it. It's an 8.2 GB file using 1 MB parts, so it should trigger this 1 GB issue.

wvmarle commented 11 years ago

Testing reveals an issue with resumption where there are more than 50 parts to check. That wasn't tested well indeed.

And I'm again having problems with time-outs on uploads; I must start catching and retrying those, too.
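Catching and retrying could look roughly like the sketch below. It is an illustration under assumptions: `call_with_retries` is a hypothetical helper that is not part of glacier-cmd or boto, and the exception types, attempt count, and delays are example choices.

```python
import socket
import time


def call_with_retries(func, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a DNS failure or socket time-out, wait and retry.

    Hypothetical helper for illustration. The delay doubles after each
    failed attempt (1s, 2s, 4s, ...). The last failure is re-raised.
    """
    for attempt in range(attempts):
        try:
            return func()
        except (socket.gaierror, socket.timeout):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
```

A part upload would then be wrapped as `call_with_retries(lambda: upload_one_part(part))`, where `upload_one_part` stands in for whatever function sends a single part. Retrying only helps with transient failures; if the process is in a state where every further request fails, each retry fails in the same way.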

SitronNO commented 11 years ago

wvmarle wrote:

A quick google search on this error makes me suspect it is related to having a proxy server. Is that possible?

No proxy server here, so no...

[...] I gather that the crash always happens at part 1016 [...]

Yes, always at 1016.

To find the upload id for resumption, you have to use $ glacier-cmd --listmultiparts. Glacier jobs are only for inventory retrieval or for archive retrieval.

Ok, this gave me the UploadId, and I tried:

$ glacier-cmd upload Pictures "/opt/amazonglacier/holding/2008.03.23 - Paasken paa Hoels hytte.tar" --description "2008.03.23 - Paasken paa Hoels hytte" --uploadid 35PPeJB8mVmB8V9lYqMGLCERoZe4LWL6sszwz9iunxXMFiusp_g8xNLiZg0KCJAfo1CTOgvl07Ie7bW19g9irSbpZ4NB
start: 0, current position: 0
str: Received data does not match uploaded data; please check your uploadid and try again.
Caused by: 1 exception
||  str: SHA256 hash mismatch.
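
The "SHA256 hash mismatch" above comes from comparing hashes of the local file against what was already uploaded. Glacier identifies parts and archives by a SHA-256 tree hash: the data is hashed in 1 MiB chunks, and the digests are then combined pairwise until one remains. The sketch below shows that computation for an in-memory bytes object; it is written for Python 3 and is not the code glacier-cmd uses.

```python
import hashlib

MIB = 1024 * 1024


def tree_hash(data):
    """Return the SHA-256 tree hash of a bytes object as a hex string."""
    # Leaf level: one SHA-256 digest per 1 MiB chunk. Empty input still
    # produces a single leaf (the hash of the empty string).
    leaves = [hashlib.sha256(data[i:i + MIB]).digest()
              for i in range(0, len(data), MIB)]
    level = leaves or [hashlib.sha256(b"").digest()]

    # Combine digests pairwise until a single root digest remains.
    while len(level) > 1:
        paired = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                paired.append(
                    hashlib.sha256(level[i] + level[i + 1]).digest())
            else:
                paired.append(level[i])  # odd digest is carried up unchanged
        level = paired
    return level[0].hex()
```

For data of 1 MiB or less the tree hash equals the plain SHA-256 of the data, so a mismatch reported at "current position: 0" means that the very first chunk checked did not hash to the expected value.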

wvmarle commented 11 years ago

Just uploaded 1.2 GB at 1 MB part size without issues.

urandom commented 11 years ago

Most of the failures I get are after I upload 1013 parts. But some even go as high as 2000. A minor few make it all the way. Still, no proxy is involved.

In your tests, you might want to try with different archives, all of which are bigger tha [...]

I'm glad you're able to reproduce the hash problem with resuming. In your other branch, what does --resume do exactly? I didn't seem to need it at all when resuming worked in master.

wvmarle commented 11 years ago

That 1016 is awfully close to 1024 - it makes me suspect some kind of overrun. Besides the 1016 parts there has been at least an http request for a Glacier connection, one for a SimpleDB connection, and one to initiate the multipart upload.
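One way to test the overrun suspicion, offered purely as a hypothesis that nobody in this thread confirms: many Linux systems default to a per-process soft limit of 1024 open file descriptors, and a process that leaks one socket per request would approach that limit after roughly a thousand parts. The sketch below reads the limit and counts the descriptors currently open; the helper names are made up for this example, and the `resource` module is available on Unix-like systems only.

```python
import os
import resource


def open_fd_count(max_fd=4096):
    """Count file descriptors currently open in this process.

    Probes descriptors 0..max_fd-1 with os.fstat(); descriptors that are
    not open raise OSError and are skipped.
    """
    count = 0
    for fd in range(max_fd):
        try:
            os.fstat(fd)
        except OSError:
            continue
        count += 1
    return count


def fd_headroom():
    """Return (soft_limit, descriptors_in_use) for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, open_fd_count()


if __name__ == "__main__":
    soft, used = fd_headroom()
    print("soft limit: %s, open descriptors: %d" % (soft, used))
```

If the count logged after each uploaded part grows by one every time, the uploader is leaking connections; if it stays flat, this hypothesis can be ruled out.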

--resume is an enhancement of --uploadid <uploadid>. The difference is that it will attempt to get the uploadid from the bookkeeping database by matching on file name and size. The uploadid is stored there as soon as an upload is initiated, and this entry is replaced by the archiveid when the upload is finished. For the rest it's identical.
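The lookup described above can be pictured with the following sketch. It assumes the bookkeeping records are available as a list of dictionaries; the field names (`filename`, `size`, `upload_id`, `archive_id`) and the function `find_upload_id` are invented for this illustration and do not reflect the actual SimpleDB schema used by glacier-cmd.

```python
import os


def find_upload_id(records, path, size):
    """Return the upload id of an unfinished upload that matches the
    file's name and size, or None if there is no such record.

    Hypothetical helper; the record layout is an assumption.
    """
    name = os.path.basename(path)
    for rec in records:
        unfinished = rec.get("upload_id") and not rec.get("archive_id")
        if rec["filename"] == name and rec["size"] == size and unfinished:
            return rec["upload_id"]
    return None


if __name__ == "__main__":
    records = [
        {"filename": "a.tar", "size": 10, "archive_id": "ARCHIVE-1"},
        {"filename": "b.tar", "size": 20, "upload_id": "UPLOAD-1"},
    ]
    print(find_upload_id(records, "/tmp/b.tar", 20))
```

Matching on name and size alone cannot tell apart two different files that share both, which is why the hash check on the already-uploaded parts still matters when resuming.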

Indeed I will also start testing with bigger files, but that notwithstanding, I wonder what really causes this issue. And how to work around it. It's easy enough to retry it but I doubt it's going to make much of a difference.

wvmarle commented 11 years ago

Resumption >50 parts fixed.

SitronNO commented 11 years ago

I tried the code from https://github.com/wvmarle/amazon-glacier-cmd-interface/tree/parallel_uploads, but it was no different. I am trying with a new file now, and if that fails I can make it publicly available so you can test and possibly reproduce the error.

wvmarle commented 11 years ago

The resumption part should work :-)

The socket.gaierror is not from glacier-cmd or even from boto (the library handling the actual calls to Glacier), it's deeper down. That's why it's so hard to figure out what's going on, and what causes this error.

SitronNO commented 11 years ago

I have now tried with different files (all around 1.1 - 1.2GB in size), and I have also tried from a different client. The same happens, after 1016MB it stops.

This is the error-message:

$ glacier-cmd upload Pictures ./amazon_glacier_testfile.data --description "Random data"
Traceback (most recent call last):ate 1.06 MB/s, average 1.02 MB/s, eta 15:04:30.
  File "/usr/local/bin/glacier-cmd", line 9, in <module>
    load_entry_point('glacier==0.2dev', 'console_scripts', 'glacier-cmd')()
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 751, in main
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 147, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 300, in upload
    args.name, args.partsize, args.uploadid, args.resume)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 59, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 194, in glacier_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 59, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 247, in sdb_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 59, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 1050, in upload
    writer.write(part)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glaciercorecalls.py", line 129, in write
    data)
  File "/usr/local/lib/python2.7/dist-packages/boto/glacier/layer1.py", line 625, in upload_part
    response_headers=response_headers)
  File "/usr/local/lib/python2.7/dist-packages/boto/glacier/layer1.py", line 78, in make_request
    data=data)
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 910, in make_request
    return self._mexe(http_request, sender, override_num_retries)
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 872, in _mexe
    raise e
socket.gaierror: [Errno -2] Name or service not known
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/apport_python_hook.py", line 66, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python2.7/dist-packages/apport/__init__.py", line 1, in <module>
    from apport.report import Report
ImportError: No module named report

Original exception was:
Traceback (most recent call last):
  File "/usr/local/bin/glacier-cmd", line 9, in <module>
    load_entry_point('glacier==0.2dev', 'console_scripts', 'glacier-cmd')()
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 751, in main
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 147, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 300, in upload
    args.name, args.partsize, args.uploadid, args.resume)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 59, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 194, in glacier_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 59, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 247, in sdb_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 59, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 1050, in upload
    writer.write(part)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glaciercorecalls.py", line 129, in write
    data)
  File "/usr/local/lib/python2.7/dist-packages/boto/glacier/layer1.py", line 625, in upload_part
    response_headers=response_headers)
  File "/usr/local/lib/python2.7/dist-packages/boto/glacier/layer1.py", line 78, in make_request
    data=data)
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 910, in make_request
    return self._mexe(http_request, sender, override_num_retries)
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 872, in _mexe
    raise e
socket.gaierror: [Errno -2] Name or service not known

I am using wvmarle's parallel_uploads-code, and the file was generated like this: $ dd if=/dev/urandom of=amazon_glacier_testfile.data bs=1M count=1150

wvmarle commented 11 years ago

@SitronNO: By "different client", do you mean another Glacier upload client? If that's glacier from the boto package, then you are basically using the same upload stack: we also use boto for the actual HTTP calls, so the same error message can be expected.

With all these reports I've made a quick fix to my code: I reduced the maximum number of parts from 10,000 to 1,000. The main effect is that the maximum archive size that can be handled is about 3.7 TB (down from 37 TB), and that for many uploads the part size is increased.
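To make the effect of that change concrete, here is a minimal sketch of how a part size could be derived from the archive size and a part limit. It assumes the part size is the smallest power-of-two number of MiB that keeps the upload within the limit (Glacier requires power-of-two part sizes between 1 MB and 4 GB); the function names are hypothetical and this is not necessarily how glacier-cmd computes it.

```python
MIB = 1024 * 1024
MAX_PART_SIZE = 4096 * MIB  # Glacier's upper bound of 4 GB per part


def choose_part_size(archive_size, max_parts):
    """Smallest power-of-two part size (bytes) that fits within max_parts parts.

    Hypothetical helper for illustration only.
    """
    part_size = MIB
    while part_size * max_parts < archive_size:
        part_size *= 2
        if part_size > MAX_PART_SIZE:
            raise ValueError("archive too large for this part limit")
    return part_size


def part_count(archive_size, part_size):
    """Number of parts needed, using ceiling division."""
    return -(-archive_size // part_size)


# The 1150 MiB test file from this thread (dd bs=1M count=1150).
test_file = 1150 * MIB
for limit in (10000, 1000):
    size = choose_part_size(test_file, limit)
    print(limit, size // MIB, part_count(test_file, size))
```

Under these assumptions the test file needs 1150 parts of 1 MiB with a limit of 10,000, but only 575 parts of 2 MiB with a limit of 1,000, which would keep it below the point where the upload was failing.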

SitronNO commented 11 years ago

@wvmarle: No, I am always using the code from https://github.com/wvmarle/amazon-glacier-cmd-interface/tree/parallel_uploads - "different client" meant a different host/machine on a different network, just to make sure it wasn't an issue with the host.

If you generated the file the same way I did, you did not get an error? Are you also using eu-west-1?

wvmarle commented 11 years ago

@SitronNO: Using encrypted archives to test with - looks like totally random gibberish :-) And using us-east-1 (the default) for host. Error doesn't seem data related; it's a

I may have been lucky that time. The signature errors are my greatest issue now: seemingly at random, my credentials are sometimes rejected in the middle of an upload.

This is really going to be a tough one to track down; looking at the source of boto may help; as yet still no clue on what causes this issue.

Comments in boto/connection.py talk about a "pool of connections", and it may be that this pool runs out. Because the failure happens at part 1016, I'm still looking for a limit of 1024 on something. A complication is that I must switch off debug logging for boto, as it logs the complete body of every request (all 1 MB of it, or whatever your part size is!), which makes debugging even harder.
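One candidate for "a 1024 limit on something" is the per-process limit on open file descriptors, whose soft value is 1024 by default on many Linux systems. That this limit is involved here is only an assumption, not something established in this thread. A quick way to check the value on a Unix host:

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
print("soft limit:", soft, "hard limit:", hard)
```

If the soft limit is 1024, a failure around part 1016 would be consistent with sockets being leaked one per part, since a few descriptors are already in use at startup.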

wvmarle commented 11 years ago

Just had a look at the boto sources, and found out that there is a rather large try/except block so I can't see where exactly this error comes from. It is from somewhere in that block - but it's masked. It is originally raised somewhere between line 805 and 851, caught and handled in 852, then re-raised in 872 and that one is what we see. I'm curious about the traceback before that point, to find out where the error really originates.
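As an illustration of the problem, the sketch below records the full traceback at the point where the exception is caught, before it is stored and re-raised later. The functions `flaky_connect` and `request_with_retries` are hypothetical stand-ins, not boto code; the retry loop only loosely mirrors the structure described above.

```python
import socket
import traceback


def flaky_connect():
    # Hypothetical stand-in for the low-level call that actually fails.
    raise socket.gaierror(-2, "Name or service not known")


def request_with_retries(log, num_retries=3):
    """Retry a failing call, saving each original traceback in `log`."""
    last_error = None
    for _ in range(num_retries):
        try:
            return flaky_connect()
        except socket.gaierror as exc:
            # Capture the traceback here, while it still points at the
            # frame that raised, instead of only at the later re-raise.
            log.append(traceback.format_exc())
            last_error = exc
    raise last_error


log = []
try:
    request_with_retries(log)
except socket.gaierror:
    print(log[0])
```

Each entry in `log` shows the call chain down to `flaky_connect`, which is the information that a late re-raise hides under Python 2.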

So I'm going to edit that one module from boto (disabling the try/except and disabling logging of the body of the request so I can switch on debug logging for boto), but as it seems I can't reproduce the error reliably myself I need someone's help. Basically install the amended module, and trigger the error, that should pinpoint more exactly where it is raised.

Code in question is here: https://github.com/boto/boto/blob/develop/boto/connection.py

SitronNO commented 11 years ago

@wvmarle: Is there a word missing from your sentence?

Error doesn't seem data related; it's a

Anyway, I tried to upload to us-east-1 just to make sure, and I got the same error at 1016. It just hangs for a while, then I get the error-message.

I can try to debug later today, see if I can narrow it down.

SitronNO commented 11 years ago

Here is the file that always fails: https://www.wetransfer.com/dl/PoT43aTQ/137545c73bad101dce2556b88825b7304fdc7ae5a0a423db44e328b33dacd5a424b2c6665f395d3 md5sum: 021897f13a662d6982189374a45748ca

wvmarle commented 11 years ago

Just uploaded a new branch testing.

This includes the file connection.py, which has to be installed in /usr/local/lib/python2.7/dist-packages/boto-2.6.0-py2.7.egg/boto (do remember to back up the original connection.py from boto, and restore it after testing).

When error is triggered, this should give the stack trace down to where the error really originates.

In the meantime another bug has appeared in this code preventing the proper finish of an upload :-( Gonna squash that one later, first have to get the kiddo to bed. And it doesn't affect this 1016-part-error as it's about closing the upload session.

wvmarle commented 11 years ago

@SitronNO:

That should be something like "Error doesn't seem data related; it's a connection related error, seemingly a DNS lookup."

SitronNO commented 11 years ago

@wvmarle: Just to make sure I use the correct code, since I am not that used to git and github, this is what I have to do:

$ git clone git://github.com/wvmarle/amazon-glacier-cmd-interface.git amazon-glacier-cmd-interface_wvmarle
$ cd amazon-glacier-cmd-interface_wvmarle/
$ git branch -a
$ git checkout -b parallel_uploads remotes/origin/parallel_uploads
$ git pull
$ sudo python setup.py install

This will install glacier-cmd based on your parallel_uploads-code? Or am I doing something wrong?

SitronNO commented 11 years ago

@wvmarle: I have now done some testing, and this is what I did, what worked and what did not work:

$ git clone -b testing git://github.com/wvmarle/amazon-glacier-cmd-interface.git amazon-glacier-cmd-interface_testing
$ cd amazon-glacier-cmd-interface_testing/
$ sudo mv /usr/local/lib/python2.7/dist-packages/boto/connection.py /root/
$ sudo mv connection.py /usr/local/lib/python2.7/dist-packages/boto/
$ sudo python setup.py install

If I understand correctly, I have used glacier-cmd based on your testing-branch, and used your version of connection.py in boto. Correct?

Anyway, that worked! Did not get an error and the file uploaded without a problem.

So I went back to the original code, reinstalled (but did not touch the modified connection.py), and that gave me the old error:

$ glacier-cmd upload Pictures Privat/amazon/amazon_glacier_testfile.data --description "Random data 2"
Traceback (most recent call last):ate 1.00 MB/s, average 940.03 KB/s, eta 13:41:19.                                                                                                                                       
  File "/usr/local/bin/glacier-cmd", line 9, in <module>
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 751, in main
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 147, in wrapper
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 300, in upload
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 62, in wrapper
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 204, in glacier_connect_wrap
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 62, in wrapper
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 257, in sdb_connect_wrap
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 62, in wrapper
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 1086, in upload
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glaciercorecalls.py", line 129, in write
  File "/usr/local/lib/python2.7/dist-packages/boto/glacier/layer1.py", line 625, in upload_part
  File "/usr/local/lib/python2.7/dist-packages/boto/glacier/layer1.py", line 78, in make_request
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 910, in make_request
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 813, in _mexe
  File "/usr/lib/python2.7/httplib.py", line 958, in request
  File "/usr/lib/python2.7/httplib.py", line 992, in _send_request
  File "/usr/lib/python2.7/httplib.py", line 954, in endheaders
  File "/usr/lib/python2.7/httplib.py", line 814, in _send_output
  File "/usr/lib/python2.7/httplib.py", line 776, in send
  File "/usr/local/lib/python2.7/dist-packages/boto/https_connection.py", line 109, in connect
  File "/usr/lib/python2.7/socket.py", line 224, in meth
socket.gaierror: [Errno -2] Name or service not known
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/apport_python_hook.py", line 66, in apport_excepthook
ImportError: No module named fileutils

Original exception was:
Traceback (most recent call last):
  File "/usr/local/bin/glacier-cmd", line 9, in <module>
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 751, in main
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 147, in wrapper
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 300, in upload
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 62, in wrapper
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 204, in glacier_connect_wrap
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 62, in wrapper
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 257, in sdb_connect_wrap
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 62, in wrapper
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 1086, in upload
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glaciercorecalls.py", line 129, in write
  File "/usr/local/lib/python2.7/dist-packages/boto/glacier/layer1.py", line 625, in upload_part
  File "/usr/local/lib/python2.7/dist-packages/boto/glacier/layer1.py", line 78, in make_request
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 910, in make_request
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 813, in _mexe
  File "/usr/lib/python2.7/httplib.py", line 958, in request
  File "/usr/lib/python2.7/httplib.py", line 992, in _send_request
  File "/usr/lib/python2.7/httplib.py", line 954, in endheaders
  File "/usr/lib/python2.7/httplib.py", line 814, in _send_output
  File "/usr/lib/python2.7/httplib.py", line 776, in send
  File "/usr/local/lib/python2.7/dist-packages/boto/https_connection.py", line 109, in connect
  File "/usr/lib/python2.7/socket.py", line 224, in meth
socket.gaierror: [Errno -2] Name or service not known

SitronNO commented 11 years ago

@wvmarle: After understanding how Git works, I have now tried the parallel_uploads branch too, and yes, that also works.

So I am very sorry! I have, for the last few days, believed and said I used the code from https://github.com/wvmarle/amazon-glacier-cmd-interface/tree/parallel_uploads (which works) when I was really using git://github.com/wvmarle/amazon-glacier-cmd-interface.git (which does not work)

offlinehacker commented 11 years ago

Woops... Nice to know that one is solved :) So do we need the development branch of boto or not, so I can update setup.py?

On Fri, Oct 26, 2012 at 2:20 PM, Vidar Hoel notifications@github.com wrote:

@wvmarle: After understanding how Git works, I have now tried the parallel_uploads branch too, and yes, that also works.

So I am very sorry! I have, for the last few days, believed and said I used the code from https://github.com/wvmarle/amazon-glacier-cmd-interface/tree/parallel_uploads (which works) when I was really using git://github.com/wvmarle/amazon-glacier-cmd-interface.git (which does not work)


wvmarle commented 11 years ago

@SitronNO: Very interesting. And I have a strong suspicion: the response.read() calls.

Basically: we call boto to do the upload of the part, and boto returns a response object for that. As we don't do anything with this response, a few revisions ago I basically cleaned it up. So response = boto.upload_part(*args) followed by a result = response.read() became simply boto.upload_part(*args).

A bit later I noticed a problem with uploading: very long pauses between parts. Digging through old revisions, I finally found that removing the read() call was what slowed down the code, so I reinstated it. Now it read()s the response object again, and everything is faster. It still does nothing with the result (which is simply discarded), but it seems that the read() call cleans up the object and releases the connection to the pool (reading the boto code, they have something like a pool of connections).

This would explain the crash, and why it happens just before 1024.

So it should also work in the parallel_uploads branch, albeit in that branch I changed max_parts to 1,000 to stay within the critical limit.
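The suspected mechanism can be sketched with a toy model. Everything below is an illustrative assumption: `ToyPool` and `ToyResponse` are hypothetical classes, not boto's, and the limit of 1024 is simply the number discussed in this thread.

```python
class ToyPool:
    """Toy stand-in for a connection pool with a hard limit (not boto's code)."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0

    def acquire(self):
        if self.in_use >= self.limit:
            raise RuntimeError("no free connections")
        self.in_use += 1

    def release(self):
        self.in_use -= 1


class ToyResponse:
    """Toy response that hands its connection back only when read."""

    def __init__(self, pool):
        self._pool = pool
        self._released = False

    def read(self):
        if not self._released:
            self._pool.release()
            self._released = True
        return b""


def upload_part(pool):
    pool.acquire()
    return ToyResponse(pool)


def upload_all(num_parts, read_responses, limit=1024):
    """Return how many parts were uploaded before the pool ran dry."""
    pool = ToyPool(limit)
    for part in range(num_parts):
        try:
            response = upload_part(pool)
        except RuntimeError:
            return part
        if read_responses:
            response.read()  # result is discarded; the call frees the connection
    return num_parts


print(upload_all(1150, read_responses=True))
print(upload_all(1150, read_responses=False))
```

In this model the upload of 1150 parts completes when every response is read, and stops once the limit is reached when the responses are left unread.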

wvmarle commented 11 years ago

@offlinehacker: I've been using the boto 2.6.0 release for a while, and it works fine. That's the first release that includes the Glacier code. I've also edited the README to reflect this: you have to use boto 2.6.0 or later. Better to use a release version than a development version.

SitronNO commented 11 years ago

@offlinehacker: It works with standard (= not modified version of) boto.

SitronNO commented 11 years ago

@wvmarle:

So it should also work in the parallel_uploads branch, albeit in that branch I changed max_parts to 1,000 to stay within the critical limit.

Yes, I tried the parallel_uploads branch, and it does work when I modified glacier/GlacierWrapper.py to:

MAX_PARTS = 10000
#MAX_PARTS = 1000