uskudnik / amazon-glacier-cmd-interface

Command line interface for Amazon Glacier
MIT License

Error while uploading large files to glacier #152

Open deodharsuruchi opened 10 years ago

deodharsuruchi commented 10 years ago

I am uploading several files to Glacier using this Python, boto-based command-line tool. Small files upload fine, but large files (around 3 GB) fail with the error below. I set the part size to 4 GB because I don't want a single archive split into several part uploads, which would increase my Glacier request costs. I use the following command to upload:


glacier-cmd upload --partsize 4096 my_vault_name my_file_name.tar


The error trace is as follows. Is there a way to avoid this issue? Please let me know. Thanks in advance!


Traceback (most recent call last):
  File "/usr/local/bin/glacier-cmd", line 10, in <module>
    load_entry_point('glacier==0.2dev', 'console_scripts', 'glacier-cmd')()
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 929, in main
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 156, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 309, in upload
    args.name, args.partsize, args.uploadid, args.resume)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 65, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 231, in glacier_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 65, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 287, in sdb_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 65, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 1156, in upload
    writer.write(part)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glaciercorecalls.py", line 129, in write
    data)
  File "/usr/local/lib/python2.7/dist-packages/boto/glacier/layer1.py", line 637, in upload_part
    response_headers=response_headers)
  File "/usr/local/lib/python2.7/dist-packages/boto/glacier/layer1.py", line 79, in make_request
    data=data)
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 966, in make_request
    retry_handler=retry_handler)
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 863, in _mexe
    request.body, request.headers)
  File "/usr/lib/python2.7/httplib.py", line 958, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 992, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 954, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 812, in _send_output
    msg += message_body
MemoryError

wvmarle commented 10 years ago

The last line, msg += message_body, holds the whole part in memory. With your part size that is the complete 3 GB of your file, and it seems there is not enough memory available for it. This happens in httplib, a standard Python library, so there is no workaround for this issue short of hacking Python itself.

Try using a smaller part size. It will not slow down your upload unless you go for really small parts. Try something like 1024 MB. The main advantage of smaller parts is that if the transmission fails, you only have to resend that part. Also, the progress bar is updated only when a part completes.
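
To make the trade-off concrete, here is a minimal sketch that estimates how many parts (and therefore upload requests) a given part size produces, and roughly how much memory the largest part needs. The function names are hypothetical and not part of glacier-cmd. The limits used are Glacier's multipart rules as I understand them: the part size is a power of two in MB between 1 MB and 4096 MB, with at most 10,000 parts per upload. The factor of 2 for memory is an assumption: the part is held once by the caller, and msg += message_body builds a second copy.

```python
MB = 1024 * 1024
MAX_PART_MB = 4096   # largest part size Glacier accepts (4 GB)
MAX_PARTS = 10000    # most parts allowed in one multipart upload


def is_valid_part_size(part_size_mb):
    """Return True if part_size_mb is a power of two between 1 and 4096."""
    return (1 <= part_size_mb <= MAX_PART_MB
            and part_size_mb & (part_size_mb - 1) == 0)


def plan_upload(file_size_bytes, part_size_mb):
    """Estimate part count and rough peak memory for one upload.

    Hypothetical helper for illustration only; glacier-cmd does its own
    bookkeeping internally.
    """
    if not is_valid_part_size(part_size_mb):
        raise ValueError("part size must be a power of two, 1..4096 MB")
    part_bytes = part_size_mb * MB
    # Ceiling division: a trailing partial part still costs one request.
    parts = max(1, -(-file_size_bytes // part_bytes))
    if parts > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part size")
    largest_part = min(file_size_bytes, part_bytes)
    return {
        "parts": parts,
        "largest_part_bytes": largest_part,
        # Assumption: one copy held by the caller plus one copy made by
        # the string concatenation in httplib.
        "approx_peak_bytes": 2 * largest_part,
    }


if __name__ == "__main__":
    three_gb = 3 * 1024 * MB
    for size_mb in (4096, 1024, 256):
        print(size_mb, plan_upload(three_gb, size_mb))
```

Under these assumptions, a 3 GB file with a 4096 MB part size is sent as a single 3 GB part that needs roughly 6 GB of memory, while 1024 MB parts cost only two extra part requests and need roughly 2 GB.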

deodharsuruchi commented 10 years ago

Thank you for your reply. I tried a part size of 1024 MB and it worked for most of my large files (greater than 2.5 GB). However, I encountered another error while uploading some files. The error trace is attached below. It reports an Input/output error. I have verified that the files exist on the filesystem, and the error persists even when only a single job is running. The file size is approximately 4 GB. I also tried a smaller part size, but that did not resolve the issue.


Traceback (most recent call last):17.66 MB/s, average 35.73 KB/s, ETA Fri, 13 Dec, 16:28:28.
  File "/usr/local/bin/glacier-cmd", line 10, in <module>
    load_entry_point('glacier==0.2dev', 'console_scripts', 'glacier-cmd')()
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 929, in main
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 156, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 309, in upload
    args.name, args.partsize, args.uploadid, args.resume)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 65, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 231, in glacier_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 65, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 287, in sdb_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 65, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 1149, in upload
    part = mmapped_file[writer.uploaded_size:writer.uploaded_size+part_size_in_bytes]
  File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 102, in __getitem__
    return self.file.read(stop - key.start)
IOError: [Errno 5] Input/output error


Thanks!
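
The second traceback ends in a plain self.file.read(...) call, so one way to narrow this down is to read the affected file from start to finish outside glacier-cmd and see whether the read itself fails. The sketch below does that. The function name and the chunk size are my own choices for illustration, and it only checks that the file can be read, not that its contents are intact.

```python
def find_read_error(path, chunk_size=1024 * 1024):
    """Read a file sequentially and report the first failing read.

    Returns (bytes_read, error): error is None if the whole file was
    readable, otherwise the exception raised at offset bytes_read.
    """
    bytes_read = 0
    with open(path, "rb") as handle:
        while True:
            try:
                chunk = handle.read(chunk_size)
            except (IOError, OSError) as exc:
                # Errno 5 (EIO) is raised by the operating system here,
                # before any upload code is involved.
                return bytes_read, exc
            if not chunk:
                return bytes_read, None
            bytes_read += len(chunk)


if __name__ == "__main__":
    import sys
    offset, error = find_read_error(sys.argv[1])
    if error is None:
        print("read %d bytes without error" % offset)
    else:
        print("read failed at offset %d: %s" % (offset, error))
```

If this fails at the same offset on every run, the cause is more likely the disk, filesystem, or network mount holding the file than the upload tool. If it reads cleanly, the problem is more likely tied to how the file is accessed during the upload.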