uskudnik / amazon-glacier-cmd-interface

Command line interface for Amazon Glacier
MIT License

Range retrievals #112

Open wvmarle opened 11 years ago

wvmarle commented 11 years ago

Just now in my mailbox: range retrievals. As soon as upload is finished, it's time to tackle downloads, and that should definitely include this!

Dear Amazon Glacier Customer,

We are pleased to announce support for Range Retrievals. You can now retrieve specific ranges of your Amazon Glacier Archives, providing you even more flexibility to manage and lower your costs.

Amazon Glacier is designed for storing data that is infrequently accessed, and as such, you can retrieve up to 5% of your data, pro-rated daily, for free each month. Range Retrievals can be used to increase your savings on AWS Glacier. In the rare cases where you retrieve more than 5% of your data, you can use Range Retrievals to help you stay within your 5% free retrieval band. For example, you can retrieve a large archive as a series of smaller range retrievals, spread out over longer periods of time, which can reduce or eliminate retrieval fees. Similarly, if you combine multiple files that you upload as a single Amazon Glacier Archive, and you only need to retrieve a selection of those files, you can now retrieve only the range that contains the files you need, which again can reduce or eliminate your retrieval fees.

The introduction of Range Retrievals makes Amazon Glacier – already an extremely low-cost storage service – even more cost-effective for data archiving and backup.

To retrieve a specific range of an archive, you simply provide a byte range when initiating the retrieval job to retrieve your data from Amazon Glacier. Once the job completes, the specified range is then available for you to download. (Read more about jobs in the Amazon Glacier Developer Guide).

We added support for this feature in response to requests from our customers. If there are other features you would like to see us build, please visit our forum and let us know. To learn more about Range Retrievals, visit the Amazon Glacier Developer Guide.
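A minimal sketch of what "provide a byte range when initiating the retrieval job" could look like as a request body. The field names (`Type`, `ArchiveId`, `RetrievalByteRange`) and the 1 MiB alignment rule are taken from my reading of the Glacier Initiate Job documentation and should be checked against the Developer Guide. The helper `retrieval_job_parameters` and the archive ID are hypothetical, and nothing here talks to AWS.

```python
import json

MIB = 1024 * 1024  # assumed alignment unit for range retrievals (1 MiB)


def retrieval_job_parameters(archive_id, start, end, archive_size):
    """Build a job description for bytes start..end (inclusive) of one archive.

    Hypothetical helper: it only validates the range and returns a dict.
    """
    if not 0 <= start <= end < archive_size:
        raise ValueError("range must lie inside the archive")
    if start % MIB != 0:
        raise ValueError("start must be a multiple of 1 MiB")
    # The end must close a 1 MiB block unless it is the last byte of the archive.
    if (end + 1) % MIB != 0 and end != archive_size - 1:
        raise ValueError("end + 1 must be a multiple of 1 MiB, or end must be the last byte")
    return {
        "Type": "archive-retrieval",
        "ArchiveId": archive_id,
        "RetrievalByteRange": "%d-%d" % (start, end),
    }


# First 4 MiB of a 10 MiB archive; the archive ID is a placeholder.
params = retrieval_job_parameters("EXAMPLE-ARCHIVE-ID", 0, 4 * MIB - 1, 10 * MIB)
print(json.dumps(params, sort_keys=True))
```

Validating the range locally means a misaligned request fails before any job is started, which matters when each job takes hours to complete.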

offlinehacker commented 11 years ago

Does that also mean we can do multithreaded downloads?

uskudnik commented 11 years ago

Hehe, just saw the email and wanted to create an issue :D

Wait, aren't downloads multithreaded by definition? One process per download and off you go?

offlinehacker commented 11 years ago

Looks like I've missed something, but can you give me an example where range retrievals would be really helpful?

wvmarle commented 11 years ago

No idea where it could be helpful, but the idea is cool :-)

Multi-threaded downloads should be possible by making multiple partial retrievals on the same archive, then downloading the separate parts in parallel and reassembling them locally.
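A rough sketch of the reassembly idea, assuming each partial retrieval job has already completed. `fetch_part` is a hypothetical callback standing in for "download the output of the job covering bytes start..end"; here it is faked with an in-memory byte string, so this only illustrates the parallel fetch and local reassembly, not the Glacier calls themselves.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def download_parallel(ranges, fetch_part, out_path, workers=4):
    """Fetch each inclusive (start, end) range and write it at its offset in out_path."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the order of `ranges`, even if fetches finish out of order.
        parts = pool.map(lambda r: fetch_part(r[0], r[1]), ranges)
        with open(out_path, "wb") as out:
            for (start, end), data in zip(ranges, parts):
                if len(data) != end - start + 1:
                    raise IOError("short read for range %d-%d" % (start, end))
                out.seek(start)
                out.write(data)


# Stand-in for a remote archive (assumption: real parts would come from finished jobs).
archive = os.urandom(10000)


def fake_fetch(start, end):
    return archive[start:end + 1]


ranges = [(0, 3999), (4000, 7999), (8000, 9999)]
path = os.path.join(tempfile.mkdtemp(), "archive.bin")
download_parallel(ranges, fake_fetch, path)

with open(path, "rb") as f:
    restored = f.read()
print(restored == archive)
```

Writing each part at its own offset keeps the reassembly independent of the order in which the jobs were started. A real implementation would probably stream parts to disk rather than hold them in memory.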

jcvi commented 10 years ago

To reduce cost: "When an archive is very large, you may find it cost effective to initiate several sequential jobs to prepare your archive. For example, to retrieve a 1 GB archive, you may choose to send a series of four initiate archive-retrieval job requests, each time requesting Amazon Glacier to prepare only a 256 MB portion of the archive."

http://docs.aws.amazon.com/amazonglacier/latest/dev/downloading-an-archive.html#downloading-an-archive-range
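The split in that example can be worked out as below. This is a sketch: `split_ranges` is a hypothetical helper, it treats 1 GB as 1024 MiB, and it assumes part sizes must be multiples of 1 MiB, as I understand the Developer Guide to require.

```python
MIB = 1024 * 1024


def split_ranges(archive_size, part_size):
    """Return inclusive (start, end) byte ranges of part_size covering the archive."""
    if part_size <= 0 or part_size % MIB != 0:
        raise ValueError("part_size must be a positive multiple of 1 MiB")
    return [
        # The last range is clipped to the final byte of the archive.
        (start, min(start + part_size, archive_size) - 1)
        for start in range(0, archive_size, part_size)
    ]


# A 1 GiB archive prepared as four 256 MiB portions.
for start, end in split_ranges(1024 * MIB, 256 * MIB):
    print("%d-%d" % (start, end))
```

Each printed `start-end` pair is the byte range one of the four sequential jobs would request. If the archive size is not a multiple of the part size, the last range is simply shorter.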

Are you planning to add a byte range (start, end) to the getarchive and download options?