Closed: cemilbrowne closed this issue 5 years ago
Surprised it isn't supported yet. (I would like to mention that @Backblaze is $0.005/GB 😃; I don't work for Backblaze.)
+1
+1. I thought it was already there, but now that I need it, it isn't :(
What would rclone need to do to support this? Set a lifecycle policy?
I believe it would be very difficult for rclone to support Glacier, the primary reason being that Glacier does not have the concept of pathnames or filenames: when data is uploaded to Glacier, a unique archive ID is generated by Glacier and that ID is the only way to access the file. On top of that, the 4-hour transaction delay is maddening and the obtuse (that's generous) cost calculations require very careful study. Here's a link to a document detailing why I removed Glacier support from HashBackup:
http://www.hashbackup.com/technical/glacier-eol
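To make the pathname point concrete, here is a rough boto3 (Python) sketch of a raw Glacier upload; the vault name, description, and data are made up, and this is only an illustration of the Glacier API, not anything rclone does:

import boto3

glacier = boto3.client("glacier")

# Glacier has no keys or paths: you upload bytes into a vault...
response = glacier.upload_archive(
    vaultName="my-vault",                 # made-up vault name
    archiveDescription="backup chunk 1",  # made-up description
    body=b"...archive bytes...",
)

# ...and the only handle you get back is an opaque archive ID. Mapping IDs to
# filenames is entirely the client's problem, which is awkward for a tool like rclone.
print(response["archiveId"])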
Some of this has changed recently as Amazon has made it possible to access Glacier without the 4-hour delay, but many of the other points are still valid. I think when compared to S3 Infrequent Access and Google Nearline, Glacier has very few advantages.
@hashbackup thanks for the writeup - very interesting.
So it looks like S3 Infrequent Access would be the way to go. It looks like that would be easy to add - what do you think?
Yeah, IA is very easy to add: it's just a storage class header on the upload and other S3 operations aren't affected. RRS (Reduced Redundancy Storage) is also a storage class option, but I'd avoid it because according to Amazon, it statistically loses 1 file in 10K per year, and from the way they have it listed on their website, I believe they will eventually deprecate it, like Google is doing with DRA (Durable Reduced Availability).
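For reference, here is a minimal sketch of what "just a storage class header" means in practice, using boto3 in Python rather than rclone's own Go code; the bucket and key names are placeholders:

import boto3

s3 = boto3.client("s3")

# A normal S3 upload with one extra parameter: the storage class.
# boto3 sends StorageClass as the x-amz-storage-class request header.
s3.put_object(
    Bucket="my-bucket",            # placeholder
    Key="backups/file.bin",        # placeholder
    Body=b"...data...",
    StorageClass="STANDARD_IA",    # or REDUCED_REDUNDANCY, etc.
)

# GETs, HEADs and listings then work exactly as they do for STANDARD objects.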
Other gotchas: S3 IA bills every object as if it were at least 128 KB, so given the cost difference between IA and regular S3, you don't want to use IA for files below about 53,400 bytes, because they will cost more than on regular S3. There is also a 30-day minimum storage period with IA: if you delete an object before it has been stored for 30 days, you're still charged for the full 30 days. There's no good way to optimize for this, because most of the time the software layer between the storage service and the user's data doesn't know how long a file will be stored.
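A rough back-of-envelope for that break-even point; the per-GB prices below are illustrative assumptions (the 53,400-byte figure above comes from the prices at the time), so plug in current ones:

# Assumed prices in USD per GB-month -- check current AWS pricing.
standard_price = 0.030
ia_price = 0.0125
min_billable = 128 * 1024  # S3 IA bills every object as at least 128 KiB

# An object smaller than this costs more on IA (billed as 128 KiB)
# than it would on Standard (billed at its real size).
break_even = min_billable * ia_price / standard_price
print(int(break_even))  # ~54,600 bytes with these assumed prices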
Google also has a 30-day delete penalty with Nearline, and with Coldline it's 90 days. In my opinion, these complex pricing strategies are gimmicks that trick people into paying more in storage costs. For example, Coldline is 3.7x cheaper per month than regular storage, but it could cost 90x more to store a short-lived object if you're not careful.
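To make the "90x" point concrete, a quick sketch with illustrative prices (assumed here only because they roughly match the 3.7x figure above; exact numbers depend on current Google pricing):

# Assumed prices in USD per GB-month -- illustrative only.
standard = 0.026
coldline = 0.007
print(standard / coldline)                  # ~3.7x cheaper per month

# Store 1 GB for just 1 day, then delete it:
standard_cost = standard * 1 / 30           # Standard bills for the 1 day you used
coldline_cost = coldline * 90 / 30          # Coldline bills the full 90-day minimum
print(coldline_cost / (coldline * 1 / 30))  # 90x what 1 day of Coldline "should" cost
print(coldline_cost / standard_cost)        # and roughly 24x more than Standard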
Google Class A operation costs are also double with Nearline and Coldline, but at least there is no file size minimum as with S3 IA. Class B operations are more than double with Nearline, and 12x more with Coldline. With HashBackup, this isn't a big deal because it is very stingy with operations, but for something like rclone where you are doing a lot of directory listings and transferring a lot of individual files, it could matter.
I also think Glacier would be a good idea, especially since I believe they lowered it to $0.004/GB (except in California, where it's $0.005, the same price as B2). Not much of a saving, by the way; I did a comparison: 13 TB at B2 is $60 a month, and the same at Glacier is $56. Still, I think this would be a good idea. Any plans to make this happen? I filed a duplicate bug which I will reference, I hope. What's the status on this? I filed it under #1038.
I can't speak to how hard it would be, but the 4-hour delay could be worth it, especially as you won't be able to download all 13 TB in one sitting anyway. It would just be another option a user has.
+1 here. I primarily just need some programmatic way to initiate a restore and then (4+ hours later) do a transfer from S3 to elsewhere. Putting a large amount of data in Glacier years ago was a terrible idea.
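For what it's worth, a restore can already be kicked off outside rclone. A hedged boto3 sketch follows; the bucket and key are placeholders, and the Days/Tier values are just examples:

import boto3

s3 = boto3.client("s3")

# Ask S3 to thaw a Glacier-class object back into S3 for 7 days.
s3.restore_object(
    Bucket="my-bucket",                # placeholder
    Key="archive/bigfile.tar",         # placeholder
    RestoreRequest={
        "Days": 7,
        "GlacierJobParameters": {"Tier": "Standard"},  # Expedited/Standard/Bulk
    },
)

# Poll head_object until the restore completes (typically hours on the Standard tier),
# then copy the object elsewhere with rclone or the SDK.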
+1 please. As a backup solution, doing S3 Glacier and the others would be amazing.
+1
+1
It now works!!
You can either:
create a new S3 remote that will always use the GLACIER storage class:
[s3_remote_glacier]
type = s3
provider = AWS
env_auth = false
access_key_id = xxx
secret_access_key = xxx
region = xx-xxxx-x
endpoint =
location_constraint = xx-xxxx-xx
acl = private
server_side_encryption = AES256
storage_class = GLACIER
then rclone copy or sync to it, and files will immediately be stored using the GLACIER storage class.
or use a remote with a default storage class and override the value at runtime:
rclone copy myfile.txt regular_S3_remote:mybucket/myfolder/ --s3-storage-class GLACIER
This is now possible following the update to the S3 PUT API:
With the S3 PUT API, you can now upload objects directly to the S3 Glacier storage class without having to manage zero-day lifecycle policies.
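As an illustration of what that means at the API level, here is a boto3 (Python) sketch, not rclone's actual Go code; the bucket and key are placeholders:

import boto3

s3 = boto3.client("s3")

# Since the S3 PUT update, an object can be written straight into GLACIER,
# with no zero-day lifecycle rule required.
s3.put_object(
    Bucket="my-bucket",          # placeholder
    Key="myfolder/myfile.txt",   # placeholder
    Body=b"...data...",
    StorageClass="GLACIER",
)

# head_object reports the storage class, which is handy for checking what rclone uploaded.
print(s3.head_object(Bucket="my-bucket", Key="myfolder/myfile.txt").get("StorageClass"))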
I successfully tested both options.
I now hope this will also work with the new S3 Glacier Deep Archive storage class once it becomes available in 2019; it should be priced at $0.00099/GB-month (less than a tenth of a cent per GB, or about $1.01 per TB-month).
@WilliamCocker great news :-)
Do you fancy sending a pull request to add GLACIER as an option here? You can put "Fixes #923" in it :-)
@ncw Okay, I submitted a pull request as instructed, and another one for the documentation, but I must warn you I'm no GitHub expert, so let me know if I should have done something differently.
Thanks for your hard work, rclone rocks!
I don't see your patches - did you click the button in your fork to create the pull request?
Ah I see, you've created the pull requests on your fork... Can you create them on the rclone main repository?
@ncw ok I just did, let me know if there is anything else
Perfect - thank you :-)
Hi,
Filing a feature request to support Amazon Glacier. Given the pricing, this seems like an ideal rclone target - $0.007/GB is pretty decent...
Thanks! -Cemil