leeroybrun / glacier-vault-remove

Remove all archives stored inside an Amazon Glacier vault, even if you have a huge number of them.

Error when running removeVault #3

Closed robmclear closed 10 years ago

robmclear commented 10 years ago

Hi,

Thanks so much for this script. I have a vault with ~750,000 archives and I need to delete them all. I've installed the script and configured it but it is failing early with the following output.

root@domU-12-31-39-09-86-67:/tmp/glacier/glacier-vault-remove# python ./removeVault.py us-east-1 DiskStation_0011321CBE91_1 DEBUG
INFO : Logging level set to DEBUG.
INFO : Connecting to Amazon Glacier...
INFO : Getting selected vault...
INFO : Getting jobs list...
INFO : Found existing inventory retrieval job...
INFO : Inventory retrieved, parsing data...
Traceback (most recent call last):
  File "./removeVault.py", line 80, in <module>
    inventory = json.loads(job.get_output().read())
  File "/usr/lib/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")

I really hope I can get this script working! Every other option I've tried to delete this vault's contents has failed so far. Thank you.

leeroybrun commented 10 years ago

Hmm, that's very strange. It seems that Amazon isn't sending you a valid JSON object. Have you tried again?

Could you try this version: https://github.com/leeroybrun/glacier-vault-remove/blob/test-job-output/removeVault.py It prints the job output so we can see what's wrong.
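For readers debugging the same ValueError, here is a minimal sketch of the idea behind that test version: try to decode the job output and, if decoding fails, show the beginning of what was actually received. It is written for Python 3 (the thread uses Python 2.7), and `parse_inventory` and `preview_chars` are hypothetical names, not part of removeVault.py.

```python
import json


def parse_inventory(raw, preview_chars=200):
    """Decode inventory JSON; on failure, report how the payload starts.

    `raw` stands in for whatever job.get_output().read() returned.
    This helper is illustrative only and is not the script's own code.
    """
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8", errors="replace")
    try:
        return json.loads(raw)
    except ValueError as exc:
        # Include a short preview so an empty body or an error page is obvious.
        preview = raw[:preview_chars]
        raise ValueError(
            "Job output is not valid JSON (%s); starts with: %r" % (exc, preview)
        )
```

An empty preview would suggest the job returned no body at all, while readable non-JSON text would suggest the job was not an inventory retrieval.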

robmclear commented 10 years ago

I was wondering if it might have retrieved a vault inventory that was created by another glacier client? Is there any way to force it to create a new inventory?

Will try the alternate version as well.

-Rob

leeroybrun commented 10 years ago

Yes, it's possible that it found an inventory from another client. But jobs are removed automatically after some time (I don't remember whether it's one day or something else), and then the script will request a new one. Maybe retry today or tomorrow; I hope it'll work. :-)
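The wait-then-retry behaviour described here can be sketched as a simple polling loop. This is only an illustration in Python 3: `wait_until_ready`, `is_ready`, and `max_checks` are hypothetical names, and the 30-minute default merely mirrors the "sleep for 30 minutes" message quoted later in the thread.

```python
import time


def wait_until_ready(is_ready, interval_seconds=30 * 60, max_checks=None,
                     sleep=time.sleep):
    """Poll `is_ready()` until it returns True.

    `is_ready` is any zero-argument callable, for example a function that
    checks whether the inventory job has completed. Returns True once ready,
    or False if `max_checks` unsuccessful checks were made first.
    """
    checks = 0
    while not is_ready():
        checks += 1
        if max_checks is not None and checks >= max_checks:
            return False
        sleep(interval_seconds)
    return True
```

Passing `sleep` as a parameter keeps the loop testable without actually waiting.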

robmclear commented 10 years ago

I will, thank you.

robmclear commented 10 years ago

Making progress… waited 24 hours then tried again. All morning it has been waiting for the new inventory to be available. Now it sees the inventory and starts processing with the following results:

INFO : Logging level set to DEBUG.
INFO : Connecting to Amazon Glacier...
INFO : Getting selected vault...
INFO : Getting jobs list...
INFO : Found existing inventory retrieval job...
INFO : Inventory retrieved, parsing data...
Killed

Any suggestions?

Thanks,

-Rob McLear

leeroybrun commented 10 years ago

What do you mean by "Killed"? Does the script kill itself?

robmclear commented 10 years ago

Yes, the script just ends and the terminal session prints "Killed", returning me to the command line.

Rob

leeroybrun commented 10 years ago

Have you tried the main script or the second one (https://github.com/leeroybrun/glacier-vault-remove/blob/test-job-output/removeVault.py)?

robmclear commented 10 years ago

The new version fails in a different way:

root@domU-12-31-39-09-86-67:/usr/glacier/glacier-vault-remove# python ./removeVault.py us-east-1 DiskStation_0011321CBE91_1 DEBUG
INFO : Logging level set to DEBUG.
INFO : Connecting to Amazon Glacier...
INFO : Getting selected vault...
INFO : Getting jobs list...
INFO : Found existing inventory retrieval job...
INFO : Inventory retrieved, parsing data...
Traceback (most recent call last):
  File "./removeVault.py", line 80, in <module>
    print vars(job.get_output().read())
TypeError: vars() argument must have __dict__ attribute
root@domU-12-31-39-09-86-67:/usr/glacier/glacier-vault-remove#

Thanks,

-Rob

leeroybrun commented 10 years ago

OK, and what about this updated version? https://github.com/leeroybrun/glacier-vault-remove/blob/test-job-output/removeVault.py

robmclear commented 10 years ago

Looks more hopeful… it parsed out the file, repeating the entire contents back through the shell session, then a few minutes later it just outputs:

,"ArchiveDescription":"","CreationDate":"2014-03-02T15:01:37Z","Size":55990,"SHA256TreeHash":"54b346c08f3961bd609f12788f77621a97f64b24721cd3b93282015997b7f6d6"},{"ArchiveId":"qu4bhahyUQfuoRFTgN3TBO3yCUf35_nJLl2mMVOS0wF1HEeFGnoZqDyH225VkzVVGb8dTRIDN3oIUt5_Fw0QkkJsRXa3UHwq1b0pbuB5ancqTdSjnTPVz3sfeTyg5GXAorUmKsc5yw","ArchiveDescription":"","CreationDate":"2014-03-02T15:02:40Z","Size":35988,"SHA256TreeHash":"3928bbe51192301bfdb53a5daa42d6b8931fdd425b6dfb961c604a8f588bab08"}]}
Killed

and returns me to the prompt.

-Rob

leeroybrun commented 10 years ago

Hmm, it's strange... Glacier seems to return valid JSON. Maybe it's too big... What are your computer's specs and OS, and which Python and Boto versions are you using? I retried today with a big vault, and it works fine on my computer.

Thanks.
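If the "Killed" message means the process ran out of memory while parsing a very large inventory (an assumption; the thread never confirms the cause), one possible workaround is to decode the archive list one entry at a time instead of loading the whole document. The sketch below is Python 3 and standard library only. `iter_archives` is a hypothetical helper, and it assumes the compact `"ArchiveList":[` layout seen in the output pasted above.

```python
import json


def iter_archives(stream, chunk_size=64 * 1024):
    """Yield archive dicts one by one from a text stream of inventory JSON.

    Assumes the list is introduced by the exact text '"ArchiveList":['
    (no extra whitespace). Only a small buffer is held in memory.
    """
    decoder = json.JSONDecoder()
    marker = '"ArchiveList":['
    buf = ""

    # Phase 1: skip ahead to the start of the archive array.
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return  # marker never found
        buf += chunk
        pos = buf.find(marker)
        if pos != -1:
            buf = buf[pos + len(marker):]
            break
        buf = buf[-len(marker):]  # keep a tail in case the marker is split

    # Phase 2: decode one array element at a time.
    while True:
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return  # end of the archive list
        try:
            obj, end = decoder.raw_decode(buf)
        except ValueError:
            chunk = stream.read(chunk_size)
            if not chunk:
                return  # input ended mid-element
            buf += chunk
            continue
        yield obj
        buf = buf[end:]
```

Each yielded dict can be handled and discarded immediately, so memory use stays roughly constant regardless of how many archives the vault holds.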

robmclear commented 10 years ago

It was running on a small AWS instance running Ubuntu 12.04LTS, just because I had it available, but I will try it on a 'real' computer and see if I can get different results.

Thanks again!

leeroybrun commented 10 years ago

OK, thanks. Let me know whether it works once you've tried it.

robmclear commented 10 years ago

Will do… still in the 'Inventory not ready, sleep for 30 minutes...' phase.

-Rob

robmclear commented 10 years ago

I'm seeing this now (again, it's progress!)

iveDescription":"","CreationDate":"2014-03-02T15:00:48Z","Size":49480,"SHA256TreeHash":"52c3668bcaf80b652758185cc6f65d091868cdcae6564a48a35eaab1d2d4ed97"},{"ArchiveId":"WiYIEG3cySBJpemWcVBUk_9LQgqGGUun5WvaXkHx5XJpGnTXHLSiM_poNF6e0Lpb7Fv_oKBuzqeoft1CJWNerQe9YbBC7aT2oillZnK31dI3OlyflVapogwkMqDSFBWgT1_2WFRFfQ","ArchiveDescription":"","CreationDate":"2014-03-02T15:01:37Z","Size":55990,"SHA256TreeHash":"54b346c08f3961bd609f12788f77621a97f64b24721cd3b93282015997b7f6d6"},{"ArchiveId":"qu4bhahyUQfuoRFTgN3TBO3yCUf35_nJLl2mMVOS0wF1HEeFGnoZqDyH225VkzVVGb8dTRIDN3oIUt5_Fw0QkkJsRXa3UHwq1b0pbuB5ancqTdSjnTPVz3sfeTyg5GXAorUmKsc5yw","ArchiveDescription":"","CreationDate":"2014-03-02T15:02:40Z","Size":35988,"SHA256TreeHash":"3928bbe51192301bfdb53a5daa42d6b8931fdd425b6dfb961c604a8f588bab08"}]}
INFO : Removing archives... please be patient, this may take some time...
ERROR : <class 'socket.gaierror'>
INFO : Sleep 2 mins before retrying...
INFO : Retry to remove archive ID : Q4pezrLE4cQsomKZMxG67iTBbrIQoKXnygB7JwhsKTmKKi3-uhYJWTR9mtG926_r8jUMHrQA_D-oP8ViZBr3Xmzrv1W5WYf1HxlXWzqMVSOQ_uPuS0Nipfeb63jSVItqi3JVW1T11A
ERROR : <class 'socket.gaierror'>
INFO : Sleep 2 mins before retrying...
INFO : Retry to remove archive ID : B0weSfKTIBsoSdOd4MgHOJE4EhVH0XVA8XRvwteTkbBexRskN2CVWt7eiKD0w0C7ICD8sYkv93lvws2qYNobzmYwohBi_gLLol2wh3eVrCHzI4nrET-qdrlaMxnGIIsPCIVVmt6qCQ
ERROR : <class 'socket.gaierror'>

Thanks,

-Rob

leeroybrun commented 10 years ago

It looks like a connection error. Was your Internet connection working at that time? Do you have a firewall or something similar?

Can you retry with this one? It will show more details about the error. https://github.com/leeroybrun/glacier-vault-remove/commits/master/removeVault.py
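`socket.gaierror` is the exception Python raises when a host name lookup (`getaddrinfo`) fails, so a quick way to narrow this down is to test name resolution on its own. The following Python 3 sketch does that. `can_resolve` and its `resolver` parameter are hypothetical names introduced for illustration.

```python
import socket


def can_resolve(host, port=443, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to an address, False on a lookup failure.

    `resolver` defaults to socket.getaddrinfo and can be swapped out in tests.
    """
    try:
        resolver(host, port)
        return True
    except socket.gaierror:
        return False
```

For example, `can_resolve("glacier.us-east-1.amazonaws.com")` checks the regional Glacier endpoint for the vault used in this thread. If it returns False intermittently, the problem is more likely local DNS than Glacier itself.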

robmclear commented 10 years ago

Here is the current output:

bash-3.2# python ./removeVault4.py us-east-1 DiskStation_0011321CBE91_1 DEBUG
INFO : Logging level set to DEBUG.
INFO : Connecting to Amazon Glacier...
INFO : Getting selected vault...
INFO : Getting jobs list...
INFO : Found existing inventory retrieval job...
INFO : Inventory retrieved, parsing data...
INFO : Removing archives... please be patient, this may take some time...
ERROR : [Errno 8] nodename nor servname provided, or not known
INFO : Sleep 2 mins before retrying...
INFO : Retry to remove archive ID : vvP-xoaBtmfupo0Um0JK8sFlSSvu6EDcjkPt3Jd3nyWHqh0bqEpjNcFFVJauqS00wtwRXX_vbaa_3eiNG6MVLlXNjs5YS9GnolDVX33-6vV2LPeRs57w9vRV_6994rKAO2EgVIg2MQ
ERROR : [Errno 8] nodename nor servname provided, or not known
INFO : Sleep 2 mins before retrying...
INFO : Retry to remove archive ID : 9CIzeQ0GlMr6YdltACugudruBM9LOl-HjvqDYdfZ2wbyxODns1Siqpsj37EQ7QjzXk_agmLGSZ8jXezpRzzJ2adYoGzqRbYb_UjbJrsTSehS7QHO9HdomrFnilEkFr5G8jpyGwZAvg

etc...

Thanks.

-Rob

leeroybrun commented 10 years ago

Is there a delay between the errors, or do they come one right after another?

I ask because we try to remove as many archives as we can with no pause between requests, so sometimes one fails, maybe because AWS blocks us for a short time (typically 1 minute or less). The script then waits, retries the request, and continues until the next error.

Can you retry with this updated version? It will show whether the removal retry worked or not. https://github.com/leeroybrun/glacier-vault-remove/blob/master/removeVault.py
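The wait-and-retry behaviour described in this comment can be sketched roughly as follows. This is a Python 3 illustration, not the code in removeVault.py: `delete_with_retry`, `delete_archive`, and `max_attempts` are hypothetical, and only the 2-minute wait is taken from the log output ("Sleep 2 mins before retrying...").

```python
import time


def delete_with_retry(delete_archive, archive_id, wait_seconds=120,
                      max_attempts=5, sleep=time.sleep, retry_on=(OSError,)):
    """Call `delete_archive(archive_id)`, retrying after a pause on failure.

    Returns the attempt number that succeeded. In Python 3, socket.gaierror
    is a subclass of OSError, so DNS failures are covered by the default
    `retry_on`. The last failure is re-raised once attempts are exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            delete_archive(archive_id)
            return attempt
        except retry_on:
            if attempt == max_attempts:
                raise
            sleep(wait_seconds)
```

Capping the number of attempts is a design choice made here so that a permanent failure eventually surfaces instead of looping forever; whether the original script does the same is not stated in the thread.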

robmclear commented 10 years ago

Yes, there is a delay between the errors.

I just started the newest version of the script at 10:30 local time. It took about 1 minute to retrieve the inventory and parse the data, then it moved to removing archives.

About two minutes later the first error message appears:

bash-3.2# python ./removeVault5.py us-east-1 DiskStation_0011321CBE91_1 DEBUG
INFO : Logging level set to DEBUG.
INFO : Connecting to Amazon Glacier...
INFO : Getting selected vault...
INFO : Getting jobs list...
INFO : Found existing inventory retrieval job...
INFO : Inventory retrieved, parsing data...
INFO : Removing archives... please be patient, this may take some time...
ERROR : [Errno 8] nodename nor servname provided, or not known
INFO : Sleep 2 mins before retrying...

Thanks again. I will be away for a while so if I don't reply that's why.

-Rob

robmclear commented 10 years ago

Here's some more: looks like it is making some progress, hard to tell how much.

INFO : Found existing inventory retrieval job...
INFO : Inventory retrieved, parsing data...
INFO : Removing archives... please be patient, this may take some time...
ERROR : [Errno 8] nodename nor servname provided, or not known
INFO : Sleep 2 mins before retrying...
INFO : Retry to remove archive ID : vvP-xoaBtmfupo0Um0JK8sFlSSvu6EDcjkPt3Jd3nyWHqh0bqEpjNcFFVJauqS00wtwRXX_vbaa_3eiNG6MVLlXNjs5YS9GnolDVX33-6vV2LPeRs57w9vRV_6994rKAO2EgVIg2MQ
INFO : Successfully removed archive ID : vvP-xoaBtmfupo0Um0JK8sFlSSvu6EDcjkPt3Jd3nyWHqh0bqEpjNcFFVJauqS00wtwRXX_vbaa_3eiNG6MVLlXNjs5YS9GnolDVX33-6vV2LPeRs57w9vRV_6994rKAO2EgVIg2MQ
ERROR : [Errno 8] nodename nor servname provided, or not known
INFO : Sleep 2 mins before retrying...
INFO : Retry to remove archive ID : rJvCnSbGjvKGo2hcAk7Q7zW52oQzy51DOGJiS0BiBKRowIyTzbDY-nsbYFSUOxv9eX3n1dxWfbEvGIqfhxqRkoS7NLDjtITFpqmc6asXF6EyG2PgMTn9ThqOsz_kIsB5BWLd9Hy9Vw
INFO : Successfully removed archive ID : rJvCnSbGjvKGo2hcAk7Q7zW52oQzy51DOGJiS0BiBKRowIyTzbDY-nsbYFSUOxv9eX3n1dxWfbEvGIqfhxqRkoS7NLDjtITFpqmc6asXF6EyG2PgMTn9ThqOsz_kIsB5BWLd9Hy9Vw
ERROR : [Errno 8] nodename nor servname provided, or not known
INFO : Sleep 2 mins before retrying...
INFO : Retry to remove archive ID : Y8kHLWVcpj2pqTPKRPSH0_XHfZEiWYFrEfL4kx3fx7xsx8nJbYX80sUw8C2ho7Tt7v_d7oHS83Vdt0oo93t696xk-ufSmhO1JngnzrPKe-OgmR6zXkNqdpnplAqIfq-ro7toRQrpTw
INFO : Successfully removed archive ID : Y8kHLWVcpj2pqTPKRPSH0_XHfZEiWYFrEfL4kx3fx7xsx8nJbYX80sUw8C2ho7Tt7v_d7oHS83Vdt0oo93t696xk-ufSmhO1JngnzrPKe-OgmR6zXkNqdpnplAqIfq-ro7toRQrpTw
ERROR : [Errno 8] nodename nor servname provided, or not known
INFO : Sleep 2 mins before retrying...
INFO : Retry to remove archive ID : ccMIUXFu2_GbrHXB6HF47nT5fN5J01USU60bxVDPNr4NBi7juXTB9faLS7KiqNyb-r3ynM5kfplYBVFL_7t9FwCqPHg5LYYUnGy0jsdzS2QkgjGfU601d6ixK1dosoFf4XpSi_Cjjw
INFO : Successfully removed archive ID : ccMIUXFu2_GbrHXB6HF47nT5fN5J01USU60bxVDPNr4NBi7juXTB9faLS7KiqNyb-r3ynM5kfplYBVFL_7t9FwCqPHg5LYYUnGy0jsdzS2QkgjGfU601d6ixK1dosoFf4XpSi_Cjjw
ERROR : [Errno 8] nodename nor servname provided, or not known
INFO : Sleep 2 mins before retrying...
INFO : Retry to remove archive ID : -d0-2Ai0YbOM22RYy7BMn46lOmzYJLWaMn46Uic1GKCqA0J6UWW7r7yQdldW6MB5KWg7rguUY0annGZRUs3cnUs9yjQlPJh9VjZGmHa7TpzX214AWj0V4yMPkKgfuFHn2JBb2KoNyQ
INFO : Successfully removed archive ID : -d0-2Ai0YbOM22RYy7BMn46lOmzYJLWaMn46Uic1GKCqA0J6UWW7r7yQdldW6MB5KWg7rguUY0annGZRUs3cnUs9yjQlPJh9VjZGmHa7TpzX214AWj0V4yMPkKgfuFHn2JBb2KoNyQ
ERROR : [Errno 8] nodename nor servname provided, or not known
INFO : Sleep 2 mins before retrying...
INFO : Retry to remove archive ID : SODrDe1WKrVVflDV1D9YnKFugHeQPCh8y6F9U8RVNWoion83FYx8GFqx9AEqa5oMoxIR6o3py5hGMCjvjHCW7ifOzVrZPogvmjRW8NsQkd-xmvr7PbZtTRcVXHNNxPubQaM3iLwDVQ
INFO : Successfully removed archive ID : SODrDe1WKrVVflDV1D9YnKFugHeQPCh8y6F9U8RVNWoion83FYx8GFqx9AEqa5oMoxIR6o3py5hGMCjvjHCW7ifOzVrZPogvmjRW8NsQkd-xmvr7PbZtTRcVXHNNxPubQaM3iLwDVQ

leeroybrun commented 10 years ago

OK, perfect, it seems to work. I think you can just let it run, and in a couple of hours all your archives will be gone!

robmclear commented 10 years ago

There are 750,000 of them! Might take a while! Thanks for your help.

-Rob

leeroybrun commented 10 years ago

Did it work?

robmclear commented 10 years ago

I don't know yet. I am on vacation and will be back in two weeks; I left it running until I get back.

Rob

leeroybrun commented 10 years ago

OK, no problem. Have a nice vacation!

robmclear commented 10 years ago

Worked!! All archives deleted, finally able to delete that vault.

Thanks so much for all of your help!

-Rob McLear

leeroybrun commented 10 years ago

Perfect, I'm glad it finally worked!

You're welcome. :-)