You can't download files from Glacier directly; the trick is to restore them to S3 first. Once there, you can download all of your files.
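If you want to confirm an object really is in Deep Archive before bothering with a restore, head-object will tell you. The bucket and key below are placeholders; note that objects in the default STANDARD class print "None" here, since S3 omits the field for them.

aws s3api head-object --bucket <bucket-name> --key <object-key> --query 'StorageClass' --output text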
Step 1: Create a list of all items you would like restored. I included the --prefix option in case you only want specific folders; if you are restoring a whole bucket, it is not necessary.
aws s3api list-objects-v2 --bucket <bucket-name> --prefix <optional-folder-prefix> --query "Contents[?StorageClass=='DEEP_ARCHIVE']" --output text | awk '{print $2}' > glacier-restore.txt
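One caveat: the awk step takes the second whitespace-separated field, so keys containing spaces will come out truncated. If you have jq installed, a more robust sketch (same <bucket-name> placeholder as above) is to have the query return only the keys as JSON and print them one per line:

aws s3api list-objects-v2 --bucket <bucket-name> --query "Contents[?StorageClass=='DEEP_ARCHIVE'].Key" --output json | jq -r '.[]?' > glacier-restore.txt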
Step 2: Run this command to start pulling all of the files from "Deep Archive" to S3. If you have time, you can substitute "Bulk" for "Standard", but the cost difference is small, so I go with Standard.
while read -r x; do aws s3api restore-object --bucket <bucket-name> --key "$x" --restore-request '{"Days":90,"GlacierJobParameters":{"Tier":"Standard"}}'; done < glacier-restore.txt
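While the jobs run, you can spot-check a single object's progress with head-object (<object-key> is a placeholder). The Restore field reads ongoing-request="true" while the job is in flight, ongoing-request="false" plus an expiry-date once the temporary copy is ready, and None if no restore was ever requested:

aws s3api head-object --bucket <bucket-name> --key <object-key> --query 'Restore' --output text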
Unfortunately, the above command restores only one file at a time, but we have hardware and want to save some time. The following command spreads the work across all but one of the available cores.
cat glacier-restore.txt | xargs -n $(($(nproc --all)-1)) -P $(($(nproc --all)-1)) sh -c 'for x in "$@"; do aws s3api restore-object --bucket <bucket-name> --key "$x" --restore-request "{\"Days\":90,\"GlacierJobParameters\":{\"Tier\":\"Standard\"}}"; echo "Done restoring $x"; done' sh
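If you'd rather not juggle the nested sh -c quoting, an alternative sketch is to let xargs run one aws process per key; -I reads whole input lines, so keys with spaces also survive. It spawns more processes overall, but the per-process overhead is dwarfed by the API call anyway:

xargs -P "$(( $(nproc --all) - 1 ))" -I{} aws s3api restore-object --bucket <bucket-name> --key {} --restore-request '{"Days":90,"GlacierJobParameters":{"Tier":"Standard"}}' < glacier-restore.txt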
Once all files have finished the retrieval process, I would suggest running through Steps 1 and 2 again; I found that it would sometimes skip entries, though I never did figure out why.
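Instead of blindly re-running everything, you could also compare your list against what actually restored. This sketch (glacier-retry.txt is a hypothetical output file I'm introducing here) checks each key's Restore status and writes anything not yet finished to a retry list:

# For each key, ask S3 for its Restore status; anything without
# ongoing-request="false" (still in flight, or never requested) goes to the retry list.
while read -r key; do
  status=$(aws s3api head-object --bucket <bucket-name> --key "$key" --query 'Restore' --output text 2>/dev/null)
  case "$status" in
    *'ongoing-request="false"'*) ;;   # restore finished
    *) echo "$key" ;;                 # still pending or skipped
  esac
done < glacier-restore.txt > glacier-retry.txt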
ref: https://github.com/geerlingguy/my-backup-plan#retriving-content-from-s3-glacier-deep-archive