splunk / splunk-shuttl

Splunk app for archive management, including HDFS support.
Apache License 2.0

Glacier ArchiveIds, getting, persisting and caching #92

Closed petterik closed 12 years ago

petterik commented 12 years ago

Downloading a file from Glacier requires a special archiveId. The archiveId is mapped to a bucket archive path, but the mapping is currently stored only in memory, so after a reboot the system would no longer have any archiveIds. It's possible but expensive to get them back from Glacier: it takes around 4 hours and costs some money.

Figure out how and where the archiveIds should be obtained, listed, cached, etc.

One idea is to persist them as metadata in the archive file system, which is s3/s3n for Glacier.
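
A rough sketch of that idea, to make the discussion concrete. Everything here is an assumption for illustration: the `ArchiveIdStore` class, its method names, and the JSON metadata file are hypothetical, and a local file path stands in for a location on the s3/s3n archive file system. It is not how Shuttl does or should implement it, only one possible shape: keep the in-memory map as a cache, write it out on every change, and reload it on startup.

```python
import json
import os
import tempfile


class ArchiveIdStore:
    """Hypothetical cache mapping bucket archive paths to Glacier archiveIds.

    The mapping lives in memory and is mirrored to a JSON metadata file,
    so it survives a restart without asking Glacier for an inventory.
    """

    def __init__(self, metadata_path):
        # metadata_path is assumed to point at a file next to the archived
        # buckets; a local path is used here as a stand-in for s3/s3n.
        self._metadata_path = metadata_path
        self._ids = self._load()

    def _load(self):
        # A missing file simply means nothing has been persisted yet.
        if not os.path.exists(self._metadata_path):
            return {}
        with open(self._metadata_path, "r", encoding="utf-8") as f:
            return json.load(f)

    def put(self, bucket_path, archive_id):
        """Record an archiveId for a bucket path and persist the mapping."""
        self._ids[bucket_path] = archive_id
        self._flush()

    def get(self, bucket_path):
        """Return the archiveId for a bucket path, or None if unknown."""
        return self._ids.get(bucket_path)

    def _flush(self):
        # Write to a temporary file and rename it over the old one, so a
        # crash mid-write does not leave a truncated metadata file behind.
        directory = os.path.dirname(self._metadata_path) or "."
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(self._ids, f)
        os.replace(tmp_path, self._metadata_path)
```

With something like this, a process that restarts would construct a new `ArchiveIdStore` on the same metadata path and get the earlier mappings back from `get` without contacting Glacier. Rewriting the whole file on every `put` is the simplest option; whether that is acceptable on s3/s3n, or whether one small metadata file per bucket would be better, is still an open question.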