splunk / splunk-shuttl

Splunk app for archive management, including HDFS support.
Apache License 2.0

Accessing frozen data in S3 #133

Closed. msnelling closed this issue 5 years ago.

msnelling commented 11 years ago

I'm using an s3 backend (not s3n) with an archive format of SPLUNK_BUCKET. When I try to view the data in S3 using a file transfer program such as DragonDisk or CyberDuck, I see a lot of "block_123456789...." files in the root directory, plus a directory with a blank name that contains the archive_root/archive_data/... directory hierarchy.

Is it possible to perform operations on the buckets stored in S3 like this? For example, when I originally set up Shuttl I also had it archive in the _TGZ format, but now I would like to remove that old data.

Also, is it possible to remove old, unwanted buckets from this store?

petterik commented 11 years ago

What do you mean by "perform operations on the buckets stored in S3 like this"? Your own operations, or Shuttl operations? You cannot remove old unwanted buckets with Shuttl, but you can remove them yourself. Shuttl doesn't store any data about the state of your system; it always uses "ls" to list the current state of your storage. If you remove or move buckets, the only effect is that they will no longer be visible to Shuttl.
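Because Shuttl keeps no state and just lists the archive, identifying old buckets comes down to parsing the names a listing returns. As a rough sketch (not Shuttl's own code): Splunk warm/cold buckets are conventionally named db_&lt;newestTime&gt;_&lt;oldestTime&gt;_&lt;localId&gt; with epoch-second timestamps, and the SPLUNK_BUCKET archive format keeps those names, so one could filter a listing by age like this. The listing values and cutoff below are made up for illustration; verify the naming in your own deployment first.

```python
import re

# Splunk bucket directory names: db_<newestTimeEpoch>_<oldestTimeEpoch>_<localId>
BUCKET_RE = re.compile(r"^db_(\d+)_(\d+)_\d+$")

def buckets_older_than(listing, cutoff_epoch):
    """Return names from a listing whose newest event predates cutoff_epoch."""
    old = []
    for name in listing:
        m = BUCKET_RE.match(name)
        if m and int(m.group(1)) < cutoff_epoch:
            old.append(name)
    return old

# Example listing as it might come back from an "ls" of the archive:
listing = ["db_1325376000_1322784000_1", "db_1356998400_1354406400_2", "notes.txt"]
print(buckets_older_than(listing, 1340000000))  # → ['db_1325376000_1322784000_1']
```

Non-bucket entries are ignored, matching the stateless "list and inspect" approach described above.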

Does that answer your question?

msnelling commented 11 years ago

I'm trying to remove them myself in S3, but I can't work out what I should be deleting: the archive_root/archive_data/... directory hierarchy, or the block_123456789 files in the root of the bucket?
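The split layout comes from Hadoop's legacy s3:// block filesystem (as opposed to s3n): the path hierarchy you see is metadata ("inode" objects recording which blocks make up each file), while the block_&lt;id&gt; objects in the bucket root hold the actual bytes. Deleting only one half with a raw S3 client either corrupts files (blocks gone, inodes left) or strands unreachable data (inodes gone, blocks left), which is why deletes are normally done through the filesystem layer, e.g. something like hadoop fs -rm -r on the s3:// URI. A toy model of that relationship (simplified for illustration; the real INode format is binary and the key names here are invented):

```python
# Toy model of a block-store layout like Hadoop's legacy s3:// scheme.
# "Inode" keys mirror the path hierarchy and record which block objects
# hold each file's bytes; the blocks sit in the bucket root.
inodes = {
    "archive_root/archive_data/idx/db_1_0_1/rawdata": ["block_111", "block_222"],
    "archive_root/archive_data/idx/db_2_1_2/rawdata": ["block_333"],
}
blocks = {"block_111", "block_222", "block_333", "block_999"}

def referenced_blocks(inodes):
    """Collect every block id that some inode still points at."""
    return {b for bs in inodes.values() for b in bs}

# Deleting only the visible directory tree (the inode keys) would leave
# every data block behind as an unreachable orphan, like block_999 here:
orphans = blocks - referenced_blocks(inodes)
print(sorted(orphans))
```

The point of the sketch: neither half of the layout is safe to delete on its own, so a tool that understands both halves has to do the removal.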