splunk / splunk-shuttl

Splunk app for archive management, including HDFS support.
Apache License 2.0

warmToColdScript QA #110

Closed petterik closed 11 years ago

petterik commented 11 years ago

Test the warmToColdScript:

Assert:

Try being evil by:

Example indexes.conf:

[archiver-test-index]
homePath = $SPLUNK_HOME/var/lib/splunk/archiver-test-index/db
coldPath = $SPLUNK_HOME/var/lib/splunk/archiver-test-index/colddb
thawedPath = $SPLUNK_HOME/var/lib/splunk/archiver-test-index/thaweddb
warmToColdScript = $SPLUNK_HOME/etc/apps/shuttl/bin/warmToColdScript.sh
rotatePeriodInSecs = 5
maxWarmDBCount = 1
maxDataSize =

Note: warmToColdScript.sh does not take an extra parameter. (winning).

Klevmarken commented 11 years ago

When allowRemoteLogin on the lone-slave is not set to always, the following error is logged:

2012-12-17 18:45:08,396 ERROR com.splunk.shuttl.archiver.copy.ColdCopyEntryPoint: did="Called main entry point for copying bucket" happened="com.splunk.HttpException: HTTP 401 -- Remote login has been disabled for 'admin' with the default password. Either set the password, or override by changing the 'allowRemoteLogin' setting in your server.conf file." expected="to eventually call copy bucket REST endpoint" main_args="[/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1355769149_1336596984_4]"

Set allowRemoteLogin = always, then restarted Splunk before it had finished indexing all the data it had been fed. Killed the Shuttl process, since it would not restart by itself. Once Splunk and Shuttl were up again, Shuttl moved all the buckets from db. The following errors were encountered:

2012-12-17 18:48:31,526 ERROR com.splunk.shuttl.archiver.filesystem.transaction.AbstractTransaction: did="Transferred data with transaction: Transaction [data=LocalBucket [getDirectory()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getName()=db_1353984375_1344011675_8, getIndex()=cluster_shuttl_test, getFormat()=SPLUNK_BUCKET, getPath()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getEarliest()=Fri Aug 03 16:34:35 UTC 2012, getLatest()=Tue Nov 27 02:46:15 UTC 2012, getSize()=604863335], remoteTemp=/user/ec2-user/KlevisTestAreaLoneSlave/temporary_data/the-lone-slave/user/ec2-user/KlevisTestAreaLoneSlave/archive_data/local/the-lone-slave/cluster_shuttl_test/db_1353984375_1344011675_8/SPLUNK_BUCKET, dst=/user/ec2-user/KlevisTestAreaLoneSlave/archive_data/local/the-lone-slave/cluster_shuttl_test/db_1353984375_1344011675_8/SPLUNK_BUCKET]" happened="java.io.FileNotFoundException: File file:/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8/splunk-need-optimize.dat does not exist." expected="To transfer file to remote file system via temp." from="LocalBucket [getDirectory()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getName()=db_1353984375_1344011675_8, getIndex()=cluster_shuttl_test, getFormat()=SPLUNK_BUCKET, getPath()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getEarliest()=Fri Aug 03 16:34:35 UTC 2012, getLatest()=Tue Nov 27 02:46:15 UTC 2012, getSize()=604863335]" to="/user/ec2-user/KlevisTestAreaLoneSlave/archive_data/local/the-lone-slave/cluster_shuttl_test/db_1353984375_1344011675_8/SPLUNK_BUCKET" temp="/user/ec2-user/KlevisTestAreaLoneSlave/temporary_data/the-lone-slave/user/ec2-user/KlevisTestAreaLoneSlave/archive_data/local/the-lone-slave/cluster_shuttl_test/db_1353984375_1344011675_8/SPLUNK_BUCKET"

2012-12-17 18:48:31,527 ERROR com.splunk.shuttl.archiver.filesystem.transaction.TransactionExecuter: did="Executed transaction" happened="com.splunk.shuttl.archiver.filesystem.transaction.TransactionException: java.io.FileNotFoundException: File file:/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8/splunk-need-optimize.dat does not exist." expected="Transaction to prepare and commit" transaction="Transaction [data=LocalBucket [getDirectory()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getName()=db_1353984375_1344011675_8, getIndex()=cluster_shuttl_test, getFormat()=SPLUNK_BUCKET, getPath()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getEarliest()=Fri Aug 03 16:34:35 UTC 2012, getLatest()=Tue Nov 27 02:46:15 UTC 2012, getSize()=604863335], remoteTemp=/user/ec2-user/KlevisTestAreaLoneSlave/temporary_data/the-lone-slave/user/ec2-user/KlevisTestAreaLoneSlave/archive_data/local/the-lone-slave/cluster_shuttl_test/db_1353984375_1344011675_8/SPLUNK_BUCKET, dst=/user/ec2-user/KlevisTestAreaLoneSlave/archive_data/local/the-lone-slave/cluster_shuttl_test/db_1353984375_1344011675_8/SPLUNK_BUCKET]"

2012-12-17 18:48:31,534 ERROR com.splunk.shuttl.archiver.archive.ArchiveBucketTransferer: did="Executed a bucket transaction." happened="com.splunk.shuttl.archiver.filesystem.transaction.TransactionException: java.io.FileNotFoundException: File file:/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8/splunk-need-optimize.dat does not exist." expected="To transfer the bucket to the archive." bucket="LocalBucket [getDirectory()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getName()=db_1353984375_1344011675_8, getIndex()=cluster_shuttl_test, getFormat()=SPLUNK_BUCKET, getPath()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getEarliest()=Fri Aug 03 16:34:35 UTC 2012, getLatest()=Tue Nov 27 02:46:15 UTC 2012, getSize()=604863335]"

2012-12-17 18:48:58,348 ERROR com.splunk.shuttl.archiver.copy.CallCopyBucketEndpoint: did="Called copy bucket endpoint" happened="org.apache.http.NoHttpResponseException: The target server failed to respond" expected="to execute without failure" bucket="LocalBucket [getDirectory()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1355769149_1336596984_4, getName()=db_1355769149_1336596984_4, getIndex()=cluster_shuttl_test, getFormat()=SPLUNK_BUCKET, getPath()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1355769149_1336596984_4, getEarliest()=Wed May 09 20:56:24 UTC 2012, getLatest()=Mon Dec 17 18:32:29 UTC 2012, getSize()=1065418738]"

2012-12-17 18:48:58,350 ERROR com.splunk.shuttl.archiver.copy.LockedBucketCopier: did="Call copy endpoint to copy bucket" happened="com.splunk.shuttl.archiver.copy.CallCopyBucketEndpoint$NonSuccessfulBucketCopy: org.apache.http.NoHttpResponseException: The target server failed to respond" expected="to copy and then create a copy receipt" bucket="LocalBucket [getDirectory()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1355769149_1336596984_4, getName()=db_1355769149_1336596984_4, getIndex()=cluster_shuttl_test, getFormat()=SPLUNK_BUCKET, getPath()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1355769149_1336596984_4, getEarliest()=Wed May 09 20:56:24 UTC 2012, getLatest()=Mon Dec 17 18:32:29 UTC 2012, getSize()=1065418738]"

2012-12-17 18:48:58,362 ERROR com.splunk.shuttl.archiver.copy.CallCopyBucketEndpoint: did="Called copy bucket endpoint" happened="org.apache.http.NoHttpResponseException: The target server failed to respond" expected="to execute without failure" bucket="LocalBucket [getDirectory()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getName()=db_1353984375_1344011675_8, getIndex()=cluster_shuttl_test, getFormat()=SPLUNK_BUCKET, getPath()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getEarliest()=Fri Aug 03 16:34:35 UTC 2012, getLatest()=Tue Nov 27 02:46:15 UTC 2012, getSize()=604863335]"

2012-12-17 18:48:58,364 ERROR com.splunk.shuttl.archiver.copy.LockedBucketCopier: did="Call copy endpoint to copy bucket" happened="com.splunk.shuttl.archiver.copy.CallCopyBucketEndpoint$NonSuccessfulBucketCopy: org.apache.http.NoHttpResponseException: The target server failed to respond" expected="to copy and then create a copy receipt" bucket="LocalBucket [getDirectory()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getName()=db_1353984375_1344011675_8, getIndex()=cluster_shuttl_test, getFormat()=SPLUNK_BUCKET, getPath()=/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, getEarliest()=Fri Aug 03 16:34:35 UTC 2012, getLatest()=Tue Nov 27 02:46:15 UTC 2012, getSize()=604863335]"

2012-12-17 18:48:58,426 DEBUG com.splunk.shuttl.archiver.importexport.ShellExecutor: did="Waited for csv export process to finish." happened="java.lang.InterruptedException" expected="It to finish."

2012-12-17 18:48:58,426 DEBUG com.splunk.shuttl.archiver.importexport.csv.BucketToCsvFileExporter: did="Exported a bucket to Csv" happened="Got a non zero exit code from export tool" expected="Zero exit code from export tool." exit_code="3" csv_file="/mnt/KlevisTestArea/shuttl_archiver/data/format-export-dir/cluster_shuttl_test/db_1355769149_1336596984_4/SPLUNK_BUCKET/db_1355769149_1336596984_4.csv" command="[/mnt/KlevisTestArea/splunk/bin/splunk, cmd, exporttool, /mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1355769149_1336596984_4, /mnt/KlevisTestArea/shuttl_archiver/data/format-export-dir/cluster_shuttl_test/db_1355769149_1336596984_4/SPLUNK_BUCKET/db_1355769149_1336596984_4.csv, -csv]"

2012-12-17 18:48:58,427 ERROR com.splunk.shuttl.server.mbeans.rest.ShuttlBucketEndpoint: did="Tried archiving a bucket" happened="com.splunk.shuttl.archiver.importexport.csv.CsvExportFailedException: Exporttool exited with non zero exit status: 3. Ran exporttool with command: [/mnt/KlevisTestArea/splunk/bin/splunk, cmd, exporttool, /mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8, /mnt/KlevisTestArea/shuttl_archiver/data/format-export-dir/cluster_shuttl_test/db_1353984375_1344011675_8/SPLUNK_BUCKET/db_1353984375_1344011675_8.csv, -csv]" expected="To archive the bucket" index="cluster_shuttl_test" bucket_path="/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1353984375_1344011675_8"

2012-12-17 18:48:58,440 ERROR com.splunk.shuttl.server.mbeans.rest.ShuttlBucketEndpoint: did="Tried archiving a bucket" happened="com.splunk.shuttl.archiver.importexport.csv.CsvExportFailedException: Exporttool exited with non zero exit status: 3. Ran exporttool with command: [/mnt/KlevisTestArea/splunk/bin/splunk, cmd, exporttool, /mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1355769149_1336596984_4, /mnt/KlevisTestArea/shuttl_archiver/data/format-export-dir/cluster_shuttl_test/db_1355769149_1336596984_4/SPLUNK_BUCKET/db_1355769149_1336596984_4.csv, -csv]" expected="To archive the bucket" index="cluster_shuttl_test" bucket_path="/mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/colddb/db_1355769149_1336596984_4"
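
The entries quoted in this comment all appear to share one layout: a timestamp, a level, a logger name, then key="value" pairs such as did, happened and expected. A minimal sketch of pulling those fields apart for triage, assuming that layout holds for every entry and that values never contain a double quote; the names parse_entry, HEADER and FIELD are made up for this illustration and are not part of Shuttl:

```python
import re

# Assumed layout, inferred only from the entries quoted in this thread:
#   <date> <time>,<ms> <LEVEL> <logger>: key="value" key="value" ...
HEADER = re.compile(
    r'^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) '
    r'(?P<level>[A-Z]+) (?P<logger>[\w.$]+): (?P<body>.*)$'
)
# A field is a bare word, an equals sign, and a double-quoted value.
# Assumption: values never contain a double quote themselves.
FIELD = re.compile(r'(\w+)="([^"]*)"')


def parse_entry(line):
    """Return a dict for one log entry, or None if the line does not match."""
    match = HEADER.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    body = entry.pop("body")
    # Collect did/happened/expected and any extra fields (bucket, exit_code, ...).
    entry["fields"] = dict(FIELD.findall(body))
    return entry


# One of the DEBUG entries from this comment, used as sample input.
sample = (
    '2012-12-17 18:48:58,426 DEBUG '
    'com.splunk.shuttl.archiver.importexport.ShellExecutor: '
    'did="Waited for csv export process to finish." '
    'happened="java.lang.InterruptedException" '
    'expected="It to finish."'
)

entry = parse_entry(sample)
print(entry["level"], entry["fields"]["happened"])
```

Running parse_entry over each entry and grouping by the happened field would make it easier to see that the failures here fall into a few groups: the missing splunk-need-optimize.dat file, the copy endpoint not responding, and exporttool exiting with status 3.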

Of the 5 buckets that ended up in colddb on the lone-slave, 4 were transferred to the local HDFS partition. None of the buckets were replicated, and hence none were transferred to the other Hadoop nodes. I am running with two separate storage paths.

The path used on the lone-slave is: /user/ec2-user/KlevisTestAreaLoneSlave/archive_data/local/the-lone-slave/cluster_shuttl_test

The path used on the other Splunk/Hadoop nodes is: /user/ec2-user/KlevisTestArea/archive_data/hdfs/the-lone-slave/cluster_shuttl_test

petterik commented 11 years ago

Works!