Kansa Analysis: Add -AnalysisOnly flag

davehull commented 10 years ago

Add a flag for running Kansa in an analysis-only mode against data that has already been collected. The user configures the .\Analysis\Analysis.conf file according to their needs, then runs Kansa with the -AnalysisOnly flag and a -DataDir argument, and Kansa runs all of the appropriate analysis scripts against the data found in the directory passed to -DataDir.

jvaldezjr1 commented 9 years ago

Should the analysis.conf file drive the analysis modules? That is, we pass the directory containing the data (for example, Output_201506051234) and the analysis modules run accordingly? Are most of the analysis modules leveraging LogParser, and do the scripts themselves look for the output file named after the module that generated it?

Or would it be better to select modules based on file names, so that whatever data is in the directory drives which modules are run?

davehull commented 9 years ago

As it is coded today, if Kansa is run with the -Analysis flag, it runs the normal data collection based on the modules\modules.conf file. After all of the data comes in, it executes the analysis scripts listed in the analysis\analysis.conf file. Each analysis script has a line in its .SYNOPSIS section that starts with DATADIR, followed by the name of the data directory where that script will find its input. For example, the DATADIR for Analysis\ASEP\Get-ASEPImagePathLaunchStringMD5Stack.ps1 is Autorunsc, so Kansa will run that script against output_yyyyMMddHHmmss\Autorunsc*autorunsc.tsv. And yes, LogParser is used by almost every analysis script.
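To make the DATADIR convention concrete, here is a minimal sketch of the lookup in Python. It is not Kansa's code (Kansa is PowerShell). The sample header layout, the host-prefixed file names, and the function names `find_datadir` and `select_inputs` are assumptions made for illustration only.

```python
import fnmatch
import re

# Hypothetical script header; the real .SYNOPSIS layout may differ.
SAMPLE_SCRIPT = """<#
.SYNOPSIS
Get-ASEPImagePathLaunchStringMD5Stack.ps1
Stacks autoruns entries by image path, launch string and MD5.
DATADIR Autorunsc
#>
"""


def find_datadir(script_text):
    """Return the token that follows DATADIR at the start of a line, or None."""
    for line in script_text.splitlines():
        match = re.match(r"\s*DATADIR\s+(\S+)", line)
        if match:
            return match.group(1)
    return None


def select_inputs(datadir, filenames):
    """Keep file names matching *<datadir>.tsv, ignoring case."""
    pattern = "*{}.tsv".format(datadir.lower())
    return [name for name in filenames
            if fnmatch.fnmatchcase(name.lower(), pattern)]


if __name__ == "__main__":
    datadir = find_datadir(SAMPLE_SCRIPT)
    # Hypothetical file names inside the Autorunsc data directory.
    files = ["HOST1-autorunsc.tsv", "HOST2-Autorunsc.tsv", "HOST1-netstat.tsv"]
    print(datadir, select_inputs(datadir, files))
```

An -AnalysisOnly mode could presumably reuse the same lookup, pointing it at the directory given to -DataDir instead of a freshly created output directory.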

There's a lot of cleanup that could be done here. Overall I think the architecture for this is OK, but Analysis\Get-LogParserStack.ps1 replaces all of the analysis scripts that do stack ranking. Get-LogParserStack.ps1 can run against any csv or tsv file (other delimiters are also supported). It reads the headers of the input files to work out the schema, then prompts the user interactively for the fields they want to stack on. It lets you save your LogParser query to a file so you can edit and tweak it if needed, and it lets you save the output of the query to a file. It's very flexible, more so than the 30+ analysis scripts. I wrote it because I wanted something that would work for many different data sets, rather than having to maintain a specific analysis script for each one.
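For readers unfamiliar with stack ranking, the idea can be sketched in a few lines of Python. This is not how Get-LogParserStack.ps1 is implemented (it builds LogParser queries and prompts interactively). The column names, the sample rows, and the rarest-first ordering are assumptions chosen for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical tab-separated data; real Kansa output has other columns.
SAMPLE_TSV = (
    "ImagePath\tMD5\n"
    "a.exe\t111\n"
    "a.exe\t111\n"
    "b.exe\t222\n"
)


def read_schema(text, delimiter="\t"):
    """Return the column names from the header row."""
    return next(csv.reader(io.StringIO(text), delimiter=delimiter))


def stack(text, fields, delimiter="\t"):
    """Count each distinct combination of the chosen fields, rarest first."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    counts = Counter(tuple(row[f] for f in fields) for row in reader)
    # Sort by count, then by value so ties come out in a stable order.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    print(read_schema(SAMPLE_TSV))
    for values, count in stack(SAMPLE_TSV, ["ImagePath", "MD5"]):
        print(count, values)
```

Because the fields to stack on are a parameter rather than hard-coded, one routine covers any delimited file, which is the same trade-off described above for replacing many per-data-set scripts with a single general one.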

There are analysis scripts that don't do stack ranking, and we'll certainly add more in the future, so not everything is a stacking or frequency analysis problem. But the bulk of the existing analysis scripts are redundant at this point, given Get-LogParserStack.ps1.

davehull / Kansa

Kansa Analysis: Add -AnalysisOnly flag #56