# RunWith-IT Analytic Engine

This package is the Analytic Engine in the RunWith-IT stack.
## Requirements

- r-statistics
- Node 4.6.2
- npm (installed automatically with Node)
- gulp:

  ```
  npm install -g gulp
  ```

- For testing:

  ```
  npm install -g jasmine
  npm install -g node-gyp
  sudo apt-get install libkrb5-dev libgssapi-krb5-2
  npm install -g fibers
  ```

- MongoDB
## Instructions

```
npm install -g gulp
npm install
gulp                # builds src into the dist directory
cd dist
node r-adapter.js   # test file for exercising r-statistics with Node
```

- `node dist/long-running-process-test.js` compares all metrics; this takes about two hours depending on hardware and internet bandwidth.
## Testing

Tests use the Jasmine test framework and should all be placed in the spec/analytic-engine directory. Note that we do not test any async methods with Jasmine, as the framework seems to have issues with multithreaded tests.

```
npm install
npm test
```
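A spec in spec/analytic-engine might look like the following sketch. The `mean` helper and the file name are hypothetical (a real spec would `require()` code from dist/), and the shims at the top are only there so the example also runs standalone outside Jasmine:

```javascript
// spec/analytic-engine/mean.spec.js -- hypothetical example spec.
// Under Jasmine the real describe/it/expect globals are used; the
// shims below only kick in when the file is run directly with node.
const describe = global.describe || function (name, fn) { fn(); };
const it = global.it || function (name, fn) { fn(); };
const expect = global.expect || function (actual) {
  return {
    toBe: function (expected) {
      if (actual !== expected) {
        throw new Error('expected ' + expected + ', got ' + actual);
      }
    },
  };
};

// Hypothetical synchronous helper under test.
function mean(values) {
  return values.reduce(function (a, b) { return a + b; }, 0) / values.length;
}

describe('mean', function () {
  it('averages a list of numbers', function () {
    expect(mean([1, 2, 3, 4])).toBe(2.5);
  });
});
```

Note the helper is synchronous, in keeping with the rule above about not testing async methods with Jasmine.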
## Structure

- r-modules/ - Contains the *.R files that are called by the JavaScript files in the src/ directory.
- src/ - JavaScript source files. We use ES6 since it is awesome.
- dist/ - Does not exist until `gulp` has been run; contains the "compiled" .js files.
- spec/ - Contains the unit test directory.
## How To

### CRON TASK

Using the following API call, with the date and metric parameters set appropriately, you can retrieve a list of deviant data points:

```
http://162.246.157.107:8888/call?mdate1=1474110000&mdate2=1474111800&m1=invidi.webapp.localhost_localdomain.request.total_response_time.mean&m2=invidi.webapp.localhost_localdomain.database.request.findEtl.error_gauge&func=2
```
There is also a UI at http://runwithittest.azurewebsites.net/ which uses this API to compare metrics and find deviant points.
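A small helper for constructing such calls, using the host and parameter names from the example above (`mdate1`/`mdate2` are Unix timestamps, `m1`/`m2` are metric names, `func` selects the analysis function):

```javascript
// Build a /call URL for the analytic engine's HTTP API.
function buildCallUrl(host, opts) {
  var query = ['mdate1', 'mdate2', 'm1', 'm2', 'func']
    .map(function (key) { return key + '=' + encodeURIComponent(opts[key]); })
    .join('&');
  return 'http://' + host + '/call?' + query;
}

// Reproduces the example call above:
var url = buildCallUrl('162.246.157.107:8888', {
  mdate1: 1474110000,
  mdate2: 1474111800,
  m1: 'invidi.webapp.localhost_localdomain.request.total_response_time.mean',
  m2: 'invidi.webapp.localhost_localdomain.database.request.findEtl.error_gauge',
  func: 2,
});
```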
## TODO

- More robust interpolation of data points. Currently we may miss local minima and maxima in a dataset, which could skew the results of covariance and correlation analysis.
- Ensure that when comparing sets of data points, we compare points with the same time spacing. If a metric has differing intervals or gaps, that should be reflected in its number of data points. Currently we assume we are always comparing the same span of time, and we simply interpolate extra points in one metric to match the other. We always create interpolated sets with even spacing over the timeframe, and we need to ensure that holds for the other metric as well. That may mean interpolating both metrics, but first we should address whether interpolation is causing the data to lose points of interest that line up in time with points in the other data set.
- Create the API that talks to a front end of some kind. This API just needs to call the methods or functions that perform the analysis.
- Save the results of analysis in the database in case the program terminates.
- Provide a way to convert that saved output into a JSON Grafana dashboard (e.g. the top 20 metrics most correlated with the search should produce a dashboard with those metrics ordered on the page).
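The interpolation items above amount to resampling two metrics onto a common, evenly spaced time grid. A minimal linear-interpolation sketch (illustrative only, not the engine's current implementation; note that linear interpolation is exactly the step that can smooth away local minima and maxima between samples):

```javascript
// Linearly interpolate a metric (array of {t, v} points sorted by t)
// at an arbitrary timestamp, clamping outside the observed range.
function interpolateAt(points, t) {
  if (t <= points[0].t) return points[0].v;
  var last = points[points.length - 1];
  if (t >= last.t) return last.v;
  for (var i = 1; i < points.length; i++) {
    if (points[i].t >= t) {
      var a = points[i - 1];
      var b = points[i];
      var frac = (t - a.t) / (b.t - a.t);
      return a.v + frac * (b.v - a.v);
    }
  }
}

// Resample a metric onto n evenly spaced timestamps in [t0, t1],
// so two metrics can be compared point-for-point over the same span.
function resample(points, t0, t1, n) {
  var out = [];
  var step = (t1 - t0) / (n - 1);
  for (var i = 0; i < n; i++) {
    var t = t0 + i * step;
    out.push({ t: t, v: interpolateAt(points, t) });
  }
  return out;
}
```

Resampling both metrics onto the same grid (rather than stretching one to match the other) is one way to address the uneven-spacing concern above, at the cost of interpolating both data sets.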