htm-community / nupic.visualizations

Web application for interactive graphs, anomaly highlighting and online monitoring.
MIT License

Anomaly Highlighting #68

Closed breznak closed 8 years ago

breznak commented 8 years ago

Fixes #3 Fixes #74

New functionality to highlight select series at points where the function value crosses a threshold (significant anomalies, for example)

breznak commented 8 years ago

@jefffohl please take a look. It needs a few touches (the UI) and is working pretty well.

jefffohl commented 8 years ago

@breznak - this seems to work well (excepting a few details, as mentioned in my notes). I suggest that we merge only after I have added some UI controls, and we have dealt with the no-timestamp problem.

:+1:

breznak commented 8 years ago

@jefffohl thanks for your review! yes, waiting to resolve the issues. I'll do the timestamp and if you can make the UI..

breznak commented 8 years ago

resolved merge from last PR, works OK.

jefffohl commented 8 years ago

OK - thanks. Working on the UI elements now.

jefffohl commented 8 years ago

@breznak - is there a reason that we want to allow only one field to be highlighted at a time? Perhaps it would be better to let the user choose as many fields as they wish?

breznak commented 8 years ago

@jefffohl yes, i was thinking about that. We could also change condition, eg <threshold, or even combine several comparisons together: price<cash, ... but that would require a "condition editor and parser", and I didn't want to overdo it.

So I'd suggest just 1 field for anomaly highlighting for now, and we can extend it in another PR. What do you think? We'll also need this code for multiple-field highlighting for #31, and that functionality should be relatively easy to implement.

jefffohl commented 8 years ago

I actually already added the functionality for multiple field highlighting in my PR: https://github.com/breznak/nupic.visualizations/pull/1

breznak commented 8 years ago

@jefffohl this looks awesome! :100: Both the UI for highlighting and the ability to highlight any series! And you've also fixed #74 :)

jefffohl commented 8 years ago

OK, great! Seems like we can merge this PR then.

breznak commented 8 years ago

Thank you very much! I'll just try to play a bit with the opacity/radius of the highlight. On bigger data I find it hard to spot, e.g. on OPF/hotgym_full.csv for anomalyScore>0.4. What do you think?

jefffohl commented 8 years ago

Yes, I tried to make a formula for determining the opacity based on the radius. Note that the color for the highlight is always the color of the series it is associated with. I found that with a large radius, the opacity quickly reached 1, which made it impossible to differentiate from the series line. With the formula, it is guaranteed to always be visible against the plot line.

Another approach might be to not use opacity, but calculate a solid, but lighter, color than the plot line.
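To illustrate the "solid but lighter color" idea, here is a minimal sketch, not the code from this PR. The helper name `lightenColor`, the `#rrggbb` input format, and the blend factor are all assumptions made for the example.

```javascript
// Hypothetical helper (not from this PR): blend a "#rrggbb" color toward
// white by `amount`, where 0 keeps the color and 1 yields pure white.
function lightenColor(hex, amount) {
  // Parse the three two-digit hex channels (red, green, blue).
  const channels = [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16));
  // Move each channel part of the way toward 255.
  const blended = channels.map((c) => Math.round(c + (255 - c) * amount));
  // Re-encode as a zero-padded "#rrggbb" string.
  return "#" + blended.map((c) => c.toString(16).padStart(2, "0")).join("");
}
```

A highlight drawn with `lightenColor(seriesColor, 0.5)` stays in the same hue as the series line but never becomes identical to it, whatever the radius.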

breznak commented 8 years ago

@jefffohl this fixes the visibility of point anomalies (the highlight area is set to 1% of the graph, relative to the number of items) and of overlapping anomalies (a new highlight is drawn only if its distance from the last highlight is greater than the radius).
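The spacing rule can be sketched roughly as follows. This is an illustration under assumptions, not the committed code: distance is measured in row indices, the comparison is a strict `>`, and `selectHighlights` is a hypothetical name.

```javascript
// Hypothetical sketch: return the indices that get a highlight.
// A point qualifies when its value exceeds the threshold AND it lies more
// than `radius` rows after the previously highlighted point.
function selectHighlights(values, threshold, radius) {
  const picked = [];
  let last = -Infinity; // index of the most recent highlight
  for (let i = 0; i < values.length; i++) {
    if (values[i] > threshold && i - last > radius) {
      picked.push(i);
      last = i;
    }
  }
  return picked;
}

// Assumed reading of "1% of the graph": radius as 1% of the row count,
// with a floor of one row.
function radiusFor(values) {
  return Math.max(1, Math.round(0.01 * values.length));
}
```

With this rule a dense run of anomalous rows produces one highlight per `radius` rows instead of a stack of overlapping circles.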

breznak commented 8 years ago

The threshold behavior for scaled series is an open question: should we compare against the "visible" values, or against the original ones (using backupCSV)? In the former case, highlights change as we toggle Normalize.

jefffohl commented 8 years ago

Thanks @breznak - though it seems to not work exactly right. In the example OPF file, there are four anomaly scores with a value of "1". Setting the threshold to "1" shows nothing, whereas setting the threshold to 0.99 does show something.

jefffohl commented 8 years ago

Yes, we should deal with the normalized data issue as well.

breznak commented 8 years ago

thanks @jefffohl I'll fix the corner case for highlights! About Normalized: both ways have their support, which do you prefer?

jefffohl commented 8 years ago

@breznak - I found the problem. It is because DyGraphs is rounding the numbers. We can address this by setting the significant figures option: http://dygraphs.com/options.html#sigFigs

So, I was misled by the label into thinking that values of 1 were not being highlighted, but in actuality, there are no values of 1 in that data set.
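The effect is easy to reproduce with plain JavaScript number formatting. The value 0.9996 below is a made-up stand-in, not a row from the OPF file, and the helper names are hypothetical. The sketch only shows how a rounded label can disagree with the raw value the threshold is compared against.

```javascript
// Hypothetical helpers illustrating label rounding versus raw comparison.
function labelWithDecimals(value, digits) {
  return value.toFixed(digits); // fixed number of decimals
}

function labelWithSigFigs(value, sigFigs) {
  return value.toPrecision(sigFigs); // fixed number of significant figures
}

function exceeds(value, threshold) {
  return value > threshold; // the comparison uses the raw value
}

const raw = 0.9996; // made-up example value
console.log(labelWithDecimals(raw, 2)); // label reads as 1.00
console.log(exceeds(raw, 1));           // false: raw value is below 1
console.log(exceeds(raw, 0.99));        // true
console.log(labelWithSigFigs(raw, 5));  // label keeps the extra digits
```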

jefffohl commented 8 years ago

@breznak - I am ambivalent about what method we use for dealing with normalized values, but what we want is for the highlighting to not change when toggling the "normalized" option on and off.
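One way to get that behavior, sketched under assumptions: always test the threshold against the raw values and normalize only what is drawn. The min-max `normalize` and the `buildPlot` shape are hypothetical and are not how the app necessarily scales or stores data.

```javascript
// Hypothetical min-max scaling to [0, 1]; the app's real scaling may differ.
function normalize(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const span = max - min || 1; // avoid dividing by zero for flat series
  return values.map((v) => (v - min) / span);
}

// Highlights are always derived from the raw values, so they are
// independent of the display mode.
function highlightIndices(rawValues, threshold) {
  const indices = [];
  rawValues.forEach((v, i) => {
    if (v > threshold) indices.push(i);
  });
  return indices;
}

// Only the displayed values depend on the `normalized` flag.
function buildPlot(rawValues, threshold, normalized) {
  return {
    shown: normalized ? normalize(rawValues) : rawValues.slice(),
    highlights: highlightIndices(rawValues, threshold),
  };
}
```

Calling `buildPlot(raw, threshold, true)` and `buildPlot(raw, threshold, false)` yields the same `highlights` array, which is the invariant described above.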

breznak commented 8 years ago

About sigFigs: as we can't know what the precision of the series will be, we shouldn't set it. I think it's OK, as we highlight correctly according to the real raw data; just the labels show rounding sometimes.

breznak commented 8 years ago

@jefffohl should be fixed. Does https://github.com/breznak/nupic.visualizations/commit/4afaa5fbe4dcc51075d2867c6683988d8dc01620 really cause a signif. slowdown? (feels like yes here), so highlightSeriesOpts should be again disabled...

breznak commented 8 years ago

@jefffohl I think that's all. can we merge?

jefffohl commented 8 years ago

@breznak - I think we should use sigFigs. Many of the useful anomaly scores differ only within the range 0.9 to 0.999. I suggest setting the sigFigs value to 5.

breznak commented 8 years ago

@jefffohl added the sigFigs as you suggested and merging..

jefffohl commented 8 years ago

@breznak - thanks! On to the next...