Closed breznak closed 8 years ago
@jefffohl please take a look. It needs a few touches (the UI) and is working pretty well.
@breznak - this seems to work well (excepting a few details, as mentioned in my notes). I suggest that we merge only after I have added some UI controls, and we have dealt with the no-timestamp problem.
:+1:
@jefffohl thanks for your review! yes, waiting to resolve the issues. I'll do the timestamp and if you can make the UI..
resolved merge from last PR, works OK.
OK - thanks. Working on the UI elements now.
@breznak - is there a reason that we want to allow only one field to be highlighted at a time? Perhaps it would be better to let the user choose as many fields as they wish?
@jefffohl yes, I was thinking about that. We could also change the condition, e.g. `<threshold`, or even combine several comparisons together: `price<cash`, ... but that would require a "condition editor and parser", and I didn't want to overdo it.
So I'd suggest just 1 field for anomaly highlighting for now, and we can extend it in another PR. What do you think? We'll also need this code for multiple-field highlighting for #31; this functionality should be relatively easy to implement.
I actually already added the functionality for multiple field highlighting in my PR: https://github.com/breznak/nupic.visualizations/pull/1
@jefffohl this looks awesome! :100: Both the UI for highlighting and the ability to highlight any series! And you've also fixed #74 :)
OK, great! Seems like we can merge this PR then.
Thank you very much!
I'll just try to play a bit with the opacity/radius of the highlight - on bigger data I find it hard to spot, e.g. on `OPF/hotgym_full.csv` for anomalyScore>0.4. What do you think?
Yes, I tried to make a formula for determining the opacity based on the radius. Note that the color for the highlight is always the color of the series it is associated with. I found that with a large radius, the opacity quickly reached 1, which made it impossible to differentiate from the series line. With the formula, it is guaranteed to always be visible against the plot line.
Another approach might be to not use opacity, but calculate a solid, but lighter, color than the plot line.
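A minimal sketch of the "solid, but lighter" idea, assuming series colors are available as `#rrggbb` strings. The `lighten` helper and its `amount` parameter are hypothetical and not part of the PR; it simply blends each channel toward white.

```javascript
// Hypothetical helper: blend a "#rrggbb" color toward white.
// amount = 0 returns the original color, amount = 1 returns white.
function lighten(hex, amount) {
  const m = /^#?([0-9a-f]{6})$/i.exec(hex);
  if (!m) throw new Error("expected a #rrggbb color: " + hex);
  const n = parseInt(m[1], 16);
  const channels = [(n >> 16) & 255, (n >> 8) & 255, n & 255];
  // Move each channel the given fraction of the way to 255.
  const mixed = channels.map(c => Math.round(c + (255 - c) * amount));
  return "#" + mixed.map(c => c.toString(16).padStart(2, "0")).join("");
}

console.log(lighten("#ff0000", 0.5)); // a lighter red for the highlight
```

Because the result is opaque, it would not saturate at large radii the way stacked opacity does, though it also would not let the plot line show through the highlight.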
@jefffohl this fixes the visibility of point anomalies (the highlight area is set to 1% of the graph, relative to the number of items) and of overlapping anomalies (a new highlight is drawn only if its distance from the last highlight is greater than the radius).
What is questionable is the threshold behavior for scaled series: should we conform to the "visible" values, or to the original ones (using `backupCSV`)? In the former case, highlights change as we toggle `Normalize`.
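A rough sketch of the spacing rule described above, under stated assumptions: the radius is taken as 1% of the number of points (measured in index positions), and a point is highlighted only if it is above the threshold and more than one radius away from the previous highlight. `pickHighlights` and its parameters are hypothetical names, not the code in this PR.

```javascript
// Hypothetical sketch: choose which points get a highlight.
// radius is 1% of the series length (at least 1), in index units.
function pickHighlights(values, threshold, fraction = 0.01) {
  const radius = Math.max(1, Math.round(values.length * fraction));
  const picked = [];
  let last = -Infinity; // index of the most recent highlight
  values.forEach((v, i) => {
    // Skip points that would overlap the previous highlight.
    if (v > threshold && i - last > radius) {
      picked.push(i);
      last = i;
    }
  });
  return { radius, picked };
}

// 200 points, anomalies clustered at 10..13 and a lone one at 50.
const scores = new Array(200).fill(0);
[10, 11, 12, 13, 50].forEach(i => { scores[i] = 0.9; });
console.log(pickHighlights(scores, 0.4));
```

Scaling the radius with the number of items keeps a single-point anomaly visible on large files, while the distance check avoids stacking highlights on top of each other in a dense run.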
Thanks @breznak - though it seems to not work exactly right. In the example OPF file, there are four anomaly scores with a value of "1". Setting the threshold to "1" shows nothing, whereas setting the threshold to 0.99 does show something.
Yes, we should deal with the normalized data issue as well.
thanks @jefffohl I'll fix the corner case for highlights! About Normalized: both ways have their support, which do you prefer?
@breznak - I found the problem. It is because DyGraphs is rounding the numbers. We can address this by setting the significant figures option: http://dygraphs.com/options.html#sigFigs
So, I was misled by the label into thinking that values of 1 were not being highlighted, but in actuality, there are no values of 1 in that data set.
@breznak - I am ambivalent about what method we use for dealing with normalized values, but what we want is for the highlighting to not change when toggling the "normalized" option on and off.
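One way to get that behavior, sketched with hypothetical names: keep the raw series alongside the displayed one, and always test the threshold against the raw values, whether or not the displayed copy is normalized. The min-max `normalize` below is an assumption about how scaling might work, not the app's actual code.

```javascript
// Hypothetical min-max scaling of a series into [0, 1] for display.
function normalize(series) {
  const min = Math.min(...series);
  const max = Math.max(...series);
  const span = max - min || 1; // avoid dividing by zero on flat series
  return series.map(v => (v - min) / span);
}

// Indices whose RAW value exceeds the threshold.
function anomalousIndices(rawSeries, threshold) {
  const out = [];
  rawSeries.forEach((v, i) => { if (v > threshold) out.push(i); });
  return out;
}

const raw = [10, 50, 20, 90];
const shown = normalize(raw); // what the graph would draw when normalized

// Same result regardless of the toggle, because only `raw` is consulted.
console.log(anomalousIndices(raw, 40));
// Applying the same threshold to the displayed values would find nothing.
console.log(anomalousIndices(shown, 40));
```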
About `sigFigs`: as we can't know what the precision of the series will be, we shouldn't set it. I think it's OK, as we highlight correctly according to the real raw data; just the labels show rounding sometimes.
@jefffohl should be fixed. Does https://github.com/breznak/nupic.visualizations/commit/4afaa5fbe4dcc51075d2867c6683988d8dc01620 really cause a significant slowdown? (feels like yes here), so `highlightSeriesOpts` should be disabled again...
@jefffohl I think that's all. can we merge?
@breznak - I think we should use `sigFigs`. Many of the useful anomaly scores show up as differences within the range of 0.9 to 0.999. I suggest setting the `sigFigs` value to 5.
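To illustrate why the labels were misleading, here is a small sketch that mimics the two labeling modes. It is not dygraphs' own formatter; `fixedDigits` and `toSigFigs` are hypothetical stand-ins for "fixed digits after the decimal point" and "fixed significant figures", and the score value is made up for the example.

```javascript
// Stand-in for labeling with a fixed number of digits after the decimal point.
function fixedDigits(x, digits) {
  const scale = Math.pow(10, digits);
  return Math.round(x * scale) / scale;
}

// Stand-in for labeling with a fixed number of significant figures.
function toSigFigs(x, n) {
  return Number(x.toPrecision(n));
}

const score = 0.9996; // illustrative value, not taken from the data set
console.log(fixedDigits(score, 2)); // label reads as 1
console.log(toSigFigs(score, 5));   // label keeps 0.9996

// The option as it would appear in the options object passed to dygraphs.
const graphOptions = { sigFigs: 5 };
```

With two digits after the decimal point, everything from 0.995 upward is labeled 1, which is exactly the range where the interesting anomaly scores live.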
@jefffohl added the `sigFigs` as you suggested and merging..
@breznak - thanks! On to the next...
Fixes #3 Fixes #74
New functionality to highlight select series at points where the function value crosses a threshold (significant anomalies, for example)