numenta / nupic-legacy

Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.
http://numenta.org/
GNU Affero General Public License v3.0
6.33k stars 1.56k forks

Boosting creates disturbance in anomaly detection #823

Open breznak opened 10 years ago

breznak commented 10 years ago

As Ritchie found, on a large CLA with a relatively simple problem (so many columns are unused), unexpected spikes appear in anomaly detection after some time.

I've found that it's related to several columns being (unexpectedly) activated because of boosting; this causes the anomaly spike.
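
For reference, a minimal sketch of how a raw anomaly score of this kind is commonly computed: the fraction of currently active columns that were not predicted at the previous step. The function name and the toy column indices are illustrative assumptions, not NuPIC's exact code.

```python
def raw_anomaly_score(active_columns, prev_predicted_columns):
    """Fraction of active columns that were not predicted one step earlier."""
    active = set(active_columns)
    if not active:
        return 0.0
    unpredicted = active - set(prev_predicted_columns)
    return len(unpredicted) / float(len(active))


# Stable input: every active column was predicted.
stable = list(range(10))
print(raw_anomaly_score(stable, stable))   # 0.0

# Hypothetical boost: two previously unused columns (100, 101) win instead
# of columns 8 and 9, although the input itself has not changed.
boosted = list(range(8)) + [100, 101]
print(raw_anomaly_score(boosted, stable))  # 0.2
```

Under this definition, any column that boosting newly activates counts as unpredicted, which is enough to produce a spike on otherwise unchanged input.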

Proposed solutions are:

1/ remember the boosted columns and, for some time window (e.g. 10 steps), exclude them from the anomaly computations (mark them as inactive for the formula above). This will let the columns settle, but can still create an anomaly at the end of the window.

(I think it's safe to exclude them from the computation, as those columns were unused before anyway)

1.B/ before the boost, mark the selected set of columns as predicted.

2/ small change at a time:

Say we want to boost 20 columns. Boost them the normal way, but stretch it over 20 iterations of the CLA, boosting 1 at a time. This will avoid the anomaly.

(? Could it make boosting ineffective in other cases, because it would take longer?)

3/ be smooth: boost the output of the columns selected for boosting with a nice curve (0.1 0.3 0.75 1.0 1.0 0.75 0.3 0.1) over time, i.e. column's output*(1+curve_coef).

(I seem to prefer 2/ and 3/ for being less hacky)

Testing could be done:

Existing scenarios should not get (much) worse. Ritchie also provided a triggering example: a large CLA and a relatively simple problem, where over time boosting kicks in and causes spikes (anomalies :P ) in anomaly detection.

TODO:

CC @subutai @chetan51 @h2suzuki

breznak commented 10 years ago

@rhyolight could you create new anomaly label? :)

breznak commented 10 years ago

Discussed on the mailing list (nupic-discuss) under the thread "[nupic-discuss] Oddity in sine wave example".

subutai commented 10 years ago

3/ be smooth: boost the output of the columns selected for boosting with a nice curve (0.1 0.3 0.75 1.0 1.0 0.75 0.3 0.1) over time, i.e. column's output*(1+curve_coef).

This strategy could be tested very easily. The current implementation already applies a function for this in _updateBoostFactors; you could just modify that function. We do need some good test cases though, as pointed out by @chetan51
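
A minimal sketch of the curve idea from proposal 3/, assuming each selected column tracks how many steps have passed since it was selected for boosting. The helper name and the `steps_since_selected` bookkeeping are hypothetical; this is not a patch to _updateBoostFactors.

```python
# Curve coefficients quoted in proposal 3/.
BOOST_CURVE = (0.1, 0.3, 0.75, 1.0, 1.0, 0.75, 0.3, 0.1)


def smoothed_output(column_output, steps_since_selected, curve=BOOST_CURVE):
    """Scale a column's output by (1 + curve_coef) while the curve lasts.

    Outside the curve (before selection or after the last coefficient)
    the output is returned unchanged.
    """
    if 0 <= steps_since_selected < len(curve):
        return column_output * (1.0 + curve[steps_since_selected])
    return column_output


# Output of one column over the 8 curve steps plus one step afterwards.
for step in range(len(BOOST_CURVE) + 1):
    print(step, smoothed_output(10.0, step))
```

The boost ramps up and back down over eight steps instead of switching on at once; whether that is gentle enough to avoid an anomaly spike would still need the test cases mentioned above.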

rhyolight commented 9 years ago

@breznak This issue is labeled blocked, can you explain why?

breznak commented 8 years ago

This is an example of the adverse effect the boosting disturbances have:

(iteration, anomaly score) on a constant(!) function, using the SP only.

42 0.0
43 0.0
44 0.0
45 0.0
46 0.0
47 0.0
48 0.0
49 0.0 /////////////// BOOST ///////////////////
50 0.0240963855422
51 0.120481927711
52 0.120481927711
53 0.120481927711
54 0.120481927711
55 0.120481927711
56 0.120481927711
57 0.130952380952
58 0.141176470588
59 0.141176470588
60 0.120481927711
61 0.141176470588
62 0.179775280899
63 0.188888888889
64 0.120481927711
65 0.139534883721
66 0.0
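
As a reading aid, the scores above are consistent with small integer ratios, which is what a raw anomaly score (unpredicted active columns divided by active columns) would produce. The sketch below recovers such ratios; the cap of 100 on the denominator is an assumption, and the thread does not state the actual column counts.

```python
from fractions import Fraction


def as_column_ratio(score, max_active=100):
    """Closest fraction to `score` whose denominator is at most `max_active`."""
    return Fraction(score).limit_denominator(max_active)


# A few of the scores printed in the log above.
for score in (0.0240963855422, 0.120481927711, 0.188888888889):
    print(score, as_column_ratio(score))
```

For instance, 0.120481927711 matches 10/83, which would correspond to 10 unpredicted columns out of 83 active ones if the assumption holds.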

breznak commented 8 years ago

Thoughts when approaching the implementation ...

3/ be smooth: boost the output of the columns selected for boosting with a nice curve (0.1 0.3 0.75 1.0 1.0 0.75 0.3 0.1) over time, i.e. column's output*(1+curve_coef).

This strategy could be tested very easily. The current implementation already applies a function for this in _updateBoostFactors; you could just modify that function. We do need some good test cases though, as pointed out by @chetan51

Ok, I'm starting to dislike this strategy, as I'd like to be able to set a maximum allowed disturbance caused by boosting (as a % of columns), which is impossible with this approach. Also, boosting already has a linear interpolation for the (1.0:maxBoost) values in updateBoostFactors.
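
The linear interpolation mentioned here can be sketched as follows, assuming it runs from maxBoost for a column with zero active duty cycle down to 1.0 once the column reaches its minimum duty cycle. The function and parameter names are illustrative, and the actual updateBoostFactors code may differ in detail.

```python
def linear_boost_factor(active_duty_cycle, min_duty_cycle, max_boost):
    """Interpolate linearly between max_boost (duty cycle 0) and 1.0.

    Columns at or above the minimum duty cycle are not boosted (factor 1.0).
    """
    if min_duty_cycle <= 0 or active_duty_cycle >= min_duty_cycle:
        return 1.0
    slope = (1.0 - max_boost) / min_duty_cycle
    return slope * active_duty_cycle + max_boost


# Boost factor for a never-active, a half-way, and a sufficiently active column.
for duty in (0.0, 0.05, 0.1):
    print(duty, linear_boost_factor(duty, min_duty_cycle=0.1, max_boost=10.0))
```

Note that this controls how strongly each column is boosted, not how many columns change at once, which is why it does not by itself bound the disturbance as a percentage of columns.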

breznak commented 8 years ago

1.B/ before the boost, mark the selected set of columns as predicted (in the TP).

Ideally, I'd like to solve it in the SP only, not in TP+anomaly; but that's an option too.

breznak commented 8 years ago

1/ remember the boosted columns and, for some time window (e.g. 10 steps), exclude them from the anomaly computations (mark them as inactive for the formula above). This will let the columns settle, but can still create an anomaly at the end of the window. (I think it's safe to exclude them from the computation, as those columns were unused before anyway)

This leads to an interesting idea: create a bitmask filter that is applied on top of the SP's output. This could serve to fix the ties (like the current solution, CC @scottpurdy) and to invert the effects of updateBoostFactors(), but it still cannot cope with bumpUpWeakColumns().
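
A minimal sketch of such a filter, assuming the caller knows which columns were just boosted and reports them explicitly. The class, its method names, and the `settle_steps` parameter are hypothetical and do not exist in NuPIC.

```python
class BoostedColumnFilter(object):
    """Hides recently boosted columns from the SP output for a fixed window."""

    def __init__(self, settle_steps=10):
        self.settle_steps = settle_steps
        self._remaining = {}  # column index -> steps it is still hidden

    def mark_boosted(self, columns):
        """Start (or restart) the hiding window for the given columns."""
        for column in columns:
            self._remaining[column] = self.settle_steps

    def apply(self, active_columns):
        """Return the active columns minus the hidden ones, then age the mask."""
        visible = [c for c in active_columns if c not in self._remaining]
        self._remaining = dict(
            (c, n - 1) for c, n in self._remaining.items() if n > 1)
        return visible


# Column 5 was just boosted; hide it for two steps.
mask = BoostedColumnFilter(settle_steps=2)
mask.mark_boosted([5])
print(mask.apply([1, 5]))  # [1]
print(mask.apply([1, 5]))  # [1]
print(mask.apply([1, 5]))  # [1, 5]
```

The filtered list would be what gets passed to the anomaly computation. As noted in proposal 1/, a column that is still unpredicted when its window ends can produce an anomaly at that point.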

breznak commented 8 years ago

Therefore, the winner to me looks to be

2/ small change at a time: say we want to boost 20 columns. Boost them the normal way, but stretch it over 20 iterations of the CLA, boosting 1 at a time. This will avoid the anomaly. (? Could it make boosting ineffective in other cases, because it would take longer?)

So split the boost update into smaller groups: each group would need a boostedSettlePeriod number of steps to settle, and then another group can be boosted.
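
A minimal sketch of this grouping, assuming boost candidates are queued and released a few at a time with a settle period between groups. The class and the `group_size` parameter are hypothetical; only the name boostedSettlePeriod comes from the comment above.

```python
from collections import deque


class StaggeredBoostScheduler(object):
    """Releases queued boost candidates in small groups, with a pause between."""

    def __init__(self, group_size=1, boosted_settle_period=0):
        self.group_size = group_size
        self.boosted_settle_period = boosted_settle_period
        self._pending = deque()
        self._cooldown = 0

    def request(self, columns):
        """Queue columns for boosting, skipping ones that are already queued."""
        for column in columns:
            if column not in self._pending:
                self._pending.append(column)

    def step(self):
        """Return the columns to boost in this iteration (possibly none)."""
        if self._cooldown > 0:
            self._cooldown -= 1
            return []
        count = min(self.group_size, len(self._pending))
        group = [self._pending.popleft() for _ in range(count)]
        if group:
            self._cooldown = self.boosted_settle_period
        return group


scheduler = StaggeredBoostScheduler(group_size=2, boosted_settle_period=2)
scheduler.request([1, 2, 3, 4, 5])
print([scheduler.step() for _ in range(7)])
# [[1, 2], [], [], [3, 4], [], [], [5]]
```

With `group_size=1` and `boosted_settle_period=0` this reduces to boosting one column per iteration. Deriving `group_size` from a percentage of the total column count would be one way to express a maximum allowed disturbance.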

breznak commented 8 years ago

Btw, this led me to https://github.com/numenta/nupic/issues/2648

scottpurdy commented 8 years ago

Before we change the implementation in NuPIC you should show that your proposal improves some real world applications. Since you are specifically concerned about the impact on anomaly detection, you could make the change and try running the modified algorithms on NAB. You should try running it with several different seeds, since the results vary somewhat based on the random seed.

The results on NAB don't necessarily have to be better, but we want to make sure they aren't worse. And if they are roughly the same then you will need to show some other benefit of the change. But NAB results are a good starting point.
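
One way to organize such a comparison is sketched below. It assumes a caller-supplied function `run_once(seed)` that runs a benchmark with the given random seed and returns a single score where higher is better; that function, the helper names, and the one-standard-deviation tolerance are all hypothetical and not part of NAB.

```python
import statistics


def scores_over_seeds(run_once, seeds):
    """Run the benchmark once per seed and collect the resulting scores."""
    return [run_once(seed) for seed in seeds]


def not_worse(baseline_scores, candidate_scores, tolerance_stdevs=1.0):
    """True if the candidate's mean score is within the baseline's seed noise.

    Needs at least two baseline scores, because the sample standard
    deviation is undefined for a single value.
    """
    baseline_mean = statistics.mean(baseline_scores)
    baseline_sd = statistics.stdev(baseline_scores)
    candidate_mean = statistics.mean(candidate_scores)
    return candidate_mean >= baseline_mean - tolerance_stdevs * baseline_sd
```

A run would collect `scores_over_seeds(...)` once for the unmodified algorithm and once for the modified one, using the same seeds, and then compare the two lists with `not_worse`.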

breznak commented 8 years ago

Before we change the implementation in NuPIC you should show that your proposal improves some real world applications

Ok, I'll merge the relevant test and the new boosting implementation; and keep it out of tree for now.

Is a sine wave a good enough example? I can show ECG data, but for real-world noisy data I can't prove the disturbances come from boosting only. We can still see if/how it helps, though.

scottpurdy commented 8 years ago

I suggest running it on NAB.

rhyolight commented 8 years ago

Until it can be proven that boosting is negatively affecting anomaly detection in a real example like a NAB dataset, I'm considering closing this.