htm-community / htm.core

Actively developed Hierarchical Temporal Memory (HTM) community fork (continuation) of NuPIC. Implementation for C++ and Python
http://numenta.org
GNU Affero General Public License v3.0

Problems with anomaly detection #938

Open zrosin opened 3 years ago

zrosin commented 3 years ago

I'm having trouble performing anomaly detection in Python. I'm using the hotgym example and am struggling to detect anomalies. I posted on the HTM forum earlier and think this is worthy of an issue here.

First, I want to point out that the anomaly likelihood class isn't actually being used in that example, even though it seems to work correctly. It's already in the code, just never used. Anyway, I'm pretty sure I'm getting the expected results from the example, but I was never a fan of this dataset because the actual anomalies are hard to see. [image]

In the forum post I mentioned above, I got some useful advice on ways to debug this; the images and questions there may help with some context. Still, it seems to me that tm.anomaly is not working correctly: rather than increasing at anomalies, it decreases. Removing the date encoder does seem to fix this, but that obviously removes the temporal context of the data, so it is not a real solution. Here's hotgym running with the date encoding removed on a custom data set. It seems to pick up the anomalies at 300 and 400, but the noise prevents 300 from being flagged and delays the detection of 400. [image]
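For context on what a raw score like tm.anomaly is expected to report: the raw anomaly score is usually described as the fraction of currently active columns that were not predicted at the previous step, so it should go up, not down, when the input becomes surprising. Below is a minimal pure-Python sketch of that definition. The helper name raw_anomaly_score and the use of plain lists of column indices are assumptions for illustration only; this is not the htm.core implementation.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of active columns that were NOT predicted.

    Hypothetical helper for illustration; not part of htm.core.
    Both arguments are iterables of column indices.
    """
    active = set(active_columns)
    if not active:
        # No active columns: nothing to be surprised about.
        return 0.0
    predicted = set(predicted_columns)
    return 1.0 - len(active & predicted) / len(active)


# Fully predicted input, fully unpredicted input, half predicted input.
print(raw_anomaly_score([1, 2, 3, 4], [1, 2, 3, 4]))  # 0.0
print(raw_anomaly_score([1, 2, 3, 4], [7, 8]))        # 1.0
print(raw_anomaly_score([1, 2, 3, 4], [1, 2]))        # 0.5
```

Under this definition, a score that drops at a known anomaly would mean the input became more predictable at that step, which is the opposite of what is expected.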

I wanted to compare this to NuPIC to see whether the results were actually correct and my understanding was wrong, but NuPIC seems to be working properly. Note that the parameters are not exactly the same for the two runs: htm.core is running the hotgym.py parameters, while NuPIC is running these parameters. [image]

I'm wondering if I'm doing something critically wrong or if there is an actual issue here. Thanks in advance for the help.

ctrl-z-9000-times commented 3 years ago

Hi, I'm able to reproduce this issue. I see what you mean about the example using raw anomaly scores instead of the AnomalyLikelihood.
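To make the difference between a raw anomaly score and an anomaly likelihood concrete, here is a heavily simplified sketch of the general idea: keep a history of raw scores, model them as a Gaussian, and report how unusual the recent average is. The class name SimpleLikelihood, its parameters, and the window sizes are all assumptions made for illustration. It is not the AnomalyLikelihood class from htm.core or NuPIC, and it leaves out everything those classes do beyond this basic idea.

```python
import math
from collections import deque


class SimpleLikelihood:
    """Simplified stand-in for illustration; NOT the htm.core class."""

    def __init__(self, history=200, short_window=10):
        self.scores = deque(maxlen=history)  # rolling history of raw scores
        self.short_window = short_window     # size of the "recent" window

    def update(self, raw_score):
        self.scores.append(raw_score)
        n = len(self.scores)
        if n < self.short_window + 1:
            return 0.5  # not enough history yet: stay neutral

        # Long-term mean and standard deviation of the raw scores.
        mean = sum(self.scores) / n
        var = sum((s - mean) ** 2 for s in self.scores) / n
        std = max(math.sqrt(var), 1e-3)  # floor to avoid division by zero

        # Short-term average of the most recent raw scores.
        recent = list(self.scores)[-self.short_window:]
        short_mean = sum(recent) / len(recent)

        # Gaussian CDF of the z-score: close to 1.0 when the recent
        # average is far above the long-term mean.
        z = (short_mean - mean) / std
        return 0.5 * math.erfc(-z / math.sqrt(2))


model = SimpleLikelihood()
for _ in range(100):
    steady = model.update(0.1)   # flat raw score: likelihood stays near 0.5
for _ in range(10):
    spiking = model.update(0.9)  # sustained jump: likelihood approaches 1.0
print(steady, spiking)
```

Because the output depends on how the recent scores compare to their own history, a likelihood of this kind can still flag a change even when the raw score behaves oddly.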

ctrl-z-9000-times commented 3 years ago

So I took a look further into this issue...

The Python code for the AnomalyLikelihood class is a mess. I'm pretty sure there are a few bugs in there. Also, it contains a bunch of special cases for detecting anomalies in situations where HTM systematically fails to detect them.

So I rewrote the class to work much better. It's on a branch of this repo: git checkout anomaly_likelihood_rewrite. I changed the API, so you'll have to modify your program if you want to use it. It performs worse on the NAB benchmark, probably because I removed all of the special cases.

The hot gym example now looks like this: [image: hotgym]

zrosin commented 3 years ago

Thanks for the quick reply.

I don't believe the source of my issues was the likelihood class. I think it was due to either an error in or a misuse of the anomaly score function.

In this figure you can see the anomaly score (blue) drop at the anomalies just after 200, just after 300, and just after 400. You can also see it rise around 600-700 despite no change in the data. [image] The anomaly likelihood manages to save the day and marks the first two anomalies anyway.
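One way to put a number on "the score drops at the anomaly" instead of reading it off a plot is to compare the mean score just before and just after each known anomaly position. The sketch below does that. The helper name score_change_at, the window size, and the synthetic series are assumptions for illustration; the real scores would come from whatever the model recorded at each step.

```python
def score_change_at(scores, index, window=10):
    """Mean score over `window` steps after `index` minus the mean over
    the `window` steps before it. Positive means the score rose.

    Hypothetical diagnostic helper; not part of htm.core.
    """
    before = scores[max(0, index - window):index]
    after = scores[index:index + window]
    if len(before) == 0 or len(after) == 0:
        raise ValueError("index is too close to the ends of the series")
    return sum(after) / len(after) - sum(before) / len(before)


# Synthetic example only: a score that falls at step 20 and rises at step 40.
synthetic = [0.5] * 20 + [0.1] * 20 + [0.8] * 20
for position in (20, 40):
    change = score_change_at(synthetic, position)
    direction = "rose" if change > 0 else "dropped"
    print(f"step {position}: score {direction} by {abs(change):.2f}")
```

Running the same check on the htm.core and NuPIC score series at the same anomaly positions would show whether the two disagree in sign or only in magnitude.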

Out of curiosity, I removed the date component from the data to see if that would have any effect. Surprisingly, it gave a better anomaly score, though it still has some funky patterns. [image]

romanma9999 commented 2 years ago

Hi @zrosin, did you manage to solve this, or eventually find out what the problem was? Was it a bad implementation of the AnomalyLikelihood class, as @ctrl-z-9000-times suggested, or something else?

zrosin commented 2 years ago

@romanma9999 I don’t have the expertise to actually find the problem here. The predictions work correctly, so HTM under the hood is running correctly. The anomaly reporting is the only problem. Whether this comes from the C++ part of the implementation or from the switch back to Python, I don’t know.

I ran the program with NuPIC just to sanity check myself, and that worked, so I just switched to using that.

If you are looking at trying to fix it, you may want to isolate the problem first: make sure the C++ side is working before trying to change the Python side.

breznak commented 2 years ago

@zrosin (sorry, I only read this thread briefly) I wanted to let you know about the recent rewrite of the anomaly likelihood by David: https://github.com/htm-community/htm.core/pull/958