htm-community / NAB

The Numenta Anomaly Benchmark
GNU Affero General Public License v3.0

Fixing spatial anomaly for htmcore WIP #15

Open breznak opened 5 years ago

breznak commented 5 years ago

So, we have a problem. Hopefully it is just that our combination of params for HTMcore is not optimal.

The Numenta detectors have an "artificial" way of detecting a "spatial anomaly" (a threshold detector on running upper/lower bounds).

If I disable this (fake) detector for our htmcore detector, the scores become pretty bad:

    "htmcore": {
-        "reward_low_FN_rate": 66.1922106936328,
-        "reward_low_FP_rate": 58.7930712907694,
-        "standard": 63.081419488725054
+        "reward_low_FN_rate": 28.85203529943721,
+        "reward_low_FP_rate": 23.591809889328452,
+        "standard": 26.783416390999154

This means most of the work in htmcore with these settings is done in the threshold detector.
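
To put a number on that drop, here is a minimal sketch that takes the scores from the diff above and computes the per-profile difference. The helper name `score_drop` is made up for illustration and is not part of NAB.

```python
# Scores copied from the diff above: with and without the "spatial anomaly" threshold detector.
with_spatial = {
    "reward_low_FN_rate": 66.1922106936328,
    "reward_low_FP_rate": 58.7930712907694,
    "standard": 63.081419488725054,
}
without_spatial = {
    "reward_low_FN_rate": 28.85203529943721,
    "reward_low_FP_rate": 23.591809889328452,
    "standard": 26.783416390999154,
}


def score_drop(before, after):
    """Return how many points each scoring profile lost (hypothetical helper, not a NAB API)."""
    return {profile: before[profile] - after[profile] for profile in before}


drops = score_drop(with_spatial, without_spatial)
for profile, lost in sorted(drops.items()):
    print(f"{profile}: -{lost:.2f} points")
```

Each profile loses roughly 35 to 37 points, which is more than half of its original score.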

Luckily, the Numenta detectors and settings do not suffer this drop when the spatial threshold is removed.

TL;DR: we must optimize our net without this "spatial anomaly" threshold-detector.

For #6

breznak commented 5 years ago

CC @ctrl-z-9000-times

breznak commented 5 years ago

~~One interesting fact, using synapse competition from https://github.com/htm-community/htm.core/pull/584 leads to better utilization of synapses (probably counters some ill combination of params) which leads to our results getting back to reasonable again.~~

HTMcore without "spatial anomaly" and with "synapse competition":

--- a/results/final_results.json
+++ b/results/final_results.json
@@ -15,9 +15,9 @@
"standard": 16.43666922426724
},
"htmcore": {
-        "reward_low_FN_rate": 66.1922106936328,
-        "reward_low_FP_rate": 58.7930712907694,
-        "standard": 63.081419488725054
+        "reward_low_FN_rate": 59.48811919632727,
+        "reward_low_FP_rate": 53.995064634304576,
+        "standard": 56.90459258759435
},

EDIT: OK, scratch that. I apparently ran NAB on the wrong branch; the results are still bad with syn_competition and without the spatial anomaly.

ctrl-z-9000-times commented 5 years ago

Yeah, I don't know what to say here. Numenta did several non-biological things in order to score higher on the NAB benchmark.

I think we should disregard both the spatial anomaly and the backtracking TM, since both are non-biological methods which improve scores on this particular benchmark (but probably not all benchmarks).

breznak commented 5 years ago

> Numenta did several non-biological things in order to score higher on the NAB benchmark. ... backtrackingTM

Yep, we're catching up to "HTM" with backTM. Even Numenta lists the numentaTM detector, which uses the biological TM, and our htmcore detector has results close to that one: https://github.com/htm-community/htm.core/issues/391

> disregard both the spatial anomaly

Definitely we should. I was really worried that even Numenta's detector suffers from the problem (most work done by the "spatial anomaly"). Fortunately it does not (so the NAB paper is valid!); unfortunately, our TM suffers without the spatial anomaly. For param optimization we must not use the "spatial anomaly". I'll adapt the PR not to use it in our detector (which is the correct approach). Not sure what to do with Numenta's; probably leave as is?

breznak commented 4 years ago

@ctrl-z-9000-times I'll try to revisit this again; the issue is a weird/severe performance regression. Please help me investigate if you have time, thanks.

breznak commented 4 years ago

Update: when running on only the artificial/synthetic labels, our results are quite good:

    python run.py -d htmcore --detect --score --optimize --normalize --windowsFile labels/synthetic.json -n 8

htmcore detector benchmark scores written to /mnt/store/devel/HTM/NAB/results/htmcore/htmcore_reward_low_FN_rate_scores.csv

Running score normalization step
Final score for 'htmcore' detector on 'standard' profile = 84.84
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 84.36
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 88.37
Final scores have been written to /mnt/store/devel/HTM/NAB/results/final_results.json.
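
For comparing runs, here is a minimal sketch of reading such a results file. It assumes the layout visible in the diffs earlier in this thread (detector name mapped to the three profile scores); the helper name `load_scores` is hypothetical and not part of NAB.

```python
import json

PROFILES = ("standard", "reward_low_FP_rate", "reward_low_FN_rate")


def load_scores(path, detector="htmcore"):
    """Read one detector's final scores from a NAB-style final_results.json.

    Hypothetical helper; assumes {"<detector>": {"<profile>": <score>, ...}, ...}.
    """
    with open(path) as f:
        results = json.load(f)
    # Keep only the three scoring profiles, in a fixed order.
    return {profile: results[detector][profile] for profile in PROFILES}
```

Calling `load_scores("results/final_results.json")` from the root of a NAB checkout would then return the three htmcore scores as a dict, assuming the file has the layout described above.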

breznak commented 4 years ago

@steinroe please have a look at #26 and we can merge it here.

psteinroe commented 4 years ago

FYI score using localAreaDensity without any optimization ;) We are getting there!

Final score for 'htmcore' detector on 'standard' profile = 66.30
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 56.11
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 72.08

breznak commented 4 years ago

> FYI score using localAreaDensity without any optimization ;) We are getting there!

Is this with the fake "spatial anomaly" on? Because we've been 'there' before. These problems start when we decide to do the right thing and avoid the fake helper. (Interestingly, Numenta does not suffer such a big loss when the spatial anomaly is disabled.)

psteinroe commented 4 years ago

> Is this with the fake "spatial anomaly" on?

Yes.

> These problems start when we decide to do the right thing and avoid the fake helper. (Interestingly, Numenta does not suffer such a big loss when the spatial anomaly is disabled.)

How do you propose to solve this? Should we continue optimising with the spatial anomaly on, or should we try to see why Numenta still seems to perform slightly better?

breznak commented 4 years ago

> How do you propose to solve this? Should we continue optimising

OK, maybe we can try spending some time optimizing with the "spatial anomaly" on, just to see if we can do better than the Numenta detectors. We're quite close now. That would be one result. (Note that the numenta detector, unlike numentaTM, uses another TM, BacktrackingTM, which we also removed as an unbiological hack.)

When we get some satisfactory scores (and enjoy some fame ;) ), I'd use that as a starting point for figuring out settings with the spatialAnomaly OFF.

This would give us 3 kinds of results:

psteinroe commented 4 years ago

Just in case you want to run the optimization with both params: you can use this Dockerfile to build the htm.core version with numActiveColumnsPerInhArea. Kinda hacky, but it does the job until we merge it (if we merge it).

FROM python:3.7

COPY . /NAB

WORKDIR /NAB

RUN pip install . --user --extra-index-url https://test.pypi.org/simple/
RUN pip uninstall --yes htm.core

WORKDIR /

RUN git clone https://github.com/htm-community/htm.core.git
WORKDIR /htm.core
RUN git checkout --track origin/sp_reintroduce_numActiveColumnsPerInhArea
RUN python -m pip install "cmake>=3.10"
RUN python setup.py install --user --force

WORKDIR /NAB

CMD python run.py -d htmcore --skipConfirmation --d

breznak commented 4 years ago

> You can use this Dockerfile to build the htm.core version with numActiveColumnsPerInhArea. Kinda hacky, but it does the job until we merge it (if we merge it).

OK, so the numActiveCols param does make sense. I'll try to merge it to simplify experimentation here, and we can eventually disable it again if it turns out not to be needed.

breznak commented 4 years ago

> Just in case you want to run the optimization with both params: you can use this Dockerfile to build the htm.core version with numActiveColumnsPerInhArea

The PR just landed in htmcore master.

breznak commented 4 years ago

Scores with random (seed=0), without any further optimizations

 "htmcore": {
-        "reward_low_FN_rate": 76.5626293570994,
-        "reward_low_FP_rate": 61.359926511549155,
-        "standard": 71.3094612770284
+        "reward_low_FN_rate": 73.96624112193469,
+        "reward_low_FP_rate": 59.253001477255935,
+        "standard": 68.40111896667753

breznak commented 4 years ago

Merged master. I hope nothing was lost; there were some larger chunks of params in conflict (due to the formatting), but it should be OK.

breznak commented 4 years ago

@Zbysekz the good news is that with a recent HTMcore build, I think this issue is fixed!! :+1: I'll need to investigate why/when, but I get results as good as Numenta's without the fake spatial anomalies. The bad news is that some indentation changes in a past PR confuse the git diff, and I'm failing to merge master into this PR. Could you please try? Otherwise I'll manually copy the HTM_PANDAVIS changes into this PR, and then we'll double-check together that everything works.

Zbysekz commented 3 years ago

@breznak OK, I'll take a look at this. The Panda remake is now "finished", so it now uses NetworkAPI only.

So the feature I implemented here has been dropped, and I'll try to remove it.

Zbysekz commented 3 years ago

I merged it and got:

Running score normalization step
Final score for 'htmcore' detector on 'standard' profile = 66.35
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 57.11
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 71.82
Final scores have been written to /home/ub/HTM/NAB/results/final_results.json.

Is this the "good" score?

I can merge it with this branch if you like. EDIT: I will commit it there.

Zbysekz commented 3 years ago

I don't have write permissions for this NAB repo.

The branch is there: https://github.com/Zbysekz/NAB/tree/fixing_spatial_anomaly

Zbysekz commented 3 years ago

Now I got even better scores (I assume it is because of seed==0):

Final score for 'htmcore' detector on 'standard' profile = 70.51
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 60.67
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 75.74

Zbysekz commented 3 years ago

The previous run was with spatial anomaly=True; with False it is:

Final score for 'htmcore' detector on 'standard' profile = 60.91
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 52.14
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 65.60

Not sure if it should be set to true or false in this branch?

breznak commented 3 years ago

> OK, I'll take a look at this. The Panda remake is now "finished", so it now uses NetworkAPI only.

Thank you :+1: Regarding NetworkAPI: does this mean we won't be able to use it with this detector, since the detector is not in NAPI?

> Is this the "good" score?

That was slightly below good :). I think I got the same scores as Numenta (hopefully with the correct settings too). But I used the current htmcore master build, not the latest tag from pip; master contains improvements that the tagged release does not.

> Not sure if it should be set to true or false in this branch?

We want spatial=False here, and everywhere.

breznak commented 3 years ago

@Zbysekz btw, the htmcore detector was not in your PR, so the conflict is still there

Zbysekz commented 3 years ago

> Regarding NetworkAPI: does this mean we won't be able to use it with this detector, since the detector is not in NAPI?

Yes, it would not be possible anymore. I couldn't keep Panda working this way. With NAPI, everything is more general and generated automatically, in contrast to manually handing over specific data and instances.

But I can rewrite this NAB detector to use NAPI; from what I saw, it should be quite simple.

Zbysekz commented 3 years ago

> btw, the htmcore detector was not in your PR, so the conflict is still there

Hmm, there is a bit of a complication: I wanted to revert the pandaVis changes, but these are not in this branch yet, which is why it was not reflected... I will start a new PR and we will just link it from here.

breznak commented 3 years ago

> But I can rewrite this NAB detector to use NAPI; from what I saw, it should be quite simple.

I was thinking about it... let's keep it as is for a while, and once we stabilize the (top) scores, we can rewrite it to NAPI. What I wonder about is the overhead of NAPI; for C++ -> Python it's about 30%.

Zbysekz commented 3 years ago

> What I wonder about is the overhead of NAPI; for C++ -> Python it's about 30%.

What do you mean by this overhead of 30%? Performance overhead, complexity, or line count?

breznak commented 3 years ago

> What do you mean by this overhead of 30%?

Compute speed overhead.

Zbysekz commented 3 years ago

> > What do you mean by this overhead of 30%?
>
> Compute speed overhead.

Hmm, yes, 30% overhead is good; I would have expected more.

Zbysekz commented 3 years ago

I am getting

Final score for 'htmcore' detector on 'standard' profile = 60.91
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 52.14
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 65.60

without spatial anomaly and with latest htm.core

gotham29 commented 1 year ago

> I am getting
>
> Final score for 'htmcore' detector on 'standard' profile = 60.91
> Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 52.14
> Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 65.60
>
> without spatial anomaly and with latest htm.core

Hi @Zbysekz!

Could I possibly see your implementation at all? I ask because I'm getting a NAB score of around 50 with my htm.core implementation (htm_streamer), and I'm itching to figure out the difference. I know it's been a while now, but just wondering!

Thanks,

Sam

gotham29 commented 1 year ago

> Scores with random (seed=0), without any further optimizations
>
>  "htmcore": {
> -        "reward_low_FN_rate": 76.5626293570994,
> -        "reward_low_FP_rate": 61.359926511549155,
> -        "standard": 71.3094612770284
> +        "reward_low_FN_rate": 73.96624112193469,
> +        "reward_low_FP_rate": 59.253001477255935,
> +        "standard": 68.40111896667753

Hey @breznak!

Could I possibly see your implementation that got these scores if you still have it?? I am trying to validate my own htm.core implementation (htm_streamer) on NAB and only scoring around 50 - and dying to get to the bottom of it. Thank you!

Sam

Zbysekz commented 1 year ago

> > I am getting
> >
> > Final score for 'htmcore' detector on 'standard' profile = 60.91
> > Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 52.14
> > Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 65.60
> >
> > without spatial anomaly and with latest htm.core
>
> Hi @Zbysekz!
>
> Could I possibly see your implementation at all? I ask because I'm getting a NAB score of around 50 with my htm.core implementation (htm_streamer), and I'm itching to figure out the difference. I know it's been a while now, but just wondering!
>
> Thanks,
>
> Sam

Hello Sam, I was just debugging the connection with pandaVis; I was not improving the NAB scoring. So my setup was just this branch, compiled.

What about this? "good as Numenta results without the fake spatial anomalies", as breznak writes in one comment. I think we discussed this somewhere, but it was so long ago. I am not active in this project right now. I'm trying to remember, but I think there was some mathematical simplification for calculating spatial anomalies.

Good luck

gotham29 commented 1 year ago

Hi @Zbysekz! Thanks for your reply! Do you mean that you might've added the spatial anomaly functionality to your htm.core implementation to score > 60? I see in the NumentaTM detector they declare a spatial anomaly for anything more than 5% above the max or below the min.
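
For readers following along, the check described above can be sketched roughly as below. This is an illustrative reimplementation, not the NAB source: the class name `RunningBoundsDetector` is made up, and measuring the 5% tolerance relative to the observed min-max range is an assumption.

```python
class RunningBoundsDetector:
    """Flags values that fall outside the running min/max by more than a tolerance.

    Illustrative sketch only; the names and the exact tolerance rule are assumptions.
    """

    def __init__(self, tolerance=0.05):
        self.tolerance = tolerance  # fraction of the observed range (assumed)
        self.min_val = None
        self.max_val = None

    def is_spatial_anomaly(self, value):
        anomaly = False
        # Only test once a non-degenerate range has been observed.
        if self.min_val is not None and self.min_val != self.max_val:
            margin = (self.max_val - self.min_val) * self.tolerance
            anomaly = value > self.max_val + margin or value < self.min_val - margin
        # Update the running bounds after the check, so the current value
        # is always compared against bounds from earlier data only.
        if self.max_val is None or value > self.max_val:
            self.max_val = value
        if self.min_val is None or value < self.min_val:
            self.min_val = value
        return anomaly


detector = RunningBoundsDetector()
flags = [detector.is_spatial_anomaly(v) for v in [10, 20, 20.2, 25, 9]]
print(flags)
```

In this sketch the first two values can never be flagged, because no range exists yet when they arrive.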

Zbysekz commented 1 year ago

> Hi @Zbysekz! Thanks for your reply! Do you mean that you might've added the spatial anomaly functionality to your htm.core implementation to score > 60? I see in the NumentaTM detector they declare a spatial anomaly for anything more than 5% above the max or below the min.

No, I just ran the latest htm.core with this branch (which is not merged to NAB master) with the spatial anomaly off. I wish I could help more.

gotham29 commented 1 year ago

> > Hi @Zbysekz! Thanks for your reply! Do you mean that you might've added the spatial anomaly functionality to your htm.core implementation to score > 60? I see in the NumentaTM detector they declare a spatial anomaly for anything more than 5% above the max or below the min.
>
> No, I just ran the latest htm.core with this branch (which is not merged to NAB master) with the spatial anomaly off. I wish I could help more.

All good man thank you! I now have 60+ NAB score when using spatial anomaly, which is comparable to NuPIC & htm.java so I'm happy! With that done I'm definitely curious to look into your visualization functionality too!

ctrl-z-9000-times commented 1 year ago

Hi @Zbysekz and @gotham29, If you're interested in working on NAB, I will give you write permissions to this repo.

gotham29 commented 1 year ago

Hi @ctrl-z-9000-times, yes thank you that'd be awesome! I'd love to have a look!

ctrl-z-9000-times commented 1 year ago

Ok @gotham29, you should now have write permission for this repo. Feel free to improve this project in whatever way you see fit.