Open breznak opened 5 years ago
CC @ctrl-z-9000-times
~~One interesting fact, using synapse competition from https://github.com/htm-community/htm.core/pull/584 leads to better utilization of synapses (probably counters some ill combination of params) which leads to our results getting back to reasonable again.~~
HTMcore without "spatial anomaly" and with "synapse competition":
--- a/results/final_results.json
+++ b/results/final_results.json
@@ -15,9 +15,9 @@
"standard": 16.43666922426724
},
"htmcore": {
- "reward_low_FN_rate": 66.1922106936328,
- "reward_low_FP_rate": 58.7930712907694,
- "standard": 63.081419488725054
+ "reward_low_FN_rate": 59.48811919632727,
+ "reward_low_FP_rate": 53.995064634304576,
+ "standard": 56.90459258759435
},
EDIT: Ok, scratch that. I apparently ran NAB on an incorrect branch, the results are still bad w/ syn_competition and w/o spatial anomaly.
Yea I don't know what to say here. Numenta did several non-biological things in order to score higher on the NAB benchmark.
I think we should disregard both the spatial anomaly and the backtracking TM, since both are non-biological methods which improve scores on this particular benchmark (but probably not all benchmarks).
> Numenta did several non-biological things in order to score higher on the NAB benchmark. ..backtrackingTM

yep, we're catching up to "HTM" with backTM; even Numenta lists the numentaTM detector, which uses the biological TM, and our htmcore detector has results close to that one:
https://github.com/htm-community/htm.core/issues/391
> disregard both the spatial anomaly

definitely we should. I was really worried that even Numenta's detector suffers from this problem (most of the work being done by the "spatial anomaly"). Fortunately it does not (so the NAB paper is valid!); unfortunately, our TM suffers w/o the spatial anomaly. For param optimization we must not use the "spatial anomaly". I'll adapt the PR not to use it in our detector (which is the correct approach). Not sure what to do with Numenta's, probably leave as is?
@ctrl-z-9000-times I'll try to revisit this again, the issue is a weird/severe performance regression. Please help me investigate if you have time, thanks
Update:
When running on only the artificial/synthetic labels, our results are quite good:
python run.py -d htmcore --detect --score --optimize --normalize --windowsFile labels/synthetic.json -n 8
htmcore detector benchmark scores written to /mnt/store/devel/HTM/NAB/results/htmcore/htmcore_reward_low_FN_rate_scores.csv
Running score normalization step
Final score for 'htmcore' detector on 'standard' profile = 84.84
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 84.36
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 88.37
Final scores have been written to /mnt/store/devel/HTM/NAB/results/final_results.json.
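For anyone wanting to reproduce a synthetic-only run like the one above, a reduced windows file can be derived from a full labels file. This is only a sketch: it assumes the windows file is a JSON object mapping a relative data path to a list of `[start, end]` timestamp pairs, and that the synthetic datasets are the ones whose path starts with `artificial`. The function name `filter_windows` and the sample entries are made up for illustration.

```python
import json


def filter_windows(windows, prefixes=("artificial",)):
    """Keep only entries whose relative data path starts with one of the prefixes."""
    return {
        path: spans
        for path, spans in windows.items()
        if path.startswith(tuple(prefixes))
    }


# Illustrative stand-in for a full windows file (paths and timestamps are made up).
all_windows = {
    "artificialWithAnomaly/example_a.csv": [["2014-04-10 07:15:00", "2014-04-11 16:45:00"]],
    "artificialNoAnomaly/example_b.csv": [],
    "realTraffic/example_c.csv": [["2015-09-01 00:00:00", "2015-09-02 00:00:00"]],
}

synthetic = filter_windows(all_windows)
print(json.dumps(synthetic, indent=2))
```

The filtered dictionary could then be written out with `json.dump` and passed to `--windowsFile`, as in the command above.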
@steinroe please have a look at #26 and we can merge it here.
FYI score using localAreaDensity without any optimization ;) We are getting there!
Final score for 'htmcore' detector on 'standard' profile = 66.30
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 56.11
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 72.08
> FYI score using localAreaDensity without any optimization ;) We are getting there!

is this with the fake "spatial anomaly" on? Because we've been 'there'. These problems start when we decide to do good and avoid the fake helper. (Interestingly, Numenta does not suffer such a big loss when the spatial anomaly is disabled.)
> is this with the fake "spatial anomaly" on?

Yes..
> These problems start when we decide to do good and avoid the fake helper. (Interestingly, Numenta does not suffer such a big loss when the spatial anomaly is disabled.)

How do you propose to solve this? Should we continue optimising with the spatial anomaly, or should we try to see why Numenta still seems to perform slightly better?
> What do you propose how to solve this? Should we continue with optimising

ok, maybe we can try spending some time optimizing with the "spatial anomaly", just to see if we can do better than the numenta detectors. We're quite close now. That would be one result.
(note that the numenta detector, not numentaTM, uses another TM: BacktrackingTM, which we removed too as an unbiological hack.)
When we get some satisfactory scores (we'll enjoy some fame ;) and then) I'd use that as a starting point for figuring out settings with the spatialAnomaly OFF.
This would give us 3 kinds of results:
Just in case you want to run the optimization with both params: you can use this Dockerfile to run the htm.core version with numActiveColumnsPerInhArea. Kinda hacky, but it does the job until we merge it (if we merge it).
FROM python:3.7
COPY . /NAB
WORKDIR /NAB
RUN pip install . --user --extra-index-url https://test.pypi.org/simple/
RUN pip uninstall --yes htm.core
WORKDIR /
RUN git clone https://github.com/htm-community/htm.core.git
WORKDIR /htm.core
RUN git checkout --track origin/sp_reintroduce_numActiveColumnsPerInhArea
RUN python -m pip install "cmake>=3.10"
RUN python setup.py install --user --force
WORKDIR /NAB
CMD python run.py -d htmcore --skipConfirmation --d
> You can use this Dockerfile for using the htm.core version with numActiveColumnsPerInhArea. Kinda hacky but does the job until we merged it (if we merge it).

Ok, so the numActiveCols param does make sense. I'll try to merge it, to simplify experimentation here, and eventually we can disable it again if it turns out not to be needed.
> Just in case you want to run the optimization with both params: You can use this Dockerfile for using the htm.core version with numActiveColumnsPerInhArea

The PR just landed in htmcore master.
Scores with random (seed=0), without any further optimizations
"htmcore": {
- "reward_low_FN_rate": 76.5626293570994,
- "reward_low_FP_rate": 61.359926511549155,
- "standard": 71.3094612770284
+ "reward_low_FN_rate": 73.96624112193469,
+ "reward_low_FP_rate": 59.253001477255935,
+ "standard": 68.40111896667753
Merged master. I hope nothing was lost; there were some larger chunks of params in conflict (due to the formatting), but it should be ok.
@Zbysekz the good news is that with a recent HTMcore build, I think this issue is fixed!! :+1: I will need to investigate why/when, but I get results as good as Numenta's without the fake spatial anomalies. The bad news is that some indentation changes in a past PR confuse the git diff, and I'm failing to merge master into this PR. Could you please try? Otherwise I'll manually copy the HTM_PANDAVIS changes into this PR, and then we'll double-check together that everything works.
@breznak OK, I'll take a look at this. The Panda remake is now "finished", so it now uses NetworkAPI only.
So the feature I implemented here is dropped; I'll try to remove it.
I merged it and got:
Running score normalization step
Final score for 'htmcore' detector on 'standard' profile = 66.35
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 57.11
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 71.82
Final scores have been written to /home/ub/HTM/NAB/results/final_results.json.
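When comparing several runs like the ones pasted in this thread, it can help to pull the numbers out of the console output instead of copying them by hand. A minimal sketch, assuming the "Final score for ..." lines keep exactly the wording shown above; the helper name `parse_final_scores` is hypothetical and not part of NAB.

```python
import re

# Matches lines such as:
# Final score for 'htmcore' detector on 'standard' profile = 66.35
SCORE_LINE = re.compile(
    r"Final score for '(?P<detector>[^']+)' detector "
    r"on '(?P<profile>[^']+)' profile = (?P<score>-?\d+(?:\.\d+)?)"
)


def parse_final_scores(log_text):
    """Return {detector: {profile: score}} for every score line found in the text."""
    scores = {}
    for match in SCORE_LINE.finditer(log_text):
        scores.setdefault(match["detector"], {})[match["profile"]] = float(match["score"])
    return scores


log = """\
Running score normalization step
Final score for 'htmcore' detector on 'standard' profile = 66.35
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 57.11
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 71.82
"""

scores = parse_final_scores(log)
print(scores)
```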
Is this the "good" score?
I can merge it with this branch if you like. EDIT: I will commit it there.
I don't have write permissions for this NAB repo.
The branch is there: https://github.com/Zbysekz/NAB/tree/fixing_spatial_anomaly
Now I got even better scores (I assume it is because of seed==0):
Final score for 'htmcore' detector on 'standard' profile = 70.51
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 60.67
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 75.74
The previous run was with spatial anomaly=True;
with False it is:
Final score for 'htmcore' detector on 'standard' profile = 60.91
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 52.14
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 65.60
Not sure if it should be set to true or false in this branch?
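To make the cost of switching the spatial anomaly off explicit, the two runs above can be compared per profile. The snippet only does arithmetic on the scores quoted in this comment; the variable and function names are made up for illustration.

```python
# Scores copied from the two runs above (spatial anomaly True vs. False).
with_spatial = {"standard": 70.51, "reward_low_FP_rate": 60.67, "reward_low_FN_rate": 75.74}
without_spatial = {"standard": 60.91, "reward_low_FP_rate": 52.14, "reward_low_FN_rate": 65.60}


def score_drop(before, after):
    """Per-profile difference before - after, rounded to two decimals."""
    return {profile: round(before[profile] - after[profile], 2) for profile in before}


drop = score_drop(with_spatial, without_spatial)
for profile, points in drop.items():
    print(f"{profile}: -{points:.2f} points")
```

On these numbers, every profile loses roughly 8.5 to 10 points without the spatial anomaly.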
> OK, i take a look at this. Panda remake is now "finished", so now it uses NetworkAPI only.

thank you :+1: ad NetworkAPI: Does it mean we won't be able to use it with this detector? Since the detector is not in NAPI.
> It is the "good" score?

that was slightly below good :), I think I got same-as-Numenta scores (hopefully with correct settings too). But I did use the current htmcore master build, not the latest tag from pip (the difference is in the improvements made since).
> Not sure if it should be set to true or false in this branch?

we want spatial=False here, and everywhere.
@Zbysekz btw, the htmcore detector was not in your PR, so the conflict is still there
> ad NetworkAPI: Does it mean we won't be able to use it with this detector? Since the detector is not in NAPI.

Yes, it would not be possible anymore. I couldn't keep Panda that way. With NAPI, everything is more general and generated automatically, in contrast to manually handing over specific data & instances.
But I can rewrite this NAB detector to use NAPI; from what I saw it should be quite simple.
> btw, the htmcore detector was not in your PR, so the conflict is still there

Hmm, there is a bit of a complication: I wanted to revert the pandaVis changes, but these are not in this branch yet, which is why it was not reflected... I will start a new PR and we will just link it from here.
> But i can rewrite this nab detector to use NAPI, what i saw it should be quite simple.

I was thinking about it... let's keep it as is for a while, and once we stabilize the (top) scores, we can rewrite it to NAPI. What I wonder about is the overhead of NAPI; for c++->py it's about 30%.
> What I wonder is overhead of the NAPI, for c++->py it's about 30%.

What do you mean by this overhead of 30%? Performance overhead, complexity, or line count?
> What you mean by this overhead of 30% ?

compute speed overhead.
Hmm yes, 30% overhead is good, I would have expected more.
I am getting
Final score for 'htmcore' detector on 'standard' profile = 60.91
Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 52.14
Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 65.60
without spatial anomaly and with latest htm.core
> I am getting
> Final score for 'htmcore' detector on 'standard' profile = 60.91
> Final score for 'htmcore' detector on 'reward_low_FP_rate' profile = 52.14
> Final score for 'htmcore' detector on 'reward_low_FN_rate' profile = 65.60
> without spatial anomaly and with latest htm.core

Hi @Zbysekz!
Could I possibly see your implementation at all? I ask because I'm getting a NAB score of around 50 with my htm.core implementation (htm_streamer), and I'm itching to figure out the difference. I know it's been a while now but just wondering!
Thanks,
Sam
> Scores with random (seed=0), without any further optimizations
> "htmcore": {
> \- "reward_low_FN_rate": 76.5626293570994,
> \- "reward_low_FP_rate": 61.359926511549155,
> \- "standard": 71.3094612770284
> \+ "reward_low_FN_rate": 73.96624112193469,
> \+ "reward_low_FP_rate": 59.253001477255935,
> \+ "standard": 68.40111896667753

Hey @breznak!
Could I possibly see your implementation that got these scores if you still have it?? I am trying to validate my own htm.core implementation (htm_streamer) on NAB and only scoring around 50 - and dying to get to the bottom of it. Thank you!
Sam
Hello Sam, I was just debugging the connection with pandaVis; I was not improving the NAB scoring. So my setup was just this branch compiled.
What about this: "good as Numenta results without the fake spatial anomalies", as Breznak writes in one comment? I think we discussed this somewhere, but it was so long ago. I am not active in this project right now. Trying to remember, but I think there was some mathematical simplification for calculating spatial anomalies.
Good luck
Hi @Zbysekz! Thanks for your reply! Do you mean that you might've added the spatial anomaly functionality to your htm.core implementation to score > 60? I see in the NumentaTM detector they declare a spatial anomaly for anything more than 5% above the max or below the min.
No, I just ran the latest htm.core with this branch (which is not merged to NAB master) with the spatial anomaly off. I wish I could help more.
All good man thank you! I now have 60+ NAB score when using spatial anomaly, which is comparable to NuPIC & htm.java so I'm happy! With that done I'm definitely curious to look into your visualization functionality too!
Hi @Zbysekz and @gotham29, If you're interested in working on NAB, I will give you write permissions to this repo.
Hi @ctrl-z-9000-times, yes thank you that'd be awesome! I'd love to have a look!
Ok @gotham29, you should now have write permission for this repo. Feel free to improve this project in whatever way you see fit.
So, we have a problem... hopefully it is just that the combination of params for HTMcore is not optimal.
The Numenta detectors have an "artificial" way of detecting a "spatial anomaly" (which is a running upper/lower bounds threshold detector).
If I disable this (fake) detector for our htmcore, the scores become pretty bad. This means most of the work in htmcore with these settings is done in the threshold detector.
Luckily, the numenta detectors & settings do not suffer this drop when the spatial threshold is removed.
TL;DR: we must optimize our net without this "spatial anomaly" threshold-detector.
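For readers unfamiliar with the "spatial anomaly" trick, here is a rough sketch of what a running upper/lower bounds threshold detector looks like. It is not copied from the Numenta or htmcore detectors: the class name is made up, the 5% figure comes from the description elsewhere in this thread, and applying it as a fraction of the observed range is an assumption.

```python
class SpatialThreshold:
    """Flags values that fall outside the range seen so far, widened by a tolerance."""

    def __init__(self, tolerance=0.05):
        self.tolerance = tolerance  # assumed: fraction of the observed range
        self.min_val = None
        self.max_val = None

    def is_anomalous(self, value):
        anomalous = False
        # Only check once a non-degenerate range has been observed.
        if self.min_val is not None and self.min_val != self.max_val:
            margin = (self.max_val - self.min_val) * self.tolerance
            anomalous = value > self.max_val + margin or value < self.min_val - margin
        # Update the running bounds after the check.
        if self.max_val is None or value > self.max_val:
            self.max_val = value
        if self.min_val is None or value < self.min_val:
            self.min_val = value
        return anomalous


detector = SpatialThreshold()
stream = [10.0, 12.0, 11.0, 12.5, 30.0, 12.0]
flags = [detector.is_anomalous(v) for v in stream]
print(flags)
```

Note that nothing here involves the HTM model at all: the flag depends only on the raw value and the bounds seen so far, which is why scores that depend heavily on it say little about the TM itself.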
For #6