htm-community / NAB

The Numenta Anomaly Benchmark
GNU Affero General Public License v3.0

optimization results #32

Closed psteinroe closed 4 years ago

psteinroe commented 4 years ago

I optimized overnight (using my MacBook, so not much processing power) with two processes in parallel - one using localAreaDensity and one using numActiveCols, with a fixed seed of 5.

Two things that I find interesting:

  1. numActiveCols seems to be superior. The max standard score was 71, while localAreaDensity never reached 70.
  2. When removing the fixed seed, the standard score drops to 67.88. It seems like randomness plays a big role. Adding seed as one of the params to optimize (instead of fixing it to 5 during the optimization) might further increase the score, but in my opinion this would be bad practice...
  3. Maybe, instead of returning the standard score during the optimization, we could try returning the mean of all three scores?

Edit: Results are

"reward_low_FN_rate": 76.5626293570994,
"reward_low_FP_rate": 61.359926511549155,
"standard": 71.3094612770284
breznak commented 4 years ago

These are very nice scores for HTMcore!!

Results are "reward_low_FN_rate": 76.5626293570994, "reward_low_FP_rate": 61.359926511549155, "standard": 71.3094612770284

Compared to (columns: standard / reward low FP / reward low FN):

| Detector | Standard | Reward low FP | Reward low FN |
| --- | --- | --- | --- |
| Numenta HTM* | 70.5-69.7 | 62.6-61.7 | 75.2-74.2 |
| Numenta HTM using NuPIC v0.5.6* | 70.1 | 63.1 | 74.3 |
| NumentaTM HTM* (aka our type of TM) | 64.6 | 56.7 | 69.2 |
| Numenta HTM*, no likelihood | 53.62 | 34.15 | 61.89 |

So we could say we're the winners now! :100: Best HTM model score on the NAB dataset. (*And we now have some new features up our sleeve that were held back because of the question "how does it affect performance?" - we can reliably answer that now!)

But...

numActiveCols seems to be superior. Max standard score was 71, while localAreaDensity never reached 70.

OK, but it's a close call - 1% shouldn't be that important. Also, as I understand it, this is just a 1-param optimization, right? I'll get your framework running, and then try running it on a cluster as well.

When removing the fixed seed, the standard score drops to 67.88. It seems like randomness plays a big role. Adding seed as one of the params to optimize (instead of fixing it to 5 during the optimization) might further increase the score, but in my opinion this would be bad practice...

This is a bad thing - it should never be so sensitive to the RNG seed! I'm wondering if the dataset is not good, being so sensitive to overfitting, or if there could be a bug in our algos that handles a fixed seed somehow differently. Curiosity: are the results good only with RNG seed "5"? Or do you get the same score for, say, 42? Or the same for 42 after re-tuning? But to conclude, we want results with a random seed (i.e. not specified, or set to the special value that means "completely random"). The scores might be worse, but the results would correspond to the general reality/performance on any dataset.

we could try to return the mean of all three scores?

TBH, I don't know exactly how the scores are computed, but the "standard" profile should be just that: some balance between low FP and low FN.
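For reference, the three NAB profiles differ only in their relative error weights. The numbers below are from memory of NAB's `config/profiles.json` and should be verified against the repo before relying on them:

```python
# Approximate NAB cost profiles (values recalled from NAB's
# config/profiles.json - treat these as an assumption, not a spec).
# Each profile weights true positives, false positives, false negatives.
profiles = {
    "standard":           {"tpWeight": 1.0, "fpWeight": 0.11, "fnWeight": 1.0},
    "reward_low_FP_rate": {"tpWeight": 1.0, "fpWeight": 0.22, "fnWeight": 1.0},
    "reward_low_FN_rate": {"tpWeight": 1.0, "fpWeight": 0.11, "fnWeight": 2.0},
}
```

So "standard" is indeed the balanced profile; the other two just penalize one error type more heavily.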

psteinroe commented 4 years ago

I will rerun the optimization tonight without setting the seed parameter to see how it influences the results. Numenta did set the seed to a fixed value for their models (check here), but I fully agree that this is bad practice.

Or if there could be a bug in our algos that handles fixed seed somehow differently.

We should check that setting no seed means using a random one for the TM and SP as well as the RDSE Encoder, to be sure.

breznak commented 4 years ago

I will rerun the optimization tonight without setting the seed parameter to see how it influences the results.

I want to play around, so I suggest we just merge the results and correct/update them later.

We should check if setting no seed means using a random one for TM, SP as well as the RSDE Encoder to be sure.

I can do that, but I'm quite sure it defaults to a random seed. What I'm wondering is why a fixed seed would have such a great effect - the sequence should still be pseudo-random. My intuition would be: say we're walking all cells/columns in a layer in a for-loop:
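To illustrate the seed sensitivity being discussed: with a fixed seed, the RNG draws consumed while iterating over columns are identical on every run, so the optimizer can overfit to that one trajectory. This is a stand-in sketch using Python's `random` module, not the actual htm.core code:

```python
import random

def pick_winner_columns(seed, n_columns=2048, n_active=40):
    # Stand-in for SP/TM tie-breaking that consumes the RNG while
    # walking all columns in a for-loop (NOT the real htm.core logic).
    rng = random.Random(seed) if seed is not None else random.Random()
    scores = [rng.random() for _ in range(n_columns)]
    # Keep the n_active highest-scoring columns.
    return sorted(range(n_columns), key=lambda c: scores[c], reverse=True)[:n_active]

# A fixed seed reproduces the exact same winners every run, so tuning
# other params against it fits that one deterministic trajectory.
fixed_a = pick_winner_columns(5)
fixed_b = pick_winner_columns(5)
```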

breznak commented 4 years ago

We should check that setting no seed means using a random one for the TM and SP as well as the RDSE Encoder, to be sure.

Turns out the default params were set to a fixed seed. I have a PR that changes that; I just need to iron out all the determinism tests.

A quick workaround would be to force seed=0 everywhere - that means a "random-seeded random".
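The workaround could look something like the sketch below. The parameter layout (`sp`/`tm`/`enc` keys) is hypothetical; the only assumption carried over from the discussion is that seed=0 is the special value meaning "seed the RNG randomly":

```python
def force_random_seed(params):
    # Recursively set every "seed" entry to 0, the special value that
    # the discussion above says means "random-seeded random".
    for key, value in params.items():
        if key == "seed":
            params[key] = 0
        elif isinstance(value, dict):
            force_random_seed(value)
    return params

# Hypothetical nested model params with seeds previously fixed to 5:
model_params = {
    "sp":  {"seed": 5, "numActiveColumnsPerInhArea": 40},
    "tm":  {"seed": 5},
    "enc": {"rdse": {"seed": 5}},
}
force_random_seed(model_params)
```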

psteinroe commented 4 years ago

we could use "random-seeded random"

This sounds like the right way to do it. Can we implement that? Or does it behave like that already?

I have a PR that changes that, just need to iron out all the determinism tests.

Very nice, thanks!!

A quick workaround would be to force seed=0 everywhere, that means random random.

Alright, I will do that for tonight's run.