numenta / nupic-legacy

Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.
http://numenta.org/
GNU Affero General Public License v3.0
6.33k stars 1.56k forks source link

Anomalylikelihood write method throws KeyError distributionParams vs distribution #3783

Closed ghost closed 6 years ago

ghost commented 6 years ago
Traceback (most recent call last):
  File "run.py", line 21, in <module>
    model.write(builder)
  File "/usr/local/lib/python2.7/site-packages/nupic/algorithms/anomaly_likelihood.py", line 321, in write
    proto.distribution.name = self._distribution["distributionParams"]["name"]
KeyError: 'distributionParams'

When I log this method, I can see the following:


{'distribution': {'mean': 0.66554064258358092,
                  'name': 'normal',
                  'stdev': 0.02710835571806828,
                  'variance': 0.00073486294973732511},
 'historicalLikelihoods': [0.08321225454772424,
                           0.42466239667745609,
                           0.41143882167909385,
                           0.38278323468141645,
                           0.097795022725065406,
                           0.42578440342970869,
                           0.47160308710543392,
                           0.22586380006966755,
                           0.43450092587880484,
                           0.29902835722605992],
 'movingAverage': {'historicalValues': [0.65061007991758246,
                                        0.92661753892022314,
                                        0.5091802474129109,
                                        0.52021695705560211,
                                        0.54556075732431397,
                                        0.91294922102966003,
                                        0.5928150752511081,
                                        0.5054987548601162,
                                        0.78161115646868851,
                                        0.56743353987350631],
                   'total': 6.5124933281137114,
                   'windowSize': 10}}

In the proto object the defined key is "distribution" too, not "distributionParams".

The capnproto schema also uses Float32 (a float) but the incoming python "float" is actually a double. (Should this be Float64?)

@0x8602f38429407eb0;

struct AnomalyLikelihoodRegionProto {
  iteration @0 :UInt64;
  historicalScores @1 :List(Score);
  distribution @2 :Distribution;
  probationaryPeriod @3 :UInt32;
  learningPeriod @4 :UInt32;
  reestimationPeriod @5 :UInt32;
  historicWindowSize @6 :UInt32;

  struct Score {
    value @0 :Float32;
    anomalyScore @1 :Float32;
  }

  struct Distribution {
    name @0 :Text;
    mean @1 :Float32;
    variance @2 :Float32;
    stdev @3 :Float32;
    movingAverage @4 :MovingAverage;
    historicalLikelihoods @5 :List(Float32);

    struct MovingAverage {
      windowSize @0 :UInt64;
      historicalValues @1 :List(Float32);
      total @2 :Float32;
    }
  }
}
rhyolight commented 6 years ago

@subutai Can you take a look at this issue please just to make sure it makes sense?

subutai commented 6 years ago

It makes sense but should be handled by someone who understands the Capn Proto implementation. I also agree with @kyle-sorensen that the Float32's should really be Float64's.

To prevent future occurrences, sounds like we need a test that writes and loads an anomaly model??

lscheinkman commented 6 years ago

@rhyolight, @subutai I can take a look.

lscheinkman commented 6 years ago

@rhyolight I've fixed the serialization test to include the distribution (See #3786). @kyle-sorensen PR #3784 fixes this issue, I ran this PR with the new test to validate.