numenta / nupic-legacy

Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.
http://numenta.org/
GNU Affero General Public License v3.0

NontemporalClassifier learns even when field to be predicted is None #2117

Open mj-harvey opened 9 years ago

mj-harvey commented 9 years ago

Hi,

I've constructed a spatial classifier following the wiki instructions (https://github.com/numenta/nupic/wiki/Spatial-Classification). It's a trivial binary classifier intended to discriminate between even and odd numbers. I train it with:

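from numpy.random import random_integers  # assumption: the import was omitted from the snippet

for r in range(10000):
    ip = random_integers(low=0, high=99)  # random integer in [0, 99], inclusive
    out = "EVEN"
    if ip % 2:
        out = "ODD"
    res = model.run({"letter": "A", "number": ip, "categoryx": out})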

"letter" and "number" are the inputs. "categoryx" is the classification. After training, I then perform repeated classification of a specific input with

for z in range(100):
    res = model.run({"letter": "A", "number": 37, "categoryx": None})
    print res.inferences["multiStepBestPredictions"][0], res.inferences

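It's clear from the output that the model keeps learning during classification: the "None" value shows up as a new category (0) whose probability grows with every call until it displaces EVEN and ODD: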

 === CLASSIFY
A37 EVEN {'multiStepPredictions': {0: {'EVEN': 0.5197737848203754, 'ODD': 0.48022621517962466}}, 'multiStepBestPredictions': {0: 'EVEN'}, 'anomalyScore': None}
A37 0 {'multiStepPredictions': {0: {0: 0.66099055865179401, 'EVEN': 0.1889824681780789, 'ODD': 0.15002697317012725}}, 'multiStepBestPredictions': {0: 0}, 'anomalyScore': None}
A37 0 {'multiStepPredictions': {0: {0: 0.79345635419945659, 'EVEN': 0.11585672482639273, 'ODD': 0.090686920974150681}}, 'multiStepBestPredictions': {0: 0}, 'anomalyScore': None}
A37 0 {'multiStepPredictions': {0: {0: 0.85400649584353638, 'EVEN': 0.082147600691117525, 'ODD': 0.063845903465345938}}, 'multiStepBestPredictions': {0: 0}, 'anomalyScore': None}
....
A37 0 {'multiStepPredictions': {0: {0: 0.99997696820837501}}, 'multiStepBestPredictions': {0: 0}, 'anomalyScore': None}
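The drift in these probabilities is what one would expect from any classifier that keeps updating its statistics on a placeholder label. The sketch below is only an illustration of that effect, not NuPIC's classifier: `CountingClassifier` and its `learn` flag are hypothetical names, and the model simply predicts label frequencies per input.

```python
from collections import Counter


class CountingClassifier(object):
    """Toy stand-in (hypothetical, NOT NuPIC's classifier).

    Predicts the relative frequency of each label seen with an input.
    """

    def __init__(self):
        self.counts = {}  # input key -> Counter of labels seen with that key

    def run(self, key, label, learn=True):
        seen = self.counts.setdefault(key, Counter())
        if learn:
            # Every label is recorded, including a placeholder such as 0.
            seen[label] += 1
        total = float(sum(seen.values()))
        if not total:
            return {}
        return dict((lab, n / total) for lab, n in seen.items())


def train(clf):
    # Training phase: 37 is always labelled ODD.
    for _ in range(100):
        clf.run(37, "ODD")


# Case 1: "classification" calls still learn, so the placeholder takes over.
learning_clf = CountingClassifier()
train(learning_clf)
for _ in range(900):
    drifted = learning_clf.run(37, 0)

# Case 2: the same calls with learning switched off leave the model intact.
frozen_clf = CountingClassifier()
train(frozen_clf)
for _ in range(900):
    frozen = frozen_clf.run(37, 0, learn=False)

print(drifted)
print(frozen)
```

In the first case the placeholder label ends up with most of the probability mass, much like category 0 in the log above; in the second case the trained distribution is unchanged. Whether NuPIC's classifier region behaves the same way internally is an assumption here, not something the log proves.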

There's a minimal reproducer in: https://github.com/mj-harvey/nupic-noodle/tree/master/test2 where you'll be able to see the model_params: https://github.com/mj-harvey/nupic-noodle/blob/master/test2/model_params.py

Turning up the encoder debug verbosity, I see the following during training:

RecordSensor got data: {'categoryx': 'EVEN', '_category': None, 'number': 82, '_sequenceId': 0, 'letter': 'A', '_reset': 0}

   2309: array('c', '.......................................................................................................................................................................................................................................................................................................................*********************................................................................') |
     nz: (21) [311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328
 329 330 331]
  encIn: number:82.00
   data: {'categoryx': 'EVEN', '_category': None, 'number': 82, '_sequenceId': 0, 'letter': 'A', '_reset': 0}
decoded: number:[82.10]

The "_category" field is None, which looks suspicious and suggests that the "categoryx" field is not being correctly assigned as the classifier output.

Any suggestions? I'm quite prepared to believe I've done something wrong in the code but, if so, the documentation is out of date.
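One experiment that might help separate the two explanations (learning left on during classification versus a mis-assigned classifier field) is to switch learning off around the classification calls. The sketch below assumes the OPF model object exposes `disableLearning()` and `enableLearning()`; `classify_without_learning` and `FakeModel` are hypothetical names, and the fake model exists only so the snippet runs without NuPIC installed. It is not a confirmed fix for this issue.

```python
def classify_without_learning(model, record):
    """Run one record with learning disabled, then switch learning back on.

    Assumes `model` provides disableLearning(), enableLearning() and run().
    Learning is re-enabled unconditionally, so this only suits models that
    were learning before the call.
    """
    model.disableLearning()
    try:
        return model.run(record)
    finally:
        model.enableLearning()


class FakeModel(object):
    """Hypothetical stand-in for an OPF model, for demonstration only."""

    def __init__(self):
        self.learning = True
        self.calls = []  # (number, learning flag at the time of the call)

    def disableLearning(self):
        self.learning = False

    def enableLearning(self):
        self.learning = True

    def run(self, record):
        self.calls.append((record["number"], self.learning))
        return record


fake = FakeModel()
result = classify_without_learning(
    fake, {"letter": "A", "number": 37, "categoryx": None})
```

If the predictions for 37 stay at EVEN/ODD with learning disabled, the drift comes from learning on the None label; if category 0 still appears, the field assignment itself is the more likely culprit.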

rhyolight commented 9 years ago

@breznak What do you think about this?

mj-harvey commented 9 years ago

Hi, would it be more appropriate to post this as a question on the mailing list, rather than as a github issue?

rhyolight commented 9 years ago

@mj-harvey If you think the current behavior is incorrect, an issue is appropriate. If you're not sure, or just wondering what is going on, the mailing list is better.