numenta / nupic-legacy

Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.
http://numenta.org/
GNU Affero General Public License v3.0
6.33k stars 1.56k forks

OPF model takes too much memory #3781

Closed z1q1q7 closed 6 years ago

z1q1q7 commented 6 years ago

I'm using nupic.frameworks.opf.modelfactory to train on HTTP status-code data (trying to find anomalous cases). I have 700 time series and NuPIC takes 60 GB of memory. Is this normal, or is there a distributed solution?

rhyolight commented 6 years ago

That is a lot of memory. Seems fishy. How big are the cell structures you're creating? Please post your model parameters.

z1q1q7 commented 6 years ago

Thanks a lot for your reply. My model params are as follows:

MODEL_PARAMS = {
    'model': "CLA",
    'version': 1,
    'aggregationInfo': {
        'days': 0,
        'fields': [(u'c1', 'sum'), (u'c0', 'first')],
        'hours': 1,
        'microseconds': 0,
        'milliseconds': 0,
        'minutes': 0,
        'months': 0,
        'seconds': 0,
        'weeks': 0,
        'years': 0,
    },
    'predictAheadTime': None,
    'modelParams': {
        'inferenceType': 'TemporalAnomaly',
        'sensorParams': {
            'verbosity' : 0,
            'encoders': {
                u'timestamp_timeOfDay': {
                        'fieldname': u'timestamp',
                        'name': u'timestamp_timeOfDay',
                        'timeOfDay': (21, 9.5),
                        'type': 'DateEncoder'
                },
                u'timestamp_dayOfWeek': None,
                u'timestamp_weekend': None,
                u'ratio':    {
                    'clipInput': True,
                    'fieldname': u'ratio',
                    'n': 50,
                    'name': u'ratio',
                    'type': 'AdaptiveScalarEncoder',
                    'w': 21
                },
            },
            'sensorAutoReset' : None,
        },
        'spEnable': True,
        'spParams': {
            'spVerbosity' : 0,
            'spatialImp' : 'cpp',
            'globalInhibition': 1,
            'columnCount': 2048,
            'inputWidth': 0,
            'numActiveColumnsPerInhArea': 40,
            'seed': 1956,
            'potentialPct': 0.8,
            'synPermConnected': 0.1,
            'synPermActiveInc': 0.0001,
            'synPermInactiveDec': 0.0005,
        },
        'tpEnable' : True,
        'tpParams': {
            'verbosity': 0,
            'columnCount': 2048,
            'cellsPerColumn': 32,
            'inputWidth': 2048,
            'seed': 1960,
            'temporalImp': 'cpp',
            'newSynapseCount': 20,
            'maxSynapsesPerSegment': 32,
            'maxSegmentsPerCell': 128,
            'initialPerm': 0.21,
            'permanenceInc': 0.1,
            'permanenceDec' : 0.1,
            'globalDecay': 0.0,
            'maxAge': 0,
            'minThreshold': 9,
            'activationThreshold': 12,
            'outputType': 'normal',
            'pamLength': 3,
        },
        'clEnable': False,
        'clParams': None,
        'anomalyParams': {
            u'anomalyCacheRecords': None,
            u'autoDetectThreshold': None,
            u'autoDetectWaitRecords': 2184,
        },
        'trainSPNetOnlyIfRequested': False,
    },
}

rhyolight commented 6 years ago

The SP inputWidth should not be 0. Are you setting it programmatically? It should be the size of the input array.

z1q1q7 commented 6 years ago

I set it manually. Sorry, I'm not familiar with these params. I'll do some research on them. Thanks for your reply.

rhyolight commented 6 years ago

Count how many bits are in the input encoding before you send it to the SP. That is the inputWidth.
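To make this concrete, here is a hedged sketch of that count. For a scalar encoder, `n` is its output width (50 for the `ratio` AdaptiveScalarEncoder above); the DateEncoder's timeOfDay width is derived from its `(w, radius)` tuple, so with NuPIC installed you would read it from the instantiated encoder (e.g. via its `getWidth()` method) rather than computing it by hand. The timeOfDay value below is an illustrative placeholder, not a computed NuPIC figure.

```python
# inputWidth must equal the total number of bits produced by all enabled
# encoders. Without NuPIC available, the idea reduces to summing the
# per-encoder output widths.
encoder_widths = {
    "timestamp_timeOfDay": 54,  # illustrative placeholder; read encoder.getWidth() in NuPIC
    "ratio": 50,                # 'n' of the AdaptiveScalarEncoder above
}

input_width = sum(encoder_widths.values())
print(input_width)  # 104 with these assumed widths
```

Whatever the exact per-encoder numbers turn out to be, that sum (not 0) is what belongs in `spParams['inputWidth']`.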

z1q1q7 commented 6 years ago

Hi rhyolight, it seems that the OPF model takes more and more memory as it runs (as data is fed to it). Could there be a memory leak?

z1q1q7 commented 6 years ago

I also have a question about the nyctaxi anomaly detection in examples/opf/clients. When I run python nyctaxi_anomaly.py, the following error occurs: ERR: Unknown parameter 'boostStrength' for region 'SP' of type 'py.SPRegion'

So I commented out this line in model_params.py:

"boostStrength": 0.0,

Then the program runs normally; however, the result differs from the README, which says: The five anomalies occur at the following timestamps, [ "2014-11-01 19:00:00", "2014-11-27 15:30:00", "2014-12-25 15:00:00", "2015-01-01 01:00:00", "2015-01-27 00:00:00" ]

I checked the 5 timestamps above; my results are as follows:

2014-11-01 19:00:00,28398.0,0.150
2014-11-27 15:30:00,15255.0,0.125
2014-12-25 15:00:00,12039.0,0.075
2015-01-01 01:00:00,30236.0,0.875
2015-01-27 00:00:00,109.0,0.125

As you can see, no anomaly score exceeds 0.9. Is this because of the param I commented out?

Hope for your answer.

rhyolight commented 6 years ago

Regarding memory: the model will increase in size until it reaches its limit of segments and synapses, then it will stop growing.

Your other problem is because the example code needs to be updated. I've created an issue: https://github.com/numenta/nupic/issues/3802

rhyolight commented 6 years ago

> Your other problem is because the example code needs to be updated. I've created an issue: #3802

Whoops, I was wrong about that. Make sure you have NuPIC 1.0 or greater installed.
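Per the comment above, the "Unknown parameter 'boostStrength'" error points at a pre-1.0 install. With NuPIC present you would typically compare `nupic.__version__`; here is a minimal standalone sketch of that check (it assumes plain dotted numeric versions with no pre-release tags):

```python
# Compare dotted version strings numerically, e.g. "1.0.3" >= "1.0.0".
def at_least(installed, required):
    to_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return to_tuple(installed) >= to_tuple(required)

print(at_least("0.6.0", "1.0.0"))  # False: too old to know 'boostStrength'
print(at_least("1.0.3", "1.0.0"))  # True
```

You can also check the installed version from the shell with pip show nupic.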