calumroy / HTM

predictiveCells Calculator profiling #25

Open calumroy opened 7 years ago

calumroy commented 7 years ago

The current calculator that computes the predictive cells is a plain NumPy implementation. Its updatePredictiveState function uses close to half of the total calculation time for an HTM step: 0.122 s of 0.288 s (about 42%) in the profile below.

Number of TimeSteps=5
------------------------------------------
NEW TimeStep
PART 1 Update Input
PART 2 Update HTM
         233216 function calls in 0.288 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.288    0.288 HTM_network.py:1002(spatialTemporal)
        1    0.000    0.000    0.000    0.000 HTM_network.py:1105(updateHTMInput)
        1    0.000    0.000    0.288    0.288 HTM_network.py:1173(spatialTemporal)
        3    0.000    0.000    0.000    0.000 HTM_network.py:599(getPotentialOverlaps)
        3    0.000    0.000    0.000    0.000 HTM_network.py:662(updateInput)
        3    0.000    0.000    0.000    0.000 HTM_network.py:673(updateOutput)
        3    0.000    0.000    0.018    0.006 HTM_network.py:716(Overlap)
        3    0.000    0.000    0.019    0.006 HTM_network.py:738(inhibition)
        3    0.000    0.000    0.038    0.013 HTM_network.py:754(spatialLearning)
        3    0.000    0.000    0.147    0.049 HTM_network.py:765(sequencePooler)
        3    0.000    0.000    0.016    0.005 HTM_network.py:777(calcActiveCells)
        3    0.000    0.000    0.122    0.041 HTM_network.py:797(calcPredictCells)
        3    0.000    0.000    0.009    0.003 HTM_network.py:809(sequenceLearning)
        3    0.000    0.000    0.065    0.022 HTM_network.py:823(temporalPooler)
        1    0.000    0.000    0.000    0.000 HTM_network.py:951(updateRegionInput)
        4    0.000    0.000    0.000    0.000 _methods.py:37(_any)
       20    0.000    0.000    0.000    0.000 arraypad.py:101(<genexpr>)
       20    0.000    0.000    0.000    0.000 arraypad.py:1069(<genexpr>)
        2    0.000    0.000    0.000    0.000 arraypad.py:1072(_validate_lengths)
        8    0.000    0.000    0.000    0.000 arraypad.py:111(_append_const)
        2    0.000    0.000    0.000    0.000 arraypad.py:1117(pad)
       20    0.000    0.000    0.000    0.000 arraypad.py:135(<genexpr>)
        8    0.000    0.000    0.000    0.000 arraypad.py:77(_prepend_const)
        4    0.000    0.000    0.000    0.000 arraypad.py:989(_normalize_shape)
       15    0.000    0.000    0.002    0.000 cc.py:1525(__call__)
        2    0.000    0.000    0.000    0.000 fromnumeric.py:2767(round_)
        8    0.000    0.000    0.000    0.000 fromnumeric.py:43(_wrapit)
        8    0.000    0.000    0.000    0.000 fromnumeric.py:823(argsort)
       24    0.014    0.001    0.018    0.001 function_module.py:482(__call__)
        3    0.000    0.000    0.000    0.000 link.py:324(__get__)
        3    0.000    0.000    0.000    0.000 np_activeCells.py:213(getCurrentLearnCellsList)
        3    0.000    0.000    0.000    0.000 np_activeCells.py:221(getActiveCellsList)
        3    0.000    0.000    0.000    0.000 np_activeCells.py:225(getSegUpdates)
       96    0.005    0.000    0.005    0.000 np_activeCells.py:230(findNumSegs)
       32    0.000    0.000    0.001    0.000 np_activeCells.py:245(getSegmentActiveSynapses)
       32    0.000    0.000    0.010    0.000 np_activeCells.py:266(getBestMatchingCell)
       32    0.001    0.000    0.001    0.000 np_activeCells.py:334(newRandomPrevActiveSynapses)
      131    0.001    0.000    0.001    0.000 np_activeCells.py:359(findLeastUsedSeg)
        4    0.000    0.000    0.000    0.000 np_activeCells.py:385(checkColBursting)
        4    0.000    0.000    0.000    0.000 np_activeCells.py:412(findLearnCell)
      108    0.000    0.000    0.000    0.000 np_activeCells.py:421(setActiveCell)
       36    0.000    0.000    0.000    0.000 np_activeCells.py:433(setLearnCell)
      362    0.000    0.000    0.000    0.000 np_activeCells.py:445(checkCellActive)
        4    0.000    0.000    0.000    0.000 np_activeCells.py:458(checkCellLearn)
       96    0.000    0.000    0.000    0.000 np_activeCells.py:468(checkCellPredicting)
     1920    0.005    0.000    0.006    0.000 np_activeCells.py:495(segmentNumSynapsesActive)
      192    0.001    0.000    0.007    0.000 np_activeCells.py:521(getBestMatchingSegment)
        3    0.000    0.000    0.004    0.001 np_activeCells.py:552(updateActiveCellScores)
        3    0.001    0.000    0.016    0.005 np_activeCells.py:582(updateActiveCells)
      496    0.018    0.000    0.018    0.000 np_inhibition.py:270(calcualteInhibition)
        3    0.001    0.000    0.019    0.006 np_inhibition.py:333(calculateWinningCols)
    14090    0.032    0.000    0.034    0.000 np_learning.py:67(updatePermanence)
        3    0.004    0.001    0.038    0.013 np_learning.py:78(updatePermanenceValues)
        3    0.000    0.000    0.000    0.000 np_predictCells.py:117(getActiveSegTimes)
        3    0.000    0.000    0.000    0.000 np_predictCells.py:122(getSegUpdates)
      960    0.001    0.000    0.001    0.000 np_predictCells.py:177(checkCellActive)
    30690    0.084    0.000    0.091    0.000 np_predictCells.py:190(segmentNumSynapsesActive)
        3    0.031    0.010    0.122    0.041 np_predictCells.py:210(updatePredictiveState)
       64    0.001    0.000    0.001    0.000 np_sequenceLearning.py:101(updateCurrentSegSyn)
       32    0.000    0.000    0.002    0.000 np_sequenceLearning.py:137(adaptSegments)
     6174    0.006    0.000    0.006    0.000 np_sequenceLearning.py:168(checkCellTime)
        3    0.002    0.001    0.009    0.003 np_sequenceLearning.py:182(sequenceLearning)
       32    0.001    0.000    0.001    0.000 np_sequenceLearning.py:78(addNewSegSyn)
       19    0.000    0.000    0.000    0.000 np_temporal.py:116(setLearnCell)
     6138    0.007    0.000    0.007    0.000 np_temporal.py:126(checkCellPredict)
     4092    0.002    0.000    0.011    0.000 np_temporal.py:139(checkCellActivePredict)
    13887    0.014    0.000    0.037    0.000 np_temporal.py:149(checkColBursting)
        2    0.000    0.000    0.001    0.000 np_temporal.py:280(getPrev2NewLearnCells)
        3    0.006    0.002    0.043    0.014 np_temporal.py:365(updateProximalTempPool)
        3    0.005    0.002    0.022    0.007 np_temporal.py:428(updateDistalTempPool)
       22    0.000    0.000    0.000    0.000 np_temporal.py:84(checkCellLearn)
    33964    0.028    0.000    0.028    0.000 np_temporal.py:94(checkCellActive)
        2    0.000    0.000    0.000    0.000 numeric.py:141(ones)
       26    0.000    0.000    0.001    0.000 numeric.py:406(asarray)
        6    0.000    0.000    0.000    0.000 numeric.py:476(asanyarray)
        9    0.000    0.000    0.000    0.000 numeric.py:79(zeros_like)
       15    0.000    0.000    0.002    0.000 op.py:742(rval)
      320    0.000    0.000    0.001    0.000 random.py:293(sample)
       12    0.000    0.000    0.001    0.000 safe_asarray.py:12(_asarray)
        1    0.000    0.000    0.000    0.000 sdrFunctions.py:29(joinInputArrays)
        6    0.000    0.000    0.000    0.000 shape_base.py:113(atleast_3d)
        2    0.000    0.000    0.000    0.000 shape_base.py:319(dstack)
        3    0.000    0.000    0.000    0.000 theano_overlap.py:304(checkNewInputParams)
        2    0.000    0.000    0.000    0.000 theano_overlap.py:314(addPaddingToInput)
        3    0.000    0.000    0.002    0.001 theano_overlap.py:458(addVectTieBreaker)
        3    0.000    0.000    0.005    0.002 theano_overlap.py:463(maskTieBreaker)
        3    0.000    0.000    0.001    0.000 theano_overlap.py:476(getColInputs)
        3    0.000    0.000    0.000    0.000 theano_overlap.py:522(getPotentialOverlaps)
        3    0.000    0.000    0.016    0.005 theano_overlap.py:528(calculateOverlap)
        3    0.000    0.000    0.002    0.001 theano_overlap.py:564(removeSmallOverlaps)
       36    0.000    0.000    0.000    0.000 type.py:385(<lambda>)
       36    0.000    0.000    0.001    0.000 type.py:67(filter)
        3    0.000    0.000    0.002    0.001 vm.py:204(__call__)
       15    0.002    0.000    0.002    0.000 {cutils_ext.cutils_ext.run_cthunk}
       56    0.000    0.000    0.000    0.000 {getattr}
       24    0.000    0.000    0.000    0.000 {hasattr}
       38    0.000    0.000    0.000    0.000 {isinstance}
    49258    0.002    0.000    0.002    0.000 {len}
        8    0.000    0.000    0.000    0.000 {math.ceil}
      116    0.000    0.000    0.000    0.000 {math.floor}
    13864    0.002    0.000    0.002    0.000 {max}
        4    0.000    0.000    0.000    0.000 {method 'any' of 'numpy.ndarray' objects}
      215    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        8    0.000    0.000    0.000    0.000 {method 'argsort' of 'numpy.ndarray' objects}
        2    0.000    0.000    0.000    0.000 {method 'astype' of 'numpy.ndarray' objects}
        2    0.000    0.000    0.000    0.000 {method 'copy' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        6    0.000    0.000    0.000    0.000 {method 'flatten' of 'numpy.ndarray' objects}
      320    0.000    0.000    0.000    0.000 {method 'random' of '_random.Random' objects}
        2    0.000    0.000    0.000    0.000 {method 'ravel' of 'numpy.ndarray' objects}
        4    0.000    0.000    0.000    0.000 {method 'reduce' of 'numpy.ufunc' objects}
        9    0.000    0.000    0.000    0.000 {method 'reshape' of 'numpy.ndarray' objects}
        2    0.000    0.000    0.000    0.000 {method 'round' of 'numpy.ndarray' objects}
        2    0.000    0.000    0.000    0.000 {method 'setdefault' of 'dict' objects}
       10    0.000    0.000    0.000    0.000 {method 'tolist' of 'numpy.ndarray' objects}
      546    0.000    0.000    0.000    0.000 {min}
       39    0.001    0.000    0.001    0.000 {numpy.core.multiarray.array}
       10    0.000    0.000    0.000    0.000 {numpy.core.multiarray.concatenate}
       11    0.000    0.000    0.000    0.000 {numpy.core.multiarray.copyto}
        9    0.000    0.000    0.000    0.000 {numpy.core.multiarray.empty_like}
        2    0.000    0.000    0.000    0.000 {numpy.core.multiarray.empty}
        2    0.000    0.000    0.000    0.000 {numpy.core.multiarray.unravel_index}
       52    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
    54022    0.007    0.000    0.007    0.000 {range}
       96    0.000    0.000    0.000    0.000 {time.time}
       29    0.000    0.000    0.000    0.000 {zip}
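
A listing in this format can be collected with Python's standard cProfile and pstats modules. The sketch below is only an illustration of that approach: `profile_call` and `busy_work` are hypothetical names that do not come from the HTM code base, and `busy_work` merely stands in for one HTM time step.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # "stdname" gives the "Ordered by: standard name" listing.
    stats.sort_stats("stdname").print_stats()
    return result, stream.getvalue()


def busy_work(n):
    # Hypothetical stand-in for one HTM time step.
    return sum(i * i for i in range(n))


result, report = profile_call(busy_work, 1000)
print(report)
```

Sorting by "cumulative" instead of "stdname" would put the most expensive call chains at the top, which can make hot spots such as updatePredictiveState easier to find.
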
calumroy commented 7 years ago

Profiling just the updatePredictiveState function of the np_predictCells calculator. Note that this is from a different test than the profile above, hence the 50 ms here versus the 122 ms above.

         38140 function calls in 0.050 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        4    0.000    0.000    0.000    0.000 np_predictCells.py:127(getSegmentActiveSynapses)
        4    0.000    0.000    0.000    0.000 np_predictCells.py:150(checkCellPredicting)
        4    0.000    0.000    0.000    0.000 np_predictCells.py:160(setPredictCell)
        4    0.000    0.000    0.000    0.000 np_predictCells.py:170(setActiveSeg)
     6051    0.007    0.000    0.007    0.000 np_predictCells.py:177(checkCellActive)
    10230    0.032    0.000    0.040    0.000 np_predictCells.py:190(segmentNumSynapsesActive)
        1    0.010    0.010    0.050    0.050 np_predictCells.py:209(updatePredictiveState)
    10238    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        4    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
    11599    0.002    0.000    0.002    0.000 {range}

The above profile was taken from an HTM layer with the following parameters:

'desiredLocalActivity': 1,
'minOverlap': 2,
'wrapInput': 1,
'inhibitionWidth': 30,
'inhibitionHeight': 2,
'centerPotSynapses': 1,
'connectPermanence': 0.3,
'potentialWidth': 34,
'potentialHeight': 31,
'spatialPermanenceInc': 0.1,
'spatialPermanenceDec': 0.01,
'activeColPermanenceDec': 0.0,
'tempDelayLength': 10,
'permanenceInc': 0.15,
'permanenceDec': 0.05,
'tempSpatialPermanenceInc': 0.04,
'tempSeqPermanenceInc': 0.1,
'minThreshold': 5,
'minScoreThreshold': 3,
'newSynapseCount': 10,
'maxNumSegments': 10,
'activationThreshold': 6,
'colSynPermanence': 0.1,
'cellSynPermanence': 0.4
calumroy commented 7 years ago

Most of the time is spent iterating through the distal synapses and counting how many are connected and linked to an active cell. The run time may improve if this class is reimplemented as a Theano-based calculator class so that it can run on a GPU.
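
To illustrate why moving this count out of Python loops helps, here is a minimal sketch that uses vectorized NumPy rather than Theano. Several things in it are assumptions made for the example and are not taken from np_predictCells: the array layout (one permanence and one target-cell index per synapse, shaped cells x segments x synapses), the function names, the strict `>` comparison against the connection threshold, and the `>=` comparison against the activation threshold. The threshold values 0.3 and 6 are borrowed from 'connectPermanence' and 'activationThreshold' in the parameter list above.

```python
import numpy as np


def count_active_synapses_loop(syn_perm, syn_target, cell_active, connect_perm):
    """Reference version: visit every synapse in Python, one at a time."""
    counts = np.zeros(syn_perm.shape[:-1], dtype=int)
    for idx in np.ndindex(*syn_perm.shape[:-1]):
        for s in range(syn_perm.shape[-1]):
            connected = syn_perm[idx][s] > connect_perm
            if connected and cell_active[syn_target[idx][s]]:
                counts[idx] += 1
    return counts


def count_active_synapses(syn_perm, syn_target, cell_active, connect_perm):
    """Vectorized version: same count per segment, no Python-level loop."""
    connected = syn_perm > connect_perm      # synapse permanence is high enough
    target_active = cell_active[syn_target]  # synapse ends on an active cell
    return (connected & target_active).sum(axis=-1)


# Hypothetical layer: 12 cells, 10 segments per cell, 20 synapses per segment.
rng = np.random.default_rng(0)
n_cells, n_segs, n_syn = 12, 10, 20
syn_perm = rng.random((n_cells, n_segs, n_syn))
syn_target = rng.integers(0, n_cells, size=(n_cells, n_segs, n_syn))
cell_active = rng.random(n_cells) < 0.5

counts = count_active_synapses(syn_perm, syn_target, cell_active, 0.3)
# Assumed rule: a cell is predictive if any of its segments reaches the threshold.
predictive = (counts >= 6).any(axis=-1)
```

Both functions return the same per-segment counts, but the vectorized one expresses the work as whole-array operations. That is the same kind of formulation a Theano graph needs before it can be compiled for a GPU.
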

calumroy commented 7 years ago

The theano_predictCells calculator was implemented and run with the same config for a particular layer. The same function, updatePredictiveState, was then profiled in the new calculator class.

353 function calls in 0.004 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 basic.py:4352(perform)
        3    0.001    0.000    0.004    0.001 function_module.py:482(__call__)
        1    0.000    0.000    0.000    0.000 function_module.py:691(free)
       13    0.000    0.000    0.000    0.000 numeric.py:406(asarray)
        2    0.000    0.000    0.002    0.001 op.py:767(rval)
       13    0.000    0.000    0.000    0.000 safe_asarray.py:12(_asarray)
        1    0.000    0.000    0.001    0.001 scan_op.py:638(<lambda>)
        1    0.000    0.000    0.001    0.001 scan_op.py:670(rval)
        1    0.002    0.002    0.002    0.002 subtensor.py:2084(perform)
       10    0.000    0.000    0.000    0.000 theano_predictCells.py:265(getSegmentActiveSynapses)
       17    0.000    0.000    0.000    0.000 theano_predictCells.py:288(checkCellPredicting)
       17    0.000    0.000    0.000    0.000 theano_predictCells.py:298(setPredictCell)
      100    0.000    0.000    0.000    0.000 theano_predictCells.py:315(checkCellActive)
        1    0.000    0.000    0.004    0.004 theano_predictCells.py:347(updatePredictiveState)
       13    0.000    0.000    0.000    0.000 type.py:385(<lambda>)
        1    0.000    0.000    0.000    0.000 type.py:579(value_zeros)
       13    0.000    0.000    0.000    0.000 type.py:67(filter)
        7    0.000    0.000    0.000    0.000 {getattr}
        3    0.000    0.000    0.000    0.000 {hasattr}
       21    0.000    0.000    0.000    0.000 {isinstance}
       57    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        3    0.000    0.000    0.000    0.000 {method 'item' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {method 'keys' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.arange}
       13    0.000    0.000    0.000    0.000 {numpy.core.multiarray.array}
       11    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
       11    0.000    0.000    0.000    0.000 {range}
        1    0.001    0.001    0.001    0.001 {theano.scan_module.scan_perform.perform}
       12    0.000    0.000    0.000    0.000 {time.time}
        3    0.000    0.000    0.000    0.000 {zip}

This shows a huge improvement with the new theano_predictCells calculator class: 0.004 s versus 0.050 s for the NumPy version, a 12.5 times speedup!