calumroy / HTM

np_learning Calculator profiling speed #26

Open calumroy opened 7 years ago

calumroy commented 7 years ago

The np_learning calculator is slow.

Update the theano version of this calculator to implement all the same features as the np_learning calculator.

calumroy commented 7 years ago

This is the profile of one spatialTemporal step using the np_learning calculator, for the following HTM layer.

testParameters = {
                    'HTM':
                        {
                        'numLevels': 1,
                        'columnArrayWidth': 80,
                        'columnArrayHeight': 40,
                        'cellsPerColumn': 2,

                        'HTMRegions': [{
                            'numLayers': 1,
                            'enableHigherLevFb': 0,
                            'enableCommandFeedback': 0,

                            'HTMLayers': [{
                                'desiredLocalActivity': 2,
                                'minOverlap': 2,
                                'wrapInput':1,
                                'inhibitionWidth': 2,
                                'inhibitionHeight': 3,
                                'centerPotSynapses': 1,
                                'potentialWidth': 34,
                                'potentialHeight': 31,
                                'spatialPermanenceInc': 0.1,
                                'spatialPermanenceDec': 0.02,
                                'activeColPermanenceDec': 0.02,
                                'tempDelayLength': 3,
                                'permanenceInc': 0.1,
                                'permanenceDec': 0.02,
                                'tempSpatialPermanenceInc': 0,
                                'tempSeqPermanenceInc': 0,
                                'connectPermanence': 0.3,
                                'minThreshold': 5,
                                'minScoreThreshold': 5,
                                'newSynapseCount': 10,
                                'maxNumSegments': 10,
                                'activationThreshold': 6,
                                'colSynPermanence': 0.1,
                                'cellSynPermanence': 0.4
                                }]
                            }]
                        }
                    }

        540615 function calls in 0.746 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.746    0.746 HTM_network.py:1002(spatialTemporal)
        1    0.000    0.000    0.000    0.000 HTM_network.py:1105(updateHTMInput)
        1    0.000    0.000    0.746    0.746 HTM_network.py:1173(spatialTemporal)
        1    0.000    0.000    0.000    0.000 HTM_network.py:599(getPotentialOverlaps)
        1    0.000    0.000    0.000    0.000 HTM_network.py:662(updateInput)
        1    0.000    0.000    0.000    0.000 HTM_network.py:673(updateOutput)
        1    0.000    0.000    0.043    0.043 HTM_network.py:716(Overlap)
        1    0.000    0.000    0.008    0.008 HTM_network.py:738(inhibition)
        1    0.000    0.000    0.598    0.598 HTM_network.py:754(spatialLearning)
        1    0.000    0.000    0.097    0.097 HTM_network.py:765(sequencePooler)
        1    0.000    0.000    0.048    0.048 HTM_network.py:777(calcActiveCells)
        1    0.000    0.000    0.025    0.025 HTM_network.py:797(calcPredictCells)
        1    0.000    0.000    0.023    0.023 HTM_network.py:809(sequenceLearning)
        1    0.000    0.000    0.000    0.000 HTM_network.py:823(temporalPooler)
        1    0.000    0.000    0.000    0.000 HTM_network.py:951(updateRegionInput)
       80    0.000    0.000    0.000    0.000 _methods.py:37(_any)
        1    0.000    0.000    0.000    0.000 basic.py:4352(perform)
        5    0.000    0.000    0.007    0.001 cc.py:1525(__call__)
        2    0.000    0.000    0.000    0.000 fromnumeric.py:43(_wrapit)
        2    0.000    0.000    0.000    0.000 fromnumeric.py:823(argsort)
       11    0.033    0.003    0.067    0.006 function_module.py:482(__call__)
        1    0.000    0.000    0.000    0.000 function_module.py:691(free)
        1    0.000    0.000    0.000    0.000 link.py:324(__get__)
        1    0.000    0.000    0.000    0.000 np_activeCells.py:213(getCurrentLearnCellsList)
        1    0.000    0.000    0.000    0.000 np_activeCells.py:221(getActiveCellsList)
        1    0.000    0.000    0.000    0.000 np_activeCells.py:225(getSegUpdates)
      272    0.012    0.000    0.013    0.000 np_activeCells.py:230(findNumSegs)
      136    0.001    0.000    0.003    0.000 np_activeCells.py:245(getSegmentActiveSynapses)
      136    0.001    0.000    0.027    0.000 np_activeCells.py:266(getBestMatchingCell)
      136    0.002    0.000    0.004    0.000 np_activeCells.py:334(newRandomPrevActiveSynapses)
      483    0.002    0.000    0.003    0.000 np_activeCells.py:359(findLeastUsedSeg)
       90    0.000    0.000    0.000    0.000 np_activeCells.py:377(checkColPrevActive)
       80    0.000    0.000    0.000    0.000 np_activeCells.py:385(checkColBursting)
        1    0.000    0.000    0.000    0.000 np_activeCells.py:401(findActiveCell)
       79    0.000    0.000    0.000    0.000 np_activeCells.py:412(findLearnCell)
      435    0.001    0.000    0.001    0.000 np_activeCells.py:421(setActiveCell)
      220    0.000    0.000    0.000    0.000 np_activeCells.py:433(setLearnCell)
     2401    0.003    0.000    0.003    0.000 np_activeCells.py:445(checkCellActive)
      107    0.000    0.000    0.000    0.000 np_activeCells.py:458(checkCellLearn)
      280    0.000    0.000    0.000    0.000 np_activeCells.py:468(checkCellPredicting)
        9    0.000    0.000    0.000    0.000 np_activeCells.py:478(segmentHighestScore)
     5525    0.016    0.000    0.018    0.000 np_activeCells.py:495(segmentNumSynapsesActive)
      552    0.002    0.000    0.020    0.000 np_activeCells.py:521(getBestMatchingSegment)
        1    0.001    0.001    0.011    0.011 np_activeCells.py:552(updateActiveCellScores)
        1    0.002    0.002    0.048    0.048 np_activeCells.py:582(updateActiveCells)
      680    0.006    0.000    0.006    0.000 np_inhibition.py:270(calcualteInhibition)
        1    0.002    0.002    0.008    0.008 np_inhibition.py:333(calculateWinningCols)
   238700    0.514    0.000    0.542    0.000 np_learning.py:67(updatePermanence)
        1    0.054    0.054    0.598    0.598 np_learning.py:78(updatePermanenceValues)
      284    0.005    0.000    0.006    0.000 np_sequenceLearning.py:101(updateCurrentSegSyn)
      142    0.000    0.000    0.009    0.000 np_sequenceLearning.py:137(adaptSegments)
    13032    0.010    0.000    0.010    0.000 np_sequenceLearning.py:168(checkCellTime)
        1    0.003    0.003    0.023    0.023 np_sequenceLearning.py:182(sequenceLearning)
      142    0.003    0.000    0.003    0.000 np_sequenceLearning.py:78(addNewSegSyn)
        1    0.000    0.000    0.000    0.000 np_temporal.py:365(updateProximalTempPool)
        1    0.000    0.000    0.000    0.000 np_temporal.py:428(updateDistalTempPool)
       19    0.000    0.000    0.011    0.001 numeric.py:406(asarray)
        3    0.000    0.000    0.000    0.000 numeric.py:79(zeros_like)
        5    0.000    0.000    0.007    0.001 op.py:742(rval)
        2    0.000    0.000    0.010    0.005 op.py:767(rval)
     1310    0.002    0.000    0.003    0.000 random.py:293(sample)
       17    0.000    0.000    0.011    0.001 safe_asarray.py:12(_asarray)
        1    0.000    0.000    0.006    0.006 scan_op.py:638(<lambda>)
        1    0.000    0.000    0.006    0.006 scan_op.py:670(rval)
        1    0.000    0.000    0.000    0.000 sdrFunctions.py:29(joinInputArrays)
        1    0.010    0.010    0.010    0.010 subtensor.py:2084(perform)
        1    0.000    0.000    0.000    0.000 theano_overlap.py:304(checkNewInputParams)
        1    0.000    0.000    0.000    0.000 theano_overlap.py:458(addVectTieBreaker)
        1    0.000    0.000    0.019    0.019 theano_overlap.py:463(maskTieBreaker)
        1    0.000    0.000    0.005    0.005 theano_overlap.py:476(getColInputs)
        1    0.000    0.000    0.000    0.000 theano_overlap.py:522(getPotentialOverlaps)
        1    0.000    0.000    0.042    0.042 theano_overlap.py:528(calculateOverlap)
        1    0.000    0.000    0.000    0.000 theano_overlap.py:564(removeSmallOverlaps)
        1    0.000    0.000    0.000    0.000 theano_predictCells.py:250(getActiveSegTimes)
        1    0.000    0.000    0.000    0.000 theano_predictCells.py:259(getSegUpdates)
        7    0.000    0.000    0.000    0.000 theano_predictCells.py:264(getSegmentActiveSynapses)
        7    0.000    0.000    0.000    0.000 theano_predictCells.py:287(checkCellPredicting)
        7    0.000    0.000    0.000    0.000 theano_predictCells.py:297(setPredictCell)
       70    0.000    0.000    0.000    0.000 theano_predictCells.py:314(checkCellActive)
        1    0.001    0.001    0.025    0.025 theano_predictCells.py:347(updatePredictiveState)
       25    0.000    0.000    0.000    0.000 type.py:385(<lambda>)
        1    0.000    0.000    0.000    0.000 type.py:579(value_zeros)
       25    0.000    0.000    0.011    0.000 type.py:67(filter)
        1    0.001    0.001    0.008    0.008 vm.py:204(__call__)
        5    0.007    0.001    0.007    0.001 {cutils_ext.cutils_ext.run_cthunk}
       25    0.000    0.000    0.000    0.000 {getattr}
     1321    0.000    0.000    0.000    0.000 {hasattr}
       33    0.000    0.000    0.000    0.000 {isinstance}
    13778    0.001    0.000    0.001    0.000 {len}
      435    0.000    0.000    0.000    0.000 {math.floor}
   233315    0.027    0.000    0.027    0.000 {max}
     1310    0.000    0.000    0.000    0.000 {method 'add' of 'set' objects}
       80    0.000    0.000    0.000    0.000 {method 'any' of 'numpy.ndarray' objects}
      875    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        2    0.000    0.000    0.000    0.000 {method 'argsort' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {method 'flatten' of 'numpy.ndarray' objects}
        3    0.000    0.000    0.000    0.000 {method 'item' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {method 'keys' of 'dict' objects}
     1310    0.000    0.000    0.000    0.000 {method 'random' of '_random.Random' objects}
       80    0.000    0.000    0.000    0.000 {method 'reduce' of 'numpy.ufunc' objects}
        3    0.000    0.000    0.000    0.000 {method 'reshape' of 'numpy.ndarray' objects}
        2    0.000    0.000    0.000    0.000 {method 'tolist' of 'numpy.ndarray' objects}
     7901    0.001    0.000    0.001    0.000 {min}
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.arange}
       22    0.011    0.000    0.011    0.000 {numpy.core.multiarray.array}
        3    0.000    0.000    0.000    0.000 {numpy.core.multiarray.copyto}
        3    0.000    0.000    0.000    0.000 {numpy.core.multiarray.empty_like}
      148    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
    14336    0.003    0.000    0.003    0.000 {range}
        1    0.006    0.006    0.006    0.006 {theano.scan_module.scan_perform.perform}
       44    0.000    0.000    0.000    0.000 {time.time}
       12    0.000    0.000    0.000    0.000 {zip}

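Notice the large share of time spent in the np_learning calculator: ~0.598 of the 0.746 seconds is in HTM_network.py:754(spatialLearning), and almost all of that comes from the 238700 calls to np_learning.py:67(updatePermanence).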
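The cost pattern above (one Python-level call per synapse) can be illustrated with a small NumPy sketch. This is not the code in np_learning.py. The update rule is an assumption based on the usual spatial pooler rule (raise permanence where the input bit is on, lower it otherwise, clamp to [0, 1]), and the function names, array shapes and arguments are hypothetical.

```python
import numpy as np


def update_permanence_loop(perm, col_inputs, active_cols, inc, dec):
    """Assumed per-synapse rule, applied one synapse at a time in Python."""
    out = perm.copy()
    for c in active_cols:
        for s in range(perm.shape[1]):
            if col_inputs[c, s] == 1:
                out[c, s] = min(1.0, out[c, s] + inc)
            else:
                out[c, s] = max(0.0, out[c, s] - dec)
    return out


def update_permanence_vectorized(perm, col_inputs, active_cols, inc, dec):
    """Same assumed rule, applied to all synapses of the active columns at once."""
    out = perm.copy()
    # +inc where the input bit is on, -dec where it is off.
    delta = np.where(col_inputs[active_cols] == 1, inc, -dec)
    # Clamp to [0, 1] in one array operation instead of per-element min/max.
    out[active_cols] = np.clip(out[active_cols] + delta, 0.0, 1.0)
    return out


# Hypothetical sizes, much smaller than the 80x40 layer profiled above.
rng = np.random.default_rng(0)
perm = rng.random((20, 50))                     # columns x potential synapses
col_inputs = rng.integers(0, 2, size=(20, 50))  # input bit seen by each synapse
active_cols = np.array([1, 4, 7])               # indices of the winning columns

slow = update_permanence_loop(perm, col_inputs, active_cols, 0.1, 0.02)
fast = update_permanence_vectorized(perm, col_inputs, active_cols, 0.1, 0.02)
print(np.allclose(slow, fast))
```

Both functions return the same array. The vectorized one replaces the per-synapse calls to min and max with a single np.clip, which is the kind of batching the theano calculator also relies on.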

calumroy commented 7 years ago

This is the profile of the same step using the theano_learning calculator, for the same HTM layer:

     62586 function calls in 0.210 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.210    0.210 HTM_network.py:1002(spatialTemporal)
        1    0.000    0.000    0.000    0.000 HTM_network.py:1105(updateHTMInput)
        1    0.000    0.000    0.210    0.210 HTM_network.py:1173(spatialTemporal)
        1    0.000    0.000    0.000    0.000 HTM_network.py:599(getPotentialOverlaps)
        1    0.000    0.000    0.000    0.000 HTM_network.py:662(updateInput)
        1    0.000    0.000    0.000    0.000 HTM_network.py:673(updateOutput)
        1    0.000    0.000    0.039    0.039 HTM_network.py:716(Overlap)
        1    0.000    0.000    0.008    0.008 HTM_network.py:738(inhibition)
        1    0.000    0.000    0.065    0.065 HTM_network.py:754(spatialLearning)
        1    0.000    0.000    0.097    0.097 HTM_network.py:765(sequencePooler)
        1    0.000    0.000    0.049    0.049 HTM_network.py:777(calcActiveCells)
        1    0.000    0.000    0.024    0.024 HTM_network.py:797(calcPredictCells)
        1    0.000    0.000    0.024    0.024 HTM_network.py:809(sequenceLearning)
        1    0.000    0.000    0.000    0.000 HTM_network.py:823(temporalPooler)
        1    0.000    0.000    0.000    0.000 HTM_network.py:951(updateRegionInput)
        1    0.000    0.000    0.000    0.000 basic.py:4352(perform)
        5    0.000    0.000    0.007    0.001 cc.py:1525(__call__)
        2    0.000    0.000    0.000    0.000 fromnumeric.py:43(_wrapit)
        2    0.000    0.000    0.000    0.000 fromnumeric.py:823(argsort)
       12    0.090    0.008    0.128    0.011 function_module.py:482(__call__)
        1    0.000    0.000    0.000    0.000 function_module.py:691(free)
        1    0.000    0.000    0.000    0.000 link.py:324(__get__)
        1    0.000    0.000    0.000    0.000 np_activeCells.py:213(getCurrentLearnCellsList)
        1    0.000    0.000    0.000    0.000 np_activeCells.py:221(getActiveCellsList)
        1    0.000    0.000    0.000    0.000 np_activeCells.py:225(getSegUpdates)
      272    0.013    0.000    0.013    0.000 np_activeCells.py:230(findNumSegs)
      136    0.001    0.000    0.003    0.000 np_activeCells.py:245(getSegmentActiveSynapses)
      136    0.001    0.000    0.027    0.000 np_activeCells.py:266(getBestMatchingCell)
      136    0.002    0.000    0.005    0.000 np_activeCells.py:334(newRandomPrevActiveSynapses)
      483    0.002    0.000    0.003    0.000 np_activeCells.py:359(findLeastUsedSeg)
       90    0.000    0.000    0.000    0.000 np_activeCells.py:377(checkColPrevActive)
       80    0.000    0.000    0.000    0.000 np_activeCells.py:385(checkColBursting)
        1    0.000    0.000    0.000    0.000 np_activeCells.py:401(findActiveCell)
       79    0.000    0.000    0.000    0.000 np_activeCells.py:412(findLearnCell)
      435    0.001    0.000    0.001    0.000 np_activeCells.py:421(setActiveCell)
      220    0.000    0.000    0.000    0.000 np_activeCells.py:433(setLearnCell)
     2401    0.003    0.000    0.003    0.000 np_activeCells.py:445(checkCellActive)
      107    0.000    0.000    0.000    0.000 np_activeCells.py:458(checkCellLearn)
      280    0.000    0.000    0.000    0.000 np_activeCells.py:468(checkCellPredicting)
        9    0.000    0.000    0.000    0.000 np_activeCells.py:478(segmentHighestScore)
     5525    0.016    0.000    0.018    0.000 np_activeCells.py:495(segmentNumSynapsesActive)
      552    0.002    0.000    0.020    0.000 np_activeCells.py:521(getBestMatchingSegment)
        1    0.001    0.001    0.011    0.011 np_activeCells.py:552(updateActiveCellScores)
        1    0.002    0.002    0.049    0.049 np_activeCells.py:582(updateActiveCells)
      680    0.006    0.000    0.006    0.000 np_inhibition.py:270(calcualteInhibition)
        1    0.002    0.002    0.008    0.008 np_inhibition.py:333(calculateWinningCols)
      284    0.006    0.000    0.006    0.000 np_sequenceLearning.py:101(updateCurrentSegSyn)
      142    0.000    0.000    0.009    0.000 np_sequenceLearning.py:137(adaptSegments)
    13032    0.010    0.000    0.010    0.000 np_sequenceLearning.py:168(checkCellTime)
        1    0.003    0.003    0.024    0.024 np_sequenceLearning.py:182(sequenceLearning)
      142    0.003    0.000    0.003    0.000 np_sequenceLearning.py:78(addNewSegSyn)
        1    0.000    0.000    0.000    0.000 np_temporal.py:365(updateProximalTempPool)
        1    0.000    0.000    0.000    0.000 np_temporal.py:428(updateDistalTempPool)
       23    0.000    0.000    0.015    0.001 numeric.py:406(asarray)
        3    0.000    0.000    0.000    0.000 numeric.py:79(zeros_like)
        5    0.000    0.000    0.007    0.001 op.py:742(rval)
        2    0.000    0.000    0.010    0.005 op.py:767(rval)
     1310    0.002    0.000    0.003    0.000 random.py:293(sample)
       21    0.000    0.000    0.015    0.001 safe_asarray.py:12(_asarray)
        1    0.000    0.000    0.005    0.005 scan_op.py:638(<lambda>)
        1    0.000    0.000    0.005    0.005 scan_op.py:670(rval)
        1    0.000    0.000    0.000    0.000 sdrFunctions.py:29(joinInputArrays)
        1    0.010    0.010    0.010    0.010 subtensor.py:2084(perform)
        1    0.000    0.000    0.065    0.065 theano_learning.py:133(updatePermanenceValues)
        1    0.000    0.000    0.000    0.000 theano_overlap.py:304(checkNewInputParams)
        1    0.000    0.000    0.000    0.000 theano_overlap.py:458(addVectTieBreaker)
        1    0.000    0.000    0.019    0.019 theano_overlap.py:463(maskTieBreaker)
        1    0.000    0.000    0.005    0.005 theano_overlap.py:476(getColInputs)
        1    0.000    0.000    0.000    0.000 theano_overlap.py:522(getPotentialOverlaps)
        1    0.000    0.000    0.039    0.039 theano_overlap.py:528(calculateOverlap)
        1    0.000    0.000    0.000    0.000 theano_overlap.py:564(removeSmallOverlaps)
        1    0.000    0.000    0.000    0.000 theano_predictCells.py:250(getActiveSegTimes)
        1    0.000    0.000    0.000    0.000 theano_predictCells.py:259(getSegUpdates)
        7    0.000    0.000    0.000    0.000 theano_predictCells.py:264(getSegmentActiveSynapses)
        7    0.000    0.000    0.000    0.000 theano_predictCells.py:287(checkCellPredicting)
        7    0.000    0.000    0.000    0.000 theano_predictCells.py:297(setPredictCell)
       70    0.000    0.000    0.000    0.000 theano_predictCells.py:314(checkCellActive)
        1    0.001    0.001    0.024    0.024 theano_predictCells.py:347(updatePredictiveState)
       31    0.000    0.000    0.000    0.000 type.py:385(<lambda>)
        1    0.000    0.000    0.000    0.000 type.py:579(value_zeros)
       31    0.000    0.000    0.015    0.000 type.py:67(filter)
        1    0.001    0.001    0.008    0.008 vm.py:204(__call__)
        5    0.007    0.001    0.007    0.001 {cutils_ext.cutils_ext.run_cthunk}
       27    0.000    0.000    0.000    0.000 {getattr}
     1322    0.000    0.000    0.000    0.000 {hasattr}
       39    0.000    0.000    0.000    0.000 {isinstance}
    13571    0.001    0.000    0.001    0.000 {len}
      435    0.000    0.000    0.000    0.000 {math.floor}
     1435    0.000    0.000    0.000    0.000 {max}
     1310    0.000    0.000    0.000    0.000 {method 'add' of 'set' objects}
      875    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        2    0.000    0.000    0.000    0.000 {method 'argsort' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {method 'flatten' of 'numpy.ndarray' objects}
        3    0.000    0.000    0.000    0.000 {method 'item' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {method 'keys' of 'dict' objects}
     1310    0.000    0.000    0.000    0.000 {method 'random' of '_random.Random' objects}
        3    0.000    0.000    0.000    0.000 {method 'reshape' of 'numpy.ndarray' objects}
        2    0.000    0.000    0.000    0.000 {method 'tolist' of 'numpy.ndarray' objects}
     1081    0.000    0.000    0.000    0.000 {min}
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.arange}
       26    0.015    0.001    0.015    0.001 {numpy.core.multiarray.array}
        3    0.000    0.000    0.000    0.000 {numpy.core.multiarray.copyto}
        3    0.000    0.000    0.000    0.000 {numpy.core.multiarray.empty_like}
      148    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
    14115    0.002    0.000    0.002    0.000 {range}
        1    0.005    0.005    0.005    0.005 {theano.scan_module.scan_perform.perform}
       48    0.000    0.000    0.000    0.000 {time.time}
       13    0.000    0.000    0.000    0.000 {zip}

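The theano learning calculator takes ~0.065 seconds in HTM_network.py:754(spatialLearning) for the same job, down from ~0.598 seconds. This is roughly a 9 times speed improvement for spatial learning, and the whole spatialTemporal step drops from 0.746 to 0.210 seconds.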
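Both dumps above are ordered by "standard name", which makes the hot spots hard to find. A minimal sketch of collecting the same kind of profile sorted by cumulative time, using only the standard library cProfile and pstats modules, is shown here. run_step is a hypothetical stand-in for one spatialTemporal call, not a function from this repository.

```python
import cProfile
import io
import pstats


def run_step():
    """Hypothetical stand-in for one spatialTemporal() step of the network."""
    return sum(i * i for i in range(10000))


profiler = cProfile.Profile()
profiler.runcall(run_step)  # profile a single call

# Write the report to a string, sorted by cumulative time, top 10 entries only.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

With this ordering the calculators that dominate a step appear in the first few rows, so the np_learning and theano_learning runs can be compared without scanning the full listing.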