Open calumroy opened 7 years ago
Profiling just the updatePredictiveState function of the np_predictCells calculator (note: this is from a different test than the profile above, hence the 50 ms time versus the 122 ms above).
38140 function calls in 0.050 seconds
Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        4    0.000    0.000    0.000    0.000 np_predictCells.py:127(getSegmentActiveSynapses)
        4    0.000    0.000    0.000    0.000 np_predictCells.py:150(checkCellPredicting)
        4    0.000    0.000    0.000    0.000 np_predictCells.py:160(setPredictCell)
        4    0.000    0.000    0.000    0.000 np_predictCells.py:170(setActiveSeg)
     6051    0.007    0.000    0.007    0.000 np_predictCells.py:177(checkCellActive)
    10230    0.032    0.000    0.040    0.000 np_predictCells.py:190(segmentNumSynapsesActive)
        1    0.010    0.010    0.050    0.050 np_predictCells.py:209(updatePredictiveState)
    10238    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        4    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
    11599    0.002    0.000    0.002    0.000 {range}
The above profile was taken from an HTM layer with the following parameters:
'desiredLocalActivity': 1,
'minOverlap': 2,
'wrapInput': 1,
'inhibitionWidth': 30,
'inhibitionHeight': 2,
'centerPotSynapses': 1,
'connectPermanence': 0.3,
'potentialWidth': 34,
'potentialHeight': 31,
'spatialPermanenceInc': 0.1,
'spatialPermanenceDec': 0.01,
'activeColPermanenceDec': 0.0,
'tempDelayLength': 10,
'permanenceInc': 0.15,
'permanenceDec': 0.05,
'tempSpatialPermanenceInc': 0.04,
'tempSeqPermanenceInc': 0.1,
'minThreshold': 5,
'minScoreThreshold': 3,
'newSynapseCount': 10,
'maxNumSegments': 10,
'activationThreshold': 6,
'colSynPermanence': 0.1,
'cellSynPermanence': 0.4
Most of the time is spent iterating through the distal synapses and counting how many are connected and linked to an active cell. The run time may be improved by implementing this class as a Theano-based calculator class so that it can be run on a GPU.
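The counting step described above can be sketched in vectorized form. This is only a rough illustration in plain numpy, not the code from np_predictCells or theano_predictCells: the array layout (one row per segment, one column per synapse), the function name and the connect_perm argument are all assumptions made for the example.

```python
import numpy as np


def count_active_connected_synapses(syn_end_col, syn_end_cell, syn_perm,
                                    active_cells, connect_perm):
    """Count, per segment, the connected synapses that end on an active cell.

    Hypothetical layout (not taken from the real calculators):
      syn_end_col, syn_end_cell: int arrays, shape (num_segments, num_synapses)
      syn_perm: float array, same shape, holding synapse permanence values
      active_cells: bool array, shape (num_columns, cells_per_column)
    """
    connected = syn_perm > connect_perm
    # Fancy indexing looks up the active state of every synapse's end cell at once.
    ends_on_active = active_cells[syn_end_col, syn_end_cell]
    return (connected & ends_on_active).sum(axis=1)


# Tiny made-up example: 4 columns with 3 cells each, 2 segments of 3 synapses.
active_cells = np.zeros((4, 3), dtype=bool)
active_cells[0, 1] = True
active_cells[2, 0] = True

syn_end_col = np.array([[0, 2, 3], [0, 1, 2]])
syn_end_cell = np.array([[1, 0, 2], [1, 1, 0]])
syn_perm = np.array([[0.5, 0.2, 0.9], [0.4, 0.6, 0.35]])

counts = count_active_connected_synapses(
    syn_end_col, syn_end_cell, syn_perm, active_cells, connect_perm=0.3)
```

Replacing the per-synapse Python loop with whole-array operations like this is the same idea a Theano graph would express, and it is what makes the work suitable for a GPU.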
The theano_predictCells calculator was implemented and run with the same config for a particular layer. The same function in the new calculator class was profiled.
353 function calls in 0.004 seconds
Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 basic.py:4352(perform)
        3    0.001    0.000    0.004    0.001 function_module.py:482(__call__)
        1    0.000    0.000    0.000    0.000 function_module.py:691(free)
       13    0.000    0.000    0.000    0.000 numeric.py:406(asarray)
        2    0.000    0.000    0.002    0.001 op.py:767(rval)
       13    0.000    0.000    0.000    0.000 safe_asarray.py:12(_asarray)
        1    0.000    0.000    0.001    0.001 scan_op.py:638(<lambda>)
        1    0.000    0.000    0.001    0.001 scan_op.py:670(rval)
        1    0.002    0.002    0.002    0.002 subtensor.py:2084(perform)
       10    0.000    0.000    0.000    0.000 theano_predictCells.py:265(getSegmentActiveSynapses)
       17    0.000    0.000    0.000    0.000 theano_predictCells.py:288(checkCellPredicting)
       17    0.000    0.000    0.000    0.000 theano_predictCells.py:298(setPredictCell)
      100    0.000    0.000    0.000    0.000 theano_predictCells.py:315(checkCellActive)
        1    0.000    0.000    0.004    0.004 theano_predictCells.py:347(updatePredictiveState)
       13    0.000    0.000    0.000    0.000 type.py:385(<lambda>)
        1    0.000    0.000    0.000    0.000 type.py:579(value_zeros)
       13    0.000    0.000    0.000    0.000 type.py:67(filter)
        7    0.000    0.000    0.000    0.000 {getattr}
        3    0.000    0.000    0.000    0.000 {hasattr}
       21    0.000    0.000    0.000    0.000 {isinstance}
       57    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        3    0.000    0.000    0.000    0.000 {method 'item' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {method 'keys' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.arange}
       13    0.000    0.000    0.000    0.000 {numpy.core.multiarray.array}
       11    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
       11    0.000    0.000    0.000    0.000 {range}
        1    0.001    0.001    0.001    0.001 {theano.scan_module.scan_perform.perform}
       12    0.000    0.000    0.000    0.000 {time.time}
        3    0.000    0.000    0.000    0.000 {zip}
This shows a huge improvement with the new theano_predictCells calculator class: a 12.5 times speedup (0.050 s down to 0.004 s)!
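For anyone wanting to reproduce this kind of measurement, a single call can be profiled with the standard library's cProfile roughly as follows. This is a minimal sketch: profile_call and slow_count are hypothetical names made up for the example, and slow_count merely stands in for a real call such as updatePredictiveState.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func once under cProfile; return its result and the text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args, **kwargs)
    profiler.disable()
    stream = io.StringIO()
    # "stdname" gives the "Ordered by: standard name" listing seen above.
    pstats.Stats(profiler, stream=stream).sort_stats("stdname").print_stats()
    return result, stream.getvalue()


def slow_count(n):
    """Hypothetical stand-in workload: count the even numbers below n."""
    total = 0
    for i in range(n):
        if i % 2 == 0:
            total += 1
    return total


result, report = profile_call(slow_count, 1000)
print(report)

# Speedup implied by the two cumulative times reported in the profiles above.
speedup = 0.050 / 0.004
```

Both profiles are single runs with millisecond-resolution timings, so repeating the measurement over several steps would give a more reliable ratio.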
The current calculator that calculates the predictive cells is just a numpy implementation. This calculator's updatePredictiveState function uses about half of the total calculation time for an HTM step.
See the profile below.