djanloo / quilt

A multiscale neural simulator
MIT License

Losing performance somewhere #23

Closed (djanloo closed 1 month ago)

djanloo commented 1 month ago

This is the old performance on the basal ganglia benchmark:

Building connections.. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:04
Running network consisting of 14622 neurons for 6000 timesteps
--------------------------------------------------
**************************************************
Simulation took 8 s (1.39433 ms/step)
    Gathering time avg: 0 us/step
    Inject time avg: 147.633 us/step
Population evolution stats:
    0:
        evolution:  380.659 us/step --- 63 ns/step/neuron
        spike emission: 34.1322 us/step --- 5 ns/step/neuron
    1:
        evolution:  373.055 us/step --- 62 ns/step/neuron
        spike emission: 32.8957 us/step --- 5 ns/step/neuron
    2:
        evolution:  70.5442 us/step --- 167 ns/step/neuron
        spike emission: 9.53733 us/step --- 22 ns/step/neuron
    3:
        evolution:  97.5183 us/step --- 125 ns/step/neuron
        spike emission: 17.2243 us/step --- 22 ns/step/neuron
    4:
        evolution:  53.9407 us/step --- 207 ns/step/neuron
        spike emission: 2.33867 us/step --- 8 ns/step/neuron
    5:
        evolution:  62.5945 us/step --- 153 ns/step/neuron
        spike emission: 3.75417 us/step --- 9 ns/step/neuron
    6:
        evolution:  96.7672 us/step --- 128 ns/step/neuron
        spike emission: 3.33033 us/step --- 4 ns/step/neuron

And this is the performance on the same benchmark now:

[2024-07-03 18:51:31] - PID 124110237631616 - INFO: Evolving spiking network from t= 0 to t= 600
Running network consisting of 14622 neurons for 6000 timesteps
--------------------------------------------------
**************************************************
[2024-07-03 18:51:45] - PID 124110237631616 - INFO: Output for PerformanceManager <spiking network>
        --inject        1.0 s   --  -- 237.0 us /step for 6000 steps --
        --monitorize    374.0 us        --  -- 62.0 ns /step for 6000 steps --
        --simulation    14.0 s

[2024-07-03 18:51:45] - PID 124110237631616 - INFO: Output for PerformanceManager <Population 0>
        --evolution     4.0 s   --  -- 697.0 us /step for 6000 steps -- -- 116.0 ns /step/unit for 6000 steps and 6000 units
        --spike_emission        339.0 ms        --  -- 56.0 us /step for 6000 steps --  -- 9.0 ns /step/unit for 6000 steps and 6000 units

[2024-07-03 18:51:45] - PID 124110237631616 - INFO: Output for PerformanceManager <Population 1>
        --evolution     3.0 s   --  -- 659.0 us /step for 6000 steps -- -- 109.0 ns /step/unit for 6000 steps and 6000 units
        --spike_emission        355.0 ms        --  -- 59.0 us /step for 6000 steps --  -- 9.0 ns /step/unit for 6000 steps and 6000 units

[2024-07-03 18:51:45] - PID 124110237631616 - INFO: Output for PerformanceManager <Population 2>
        --evolution     714.0 ms        --  -- 119.0 us /step for 6000 steps -- -- 283.0 ns /step/unit for 6000 steps and 420 units
        --spike_emission        95.0 ms --  -- 15.0 us /step for 6000 steps --  -- 37.0 ns /step/unit for 6000 steps and 420 units

[2024-07-03 18:51:45] - PID 124110237631616 - INFO: Output for PerformanceManager <Population 3>
        --evolution     942.0 ms        --  -- 157.0 us /step for 6000 steps -- -- 201.0 ns /step/unit for 6000 steps and 780 units
        --spike_emission        173.0 ms        --  -- 28.0 us /step for 6000 steps --  -- 37.0 ns /step/unit for 6000 steps and 780 units

[2024-07-03 18:51:45] - PID 124110237631616 - INFO: Output for PerformanceManager <Population 4>
        --evolution     468.0 ms        --  -- 78.0 us /step for 6000 steps --  -- 300.0 ns /step/unit for 6000 steps and 260 units
        --spike_emission        25.0 ms --  -- 4.0 us /step for 6000 steps --   -- 16.0 ns /step/unit for 6000 steps and 260 units

[2024-07-03 18:51:45] - PID 124110237631616 - INFO: Output for PerformanceManager <Population 5>
        --evolution     593.0 ms        --  -- 98.0 us /step for 6000 steps --  -- 242.0 ns /step/unit for 6000 steps and 408 units
        --spike_emission        35.0 ms --  -- 5.0 us /step for 6000 steps --   -- 14.0 ns /step/unit for 6000 steps and 408 units

[2024-07-03 18:51:45] - PID 124110237631616 - INFO: Output for PerformanceManager <Population 6>
        --evolution     959.0 ms        --  -- 159.0 us /step for 6000 steps -- -- 212.0 ns /step/unit for 6000 steps and 754 units
        --spike_emission        27.0 ms --  -- 4.0 us /step for 6000 steps --   -- 6.0 ns /step/unit for 6000 steps and 754 units

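...almost everything nearly doubled: the whole simulation went from 8 s to 14 s, and the per-population evolution times grew by a similar factor. Dispiriting.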
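To put a number on the slowdown, the per-population evolution times from the two logs above can be compared directly. This is only a sketch: the values are copied by hand from the logs, and the names `old_us`, `new_us`, and `slowdown` are illustrative, not part of quilt.

```python
# Evolution time per step (us/step) for populations 0..6,
# copied from the two logs above (old run first, current run second).
old_us = [380.659, 373.055, 70.5442, 97.5183, 53.9407, 62.5945, 96.7672]
new_us = [697.0, 659.0, 119.0, 157.0, 78.0, 98.0, 159.0]


def slowdown(old, new):
    """Return the new/old ratio for each pair of timings."""
    return [n / o for o, n in zip(old, new)]


ratios = slowdown(old_us, new_us)
for pop, ratio in enumerate(ratios):
    print(f"population {pop}: {ratio:.2f}x slower")

# Whole-simulation wall time: 8 s in the old log, 14 s in the current one.
total_ratio = 14 / 8
print(f"whole simulation: {total_ratio:.2f}x slower")
```

The ratios come out between roughly 1.4x and 1.8x, so the slowdown is spread fairly evenly across populations rather than concentrated in one of them.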

djanloo commented 1 month ago

Good news: it does not depend that much on the code.

This is a run of the old code (https://github.com/djanloo/quilt/commit/f80dc35b0a60015c6eb708e04ac2dcdfcdce8242) before the development of the multiscale stuff:

Running network consisting of 14622 neurons for 6000 timesteps
--------------------------------------------------
**************************************************
Simulation took 14 s    (2.3965 ms/step)
        Gathering time avg: 0.0111667 us/step
        Inject time avg: 220.024 us/step
Population evolution stats:
        0:
                evolution:      701.335 us/step ---     116 ns/step/neuron
                spike emission: 52.0272 us/step ---     8 ns/step/neuron
        1:
                evolution:      678.471 us/step ---     113 ns/step/neuron
                spike emission: 57.8057 us/step ---     9 ns/step/neuron
        2:
                evolution:      118.561 us/step ---     282 ns/step/neuron
                spike emission: 15.2028 us/step ---     36 ns/step/neuron
        3:
                evolution:      160.356 us/step ---     205 ns/step/neuron
                spike emission: 27.9058 us/step ---     35 ns/step/neuron
        4:
                evolution:      80.1023 us/step ---     308 ns/step/neuron
                spike emission: 3.85283 us/step ---     14 ns/step/neuron
        5:
                evolution:      99.9843 us/step ---     245 ns/step/neuron
                spike emission: 5.41767 us/step ---     13 ns/step/neuron
        6:
                evolution:      161.416 us/step ---     214 ns/step/neuron
                spike emission: 4.68317 us/step ---     6 ns/step/neuron

Bad news: my ENTIRE machine became slower in six months.
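A quick way to check the "it does not depend on the code" claim is to compare the old commit, rerun today, against the current code on the same machine. The sketch below uses the evolution times copied from the two logs; `old_code_now_us`, `new_code_now_us`, and `relative_diff` are hypothetical names chosen for this example.

```python
# Evolution time per step (us/step) for populations 0..6.
# old_code_now_us: commit f80dc35 rerun today (log just above).
# new_code_now_us: current code (PerformanceManager log in the first comment).
old_code_now_us = [701.335, 678.471, 118.561, 160.356, 80.1023, 99.9843, 161.416]
new_code_now_us = [697.0, 659.0, 119.0, 157.0, 78.0, 98.0, 159.0]


def relative_diff(reference, value):
    """Signed relative difference of value with respect to reference."""
    return (value - reference) / reference


diffs = [relative_diff(o, n) for o, n in zip(old_code_now_us, new_code_now_us)]
for pop, d in enumerate(diffs):
    print(f"population {pop}: {d:+.1%}")
```

Every population differs by less than 3% between the two code versions, which is consistent with the slowdown coming from the machine rather than from the multiscale changes.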

djanloo commented 1 month ago

After solving #28, quilt has become a shared library. This is the new performance:

[2024-07-08 16:34:47] - PID 140292477301888 - INFO: Ouput for PerformanceRegistrar (8 managers )
[2024-07-08 16:34:47] - PID 140292477301888 - INFO: Output for PerformanceManager <spiking network>
_________inject _____4.0 s | 774.0 us/step for 6000 steps
_____monitorize ____7.0 ms | __1.0 us/step for 6000 steps
_____simulation ___127.0 s

[2024-07-08 16:34:47] - PID 140292477301888 - INFO: Output for PerformanceManager <Population 0>
______evolution ____45.0 s | __7.0 ms/step for 6000 steps | __1.0 us/step/unit for 6000 steps and 6000 units
_spike_emission _____1.0 s | 193.0 us/step for 6000 steps | _32.0 ns/step/unit for 6000 steps and 6000 units

[2024-07-08 16:34:47] - PID 140292477301888 - INFO: Output for PerformanceManager <Population 1>
______evolution ____44.0 s | __7.0 ms/step for 6000 steps | __1.0 us/step/unit for 6000 steps and 6000 units
_spike_emission _____1.0 s | 236.0 us/step for 6000 steps | _39.0 ns/step/unit for 6000 steps and 6000 units

[2024-07-08 16:34:47] - PID 140292477301888 - INFO: Output for PerformanceManager <Population 2>
______evolution _____5.0 s | 837.0 us/step for 6000 steps | __1.0 us/step/unit for 6000 steps and 420 units
_spike_emission __397.0 ms | _66.0 us/step for 6000 steps | 157.0 ns/step/unit for 6000 steps and 420 units

[2024-07-08 16:34:47] - PID 140292477301888 - INFO: Output for PerformanceManager <Population 3>
______evolution _____7.0 s | __1.0 ms/step for 6000 steps | __1.0 us/step/unit for 6000 steps and 780 units
_spike_emission __685.0 ms | 114.0 us/step for 6000 steps | 146.0 ns/step/unit for 6000 steps and 780 units

[2024-07-08 16:34:47] - PID 140292477301888 - INFO: Output for PerformanceManager <Population 4>
______evolution _____4.0 s | 730.0 us/step for 6000 steps | __2.0 us/step/unit for 6000 steps and 260 units
_spike_emission __101.0 ms | _16.0 us/step for 6000 steps | _65.0 ns/step/unit for 6000 steps and 260 units

[2024-07-08 16:34:47] - PID 140292477301888 - INFO: Output for PerformanceManager <Population 5>
______evolution _____4.0 s | 799.0 us/step for 6000 steps | __1.0 us/step/unit for 6000 steps and 408 units
_spike_emission __137.0 ms | _22.0 us/step for 6000 steps | _56.0 ns/step/unit for 6000 steps and 408 units

[2024-07-08 16:34:47] - PID 140292477301888 - INFO: Output for PerformanceManager <Population 6>
______evolution _____7.0 s | __1.0 ms/step for 6000 steps | __1.0 us/step/unit for 6000 steps and 754 units
_spike_emission ___94.0 ms | _15.0 us/step for 6000 steps | _20.0 ns/step/unit for 6000 steps and 754 units

That is about 16 times the original simulation time (127 s versus 8 s). This is unacceptable.

djanloo commented 1 month ago

After commit https://github.com/djanloo/quilt/commit/fff7797dfbe958ff9c5ba070604c96ba2500fa9b :

Running network consisting of 14622 neurons for 6000 timesteps
--------------------------------------------------
**************************************************
[2024-07-08 17:02:24] - PID 140529642497152 - INFO: Ouput for PerformanceRegistrar (8 managers )
[2024-07-08 17:02:24] - PID 140529642497152 - INFO: Output for PerformanceManager <spiking network>
_________inject __895.0 ms | 149.0 us/step for 6000 steps
_____monitorize ___45.0 us | __7.0 ns/step for 6000 steps
_____simulation _____9.0 s

[2024-07-08 17:02:24] - PID 140529642497152 - INFO: Output for PerformanceManager <Population 0>
______evolution _____2.0 s | 420.0 us/step for 6000 steps | _70.0 ns/step/unit for 6000 steps and 6000 units
_spike_emission __240.0 ms | _40.0 us/step for 6000 steps | __6.0 ns/step/unit for 6000 steps and 6000 units

[2024-07-08 17:02:24] - PID 140529642497152 - INFO: Output for PerformanceManager <Population 1>
______evolution _____2.0 s | 408.0 us/step for 6000 steps | _68.0 ns/step/unit for 6000 steps and 6000 units
_spike_emission __257.0 ms | _42.0 us/step for 6000 steps | __7.0 ns/step/unit for 6000 steps and 6000 units

[2024-07-08 17:02:24] - PID 140529642497152 - INFO: Output for PerformanceManager <Population 2>
______evolution __446.0 ms | _74.0 us/step for 6000 steps | 177.0 ns/step/unit for 6000 steps and 420 units
_spike_emission ___61.0 ms | _10.0 us/step for 6000 steps | _24.0 ns/step/unit for 6000 steps and 420 units

[2024-07-08 17:02:24] - PID 140529642497152 - INFO: Output for PerformanceManager <Population 3>
______evolution __608.0 ms | 101.0 us/step for 6000 steps | 130.0 ns/step/unit for 6000 steps and 780 units
_spike_emission __123.0 ms | _20.0 us/step for 6000 steps | _26.0 ns/step/unit for 6000 steps and 780 units

[2024-07-08 17:02:24] - PID 140529642497152 - INFO: Output for PerformanceManager <Population 4>
______evolution __323.0 ms | _53.0 us/step for 6000 steps | 207.0 ns/step/unit for 6000 steps and 260 units
_spike_emission ___16.0 ms | __2.0 us/step for 6000 steps | _10.0 ns/step/unit for 6000 steps and 260 units

[2024-07-08 17:02:24] - PID 140529642497152 - INFO: Output for PerformanceManager <Population 5>
______evolution __386.0 ms | _64.0 us/step for 6000 steps | 157.0 ns/step/unit for 6000 steps and 408 units
_spike_emission ___25.0 ms | __4.0 us/step for 6000 steps | _10.0 ns/step/unit for 6000 steps and 408 units

[2024-07-08 17:02:24] - PID 140529642497152 - INFO: Output for PerformanceManager <Population 6>
______evolution __611.0 ms | 101.0 us/step for 6000 steps | 135.0 ns/step/unit for 6000 steps and 754 units
_spike_emission ___23.0 ms | __3.0 us/step for 6000 steps | __5.0 ns/step/unit for 6000 steps and 754 units

[2024-07-08 17:02:24] - PID 140529642497152 - INFO: Destroyed PerformanceRegistrar at index: 0x7fcf6e6f4cb0

So we are back to a good 70 ns/neuron/step (at least for the striatum) :smirk:
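For reference, the per-neuron figure is just the per-step time divided by the population size. A small sketch of that arithmetic, assuming populations 0 and 1 (6000 units each) are the ones the 70 ns figure refers to; the helper name `ns_per_step_per_unit` is made up for this example.

```python
def ns_per_step_per_unit(us_per_step, units):
    """Convert a per-step time in microseconds to ns per step per unit."""
    return us_per_step * 1000.0 / units


# Populations 0 and 1 after commit fff7797 (log just above).
after_fix = [ns_per_step_per_unit(420.0, 6000), ns_per_step_per_unit(408.0, 6000)]

# Same populations in the original benchmark (first log of this issue).
original = [ns_per_step_per_unit(380.659, 6000), ns_per_step_per_unit(373.055, 6000)]

for pop, (now, then) in enumerate(zip(after_fix, original)):
    print(f"population {pop}: {now:.1f} ns now vs {then:.1f} ns originally")
```

With these numbers the two large populations sit at about 70 and 68 ns/step/unit, within roughly 10% of the original 63 and 62 ns.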

djanloo commented 1 month ago

This is the performance after the discussion in #9:

Running network consisting of 14622 neurons for 6000 timesteps
--------------------------------------------------
**************************************************
[2024-07-13 16:02:38] - PID 127952158987392 - INFO: Ouput for PerformanceRegistrar (8 managers )
[2024-07-13 16:02:38] - PID 127952158987392 - INFO: Output for PerformanceManager <spiking network>
_________inject __812.4 ms | 135.4 us/step for 6000 steps
_____monitorize __448.0 us | _74.0 ns/step for 6000 steps
_____simulation _____6.8 s

[2024-07-13 16:02:38] - PID 127952158987392 - INFO: Output for PerformanceManager <Population 0>
______evolution _____1.6 s | 264.4 us/step for 6000 steps | _44.0 ns/step/unit for 6000 steps and 6000 units
_spike_emission __176.7 ms | _29.5 us/step for 6000 steps | __4.0 ns/step/unit for 6000 steps and 6000 units
_spike_handling __274.7 ms | _45.8 us/step for 6000 steps | __7.0 ns/step/unit for 6000 steps and 6000 units

[2024-07-13 16:02:38] - PID 127952158987392 - INFO: Output for PerformanceManager <Population 1>
______evolution _____1.5 s | 256.3 us/step for 6000 steps | _42.0 ns/step/unit for 6000 steps and 6000 units
_spike_emission __195.9 ms | _32.6 us/step for 6000 steps | __5.0 ns/step/unit for 6000 steps and 6000 units
_spike_handling __265.3 ms | _44.2 us/step for 6000 steps | __7.0 ns/step/unit for 6000 steps and 6000 units

[2024-07-13 16:02:38] - PID 127952158987392 - INFO: Output for PerformanceManager <Population 2>
______evolution __223.1 ms | _37.2 us/step for 6000 steps | _88.0 ns/step/unit for 6000 steps and 420 units
_spike_emission ___56.4 ms | __9.4 us/step for 6000 steps | _22.0 ns/step/unit for 6000 steps and 420 units
_spike_handling ___79.5 ms | _13.3 us/step for 6000 steps | _31.0 ns/step/unit for 6000 steps and 420 units

[2024-07-13 16:02:38] - PID 127952158987392 - INFO: Output for PerformanceManager <Population 3>
______evolution __319.3 ms | _53.2 us/step for 6000 steps | _68.0 ns/step/unit for 6000 steps and 780 units
_spike_emission __104.0 ms | _17.3 us/step for 6000 steps | _22.0 ns/step/unit for 6000 steps and 780 units
_spike_handling ___93.0 ms | _15.5 us/step for 6000 steps | _19.0 ns/step/unit for 6000 steps and 780 units

[2024-07-13 16:02:38] - PID 127952158987392 - INFO: Output for PerformanceManager <Population 4>
______evolution __148.7 ms | _24.8 us/step for 6000 steps | _95.0 ns/step/unit for 6000 steps and 260 units
_spike_emission ___11.3 ms | __1.9 us/step for 6000 steps | __7.0 ns/step/unit for 6000 steps and 260 units
_spike_handling ___75.8 ms | _12.6 us/step for 6000 steps | _48.0 ns/step/unit for 6000 steps and 260 units

[2024-07-13 16:02:38] - PID 127952158987392 - INFO: Output for PerformanceManager <Population 5>
______evolution __196.9 ms | _32.8 us/step for 6000 steps | _80.0 ns/step/unit for 6000 steps and 408 units
_spike_emission ___18.2 ms | __3.0 us/step for 6000 steps | __7.0 ns/step/unit for 6000 steps and 408 units
_spike_handling ___77.8 ms | _13.0 us/step for 6000 steps | _31.0 ns/step/unit for 6000 steps and 408 units

[2024-07-13 16:02:38] - PID 127952158987392 - INFO: Output for PerformanceManager <Population 6>
______evolution __311.8 ms | _52.0 us/step for 6000 steps | _68.0 ns/step/unit for 6000 steps and 754 units
_spike_emission ___16.0 ms | __2.7 us/step for 6000 steps | __3.0 ns/step/unit for 6000 steps and 754 units
_spike_handling __114.0 ms | _19.0 us/step for 6000 steps | _25.0 ns/step/unit for 6000 steps and 754 units

So after: