Hjorthmedh opened 1 week ago
@adamjhn or @ramcdougal: do you have a suggestion for this issue?
Looking into this...

First two insights:

`nodes` is a slight red herring here, as there's nothing special about the 0th species... it's just the first one that happens to get called. Every call after the first should be proportional to the number of nodes in that cell (not in the model as a whole)... these run in about 0.16 ms on my machine (with c91662 in the slightly modified version below). The problem is that the initial call sets up the data structures (this is the job of `_update_node_data`) for the entire portion of the simulation in the current process (which has to happen sometime), and that's the part that's O(number of nodes in the process)...

Edited to give more details: `finitialize` runs in about 12 seconds on my machine, which isn't great but isn't overly egregious. The problem is the massive number of calls to the destructor that remove one section at a time and reallocate memory after the function ends. Addendum edit: note that part of the issue is that our simulation is wrapped in the function `minimal_example`, so there's no way for NEURON to know we're done and aren't deleting things bit by bit. If the simulation were at the top level, the `atexit` routines would be called, which provide a much faster shutdown.

Quite frankly, this is probably the first time anyone has tried to run this with 2000 species.
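The function-scope vs. top-level distinction can be illustrated with plain Python (a sketch of the mechanism only, not NEURON code; the `Section` class here is a hypothetical stand-in for a NEURON section):

```python
import atexit


class Section:
    """Stand-in for a NEURON section whose destructor does real work."""
    destroyed = 0

    def __del__(self):
        # In NEURON, each per-section destructor call can trigger a
        # memory reallocation, which is what makes teardown slow.
        Section.destroyed += 1


def wrapped():
    # Objects local to a function are destroyed one at a time as soon
    # as the function returns -- before any atexit handler runs.
    local_sections = [Section() for _ in range(1000)]


wrapped()
print(Section.destroyed)  # in CPython the 1000 locals are already torn down

# Module-level objects survive until interpreter shutdown, where a
# registered atexit routine can perform one fast bulk teardown instead.
top_level_sections = [Section() for _ in range(1000)]
atexit.register(lambda: print("atexit: bulk teardown"))
```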
Here's a version reproducing the problem that doesn't require bluepyopt:
```python
from neuron import h, rxd
import time

# I'm using c91662 from
# https://raw.githubusercontent.com/NeuroBox3D/NeuGen/master/NeuGen/cellData/CA1/amaral/c91662.CNG.swc
h.load_file("import3d.hoc")


class Cell:
    def __init__(self):
        cell = h.Import3d_SWC_read()
        cell.input("c91662.swc")
        i3d = h.Import3d_GUI(cell, False)
        i3d.instantiate(self)


def minimal_example(NUM_MORPHS=10, SPECIES_PER_CELL=200):
    species_list = []
    region_list = []
    cell_list = [Cell() for _ in range(NUM_MORPHS)]
    for cell in cell_list:
        print("Creating regions", flush=True)
        region = rxd.Region(cell.dend, nrn_region="i")
        region_list.append(region)
        print("Creating species", flush=True)
        for idx in range(SPECIES_PER_CELL):
            species_name = f"species{idx}"
            spec = rxd.Species(region,
                               d=0,
                               initial=1,
                               charge=0,
                               name=species_name)
            species_list.append(spec)
    duration = []
    for idx, spec in enumerate(species_list):
        # This step is slow
        if idx == 0:
            print(f"Calling nodes on {spec} -- This is slow!", flush=True)
        else:
            print(f"Calling nodes on {spec}", flush=True)
        start_time = time.perf_counter()
        spec.nodes
        end_time = time.perf_counter()
        dur = end_time - start_time
        duration.append(dur)
        print(f"nodes call done {dur}")
    print(f"Max duration: {max(duration)} for {NUM_MORPHS} neurons")
    print("Init")
    start_time = time.perf_counter()
    h.finitialize(-65)
    end_time = time.perf_counter()
    print(f"Initialization time: {end_time - start_time} seconds")


if __name__ == "__main__":
    minimal_example()
```
Context

The call `myspecies.nodes`, which gets all compartments that have `myspecies`, is very slow. The call `h.finitialize()` is also very slow.

Overview of the issue

In our minimal example it takes 11 seconds for a single call with one neuron, and 110 seconds for a single call in a network of 10 neurons. This appears to scale linearly with the number of neurons, even though we are only requesting the list of compartments for a single neuron. `h.finitialize()` is also incredibly slow.

We expected the function call to be much faster and to be independent of the number of neurons. This is especially important since we want to run large-scale networks of neurons (10,000+).
NEURON setup
Minimal working example - MWE
This example uses the morphologies in https://github.com/Hjorthmedh/BasalGangliaData/tree/main/data/neurons/striatum/dspn (the code uses glob to extract the swc files via `morphology/*.swc`).
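A minimal sketch of that glob-based collection step (the `morphology` directory layout here is an assumption based on the description above; the sketch fabricates a temporary directory so it runs standalone):

```python
import glob
import os
import tempfile

# Hypothetical layout: a morphology/ directory holding the dspn .swc
# files. We create it in a temp directory so the sketch is runnable.
with tempfile.TemporaryDirectory() as tmp:
    morph_dir = os.path.join(tmp, "morphology")
    os.makedirs(morph_dir)
    for name in ("str-dspn-a.swc", "str-dspn-b.swc", "README.txt"):
        open(os.path.join(morph_dir, name), "w").close()

    # Only the .swc morphologies match; other files are ignored.
    swc_files = sorted(glob.glob(os.path.join(morph_dir, "*.swc")))
    print([os.path.basename(p) for p in swc_files])
    # → ['str-dspn-a.swc', 'str-dspn-b.swc']
```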
Logs

The init call takes AGES. Included below is the output from the python profiler.

This function call is done excessively, taking up 86.7% of the total run time! (Mostly during initialize?)

`_update_node_data` is also run excessively, especially during the `myspecies.nodes` call.
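For reference, profiles like the one described above can be produced with Python's built-in `cProfile`/`pstats`; a minimal sketch, using a stand-in workload rather than the NEURON example:

```python
import cProfile
import io
import pstats


def workload():
    # Stand-in for minimal_example(); substitute the real call when profiling.
    return sum(i * i for i in range(100_000))


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Sort by cumulative time and print the top entries, similar to the
# profiler report referenced above.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
print(stream.getvalue())
```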