Open Helveg opened 2 years ago
Additionally: can we expect a publication on MDF that among other things properly investigates scaling and runtime complexity of code that has to deal with it and read/write it?
@Helveg As mentioned in #191, the types of network you are referring to here are more in the domain of NeuroML. Eventually there will be full compatibility "under the hood" between MDF and NML, but for now the issue of standardising formats for large scale spiking models is more relevant for NML, and getting NeuroMLlite working well with Arbor, Neuron, Nest, etc. is a more near term goal. Hope that helps.
Hi there! I have some questions about the scalability of MDF. Connectomes between specific cells can usually be stored as some sort of sparse matrix holding the from/to identifiers of the pre- and postsynaptic cells, and the from/to locations on each cell, but this leads to scalability issues when transferring that data to the simulator: each of the parallel nodes has to iterate over all synapses, `O(N_syn)`, just to filter out the `1/N`-th of the dataset that is relevant to it. This `O(N_syn)` iteration time also assumes you can store the iterated data in a data structure with `O(1)` lookups. Without `O(1)` lookup, since you need to query the connections of each cell on your node, you're looking at `O(N_syn * (N_cells / nodes))` runtimes to look up the connections, and synapses are the most numerous element of a biophysical neural network. On top of that, most classical `O(1)`-lookup data structures, like a hashtable, have large memory requirements: storing all your data in memory like that on each node is going to limit your scale by memory requirements. NEURON already hits memory limits on HPC with ring networks of 16k cells on 64 GB RAM compute nodes (see page 6 of https://arxiv.org/pdf/1901.07454.pdf); imagine having to construct networks with the whole connectome stored in memory on each node, or facing superlinear runtime for network construction.
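To make the two complexity regimes above concrete, here is a minimal sketch (names like `Synapse`, `filter_with_set`, and `filter_with_scan` are hypothetical, not part of MDF or any simulator API) contrasting the `O(N_syn)` filter backed by an `O(1)`-lookup hash set against the `O(N_syn * N_cells/nodes)` nested scan a node would face without one:

```python
# Hypothetical sketch, not MDF code: each parallel node filters the global
# synapse list down to the synapses targeting its locally owned cells.
from typing import NamedTuple


class Synapse(NamedTuple):
    pre: int   # presynaptic cell id
    post: int  # postsynaptic cell id


def filter_with_set(synapses, local_cells):
    """One O(N_syn) pass; O(1) average membership test via a hash set.

    The hash set costs O(N_cells / nodes) extra memory per node, which is
    the memory/runtime trade-off discussed above.
    """
    local = set(local_cells)
    return [s for s in synapses if s.post in local]


def filter_with_scan(synapses, local_cells):
    """No O(1) lookup: every synapse is checked against every local cell,
    giving O(N_syn * N_cells / nodes) work per node."""
    return [s for s in synapses if any(s.post == c for c in local_cells)]


synapses = [Synapse(0, 1), Synapse(1, 2), Synapse(2, 0), Synapse(0, 2)]
print(filter_with_set(synapses, [2]))   # synapses onto locally owned cell 2
```

Both functions return the same `1/N`-th slice of the connectome; the difference is only whether each node pays memory for the hash set or pays the extra factor of `N_cells / nodes` in runtime.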