Hey! I thought it's about time to do some GPU experiments. I have a working prototype for that in `GPU_network.jl`.
Since our whole GraphStructure/GraphData business is really complicated to bring into GPU memory, I went for a quite different approach and implemented a toy model first. The main idea is to split the network into homogeneous parts (as with the old network layer stuff or, more recently, in #91). Once we have homogeneous subsystems, maybe we don't need a lot of the complex structure anymore.
That's why I decided to build a minimal working prototype first, which only works for homogeneous systems. This mini example supports both CPU and GPU. I compared all calculations against NetworkDynamics. Here are some benchmarks for the core loop:
And some benchmarks for a solver run with `Tsit5()` (around 260 time steps).
As of now the prototype can't handle autodiff, that's why I've chosen the explicit solver...
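
To make the homogeneous-subsystem idea above a bit more concrete, here is a rough CPU-only sketch. It is not the code in `GPU_network.jl`: the names `edgef`, `vertexf` and `coreloop!`, the three-vertex ring, and the diffusive coupling are all made up for illustration. The point is only that once every edge shares one function and every vertex shares another, the core loop reduces to a couple of broadcasts over flat arrays plus one aggregation step.

```julia
# Hypothetical toy network: a directed ring with 3 vertices and 3 edges.
src = [1, 2, 3]   # source vertex index of each edge
dst = [2, 3, 1]   # destination vertex index of each edge

# Assumed edge function, identical for every edge (diffusive coupling).
edgef(x_src, x_dst) = x_src - x_dst

# Assumed vertex function, identical for every vertex:
# linear decay plus the aggregated incoming edge values.
vertexf(x, agg) = -x + agg

function coreloop!(dx, x, e, src, dst)
    # 1. Evaluate all edges in one broadcast over the gathered vertex states.
    e .= edgef.(x[src], x[dst])

    # 2. Sum the edge values at their destination vertices.
    #    This scalar loop is fine on the CPU; on a GPU array it would need
    #    to be replaced by a GPU-friendly scatter/reduction.
    fill!(dx, zero(eltype(dx)))
    for i in eachindex(e)
        dx[dst[i]] += e[i]
    end

    # 3. Apply the shared vertex dynamics to all vertices at once.
    dx .= vertexf.(x, dx)
    return dx
end

x  = [1.0, 2.0, 4.0]   # vertex states
dx = similar(x)        # output buffer for the derivatives
e  = zeros(length(src)) # buffer for the edge values
coreloop!(dx, x, e, src, dst)
```

Because steps 1 and 3 are plain broadcasts, they should carry over to GPU arrays more or less unchanged; the aggregation in step 2 is the part that needs real thought there.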