Closed: rcoreilly closed this 1 year ago
I will do this as part of the sender PR, just a reminder to do it.
The issue was a write conflict on the recv `GBuf` -- but it was easy enough to add an atomic add, just like on the GPU -- it works well, and `SendSpike` now benefits from threading.
`SendSpike` also calls `PostSpike` at the neuron level, which would otherwise require a separate call, so it is probably not worth splitting out at this point.
It is currently threaded by layer, which doesn't make sense.
Meanwhile, I keep having the same confusion, thinking that `SendPrjns` (the `Prjns` array) is more fine-grained than `Neurons`, but I'm actually thinking about `SendCon`, which is Neurons * sending prjns. As of now, we cannot iterate over `SendCon` on the GPU, because we would need to store the sending neuron index and the prjn index in a slice the same size as `SendCon`, so we can recover those from a `SendCon` index.
This is easily doable and not very big -- will give it a try after the first pass on GPU sender-based.