Closed rcoreilly closed 1 year ago
Here's the list of parameter renames -- these have been applied to all example models:
Layer.Act. -> Layer.Acts.
Layer.Acts.GABAB. -> Layer.Acts.GabaB.
Layer.Acts.Spike. -> Layer.Acts.Spikes.
Layer.Acts.Attn. -> Layer.Acts.AttnMod.
Layer.Learn.CaLrn. -> Layer.Learn.CaLearn.
Also, this will be version 1.8.0 due to incompatibilities.
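For anyone updating their own parameter sets, the renames listed above could be applied mechanically. This is only a rough sketch, not part of the PR: `MigrateParamPath` and the `renames` table are hypothetical names, and the `Gbar` field in the usage line is just an illustrative path. It assumes parameter paths are plain dotted strings and that the rules are applied in the order listed, since the later rules are written against the already-renamed `Layer.Acts.` prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// renames holds the old -> new path prefixes, in the order given in the
// comment above. Order matters: the first rule produces the "Layer.Acts."
// prefix that the following rules match against.
var renames = [][2]string{
	{"Layer.Act.", "Layer.Acts."},
	{"Layer.Acts.GABAB.", "Layer.Acts.GabaB."},
	{"Layer.Acts.Spike.", "Layer.Acts.Spikes."},
	{"Layer.Acts.Attn.", "Layer.Acts.AttnMod."},
	{"Layer.Learn.CaLrn.", "Layer.Learn.CaLearn."},
}

// MigrateParamPath (hypothetical helper) rewrites a pre-1.8.0 parameter
// path to its new name. Paths that match no rule are returned unchanged,
// so running it on an already-migrated path is a no-op.
func MigrateParamPath(path string) string {
	for _, r := range renames {
		if strings.HasPrefix(path, r[0]) {
			path = r[1] + strings.TrimPrefix(path, r[0])
		}
	}
	return path
}

func main() {
	// "Gbar" is only an illustrative field name here.
	fmt.Println(MigrateParamPath("Layer.Act.GABAB.Gbar")) // Layer.Acts.GabaB.Gbar
}
```

Because every old prefix ends in a dot, a new name such as `Layer.Acts.Spikes.` never matches the old `Layer.Acts.Spike.` rule, which is what makes the helper safe to run twice.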
Patch coverage: 41.32% and project coverage change: -3.31% :warning:
Comparison is base (29dc4e4) 36.03% compared to head (64a9c4e) 32.73%.
This implements the plan in #195 -- preliminary tests show 4.5x GPU speedup on A100 relative to previous memory layout -- finally matching Mac M1 performance.
Also includes full data parallel support so the GPU (and CPU) can use the same weights to process N input patterns at a time in parallel, which should enable significant GPU speedup even on smaller models. This part still needs significantly more testing, updated examples that use it, and the global memory implementation to make it work with PVLV / BOA stuff.