Flexible memory layout for GPU

rcoreilly commented 1 year ago

This implements the plan in #195. Preliminary tests show a 4.5x GPU speedup on an A100 relative to the previous memory layout, finally matching Mac M1 performance.

Also includes full data-parallel support, so the GPU (and CPU) can use the same weights to process N input patterns at a time in parallel, which should enable significant GPU speedup even on smaller models. This part still needs significantly more testing, updated examples that use it, and the global memory implementation to make it work with the PVLV / BOA code.
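To make the data-parallel idea concrete, here is a minimal CPU-only sketch, not the code in this PR: one read-only weight matrix is shared by N goroutines, each processing its own input pattern and writing to its own output slot. The names `Forward` and `ForwardDataParallel` and the plain `[][]float32` weight layout are assumptions made up for illustration; they are not axon APIs.

```go
package main

import (
	"fmt"
	"sync"
)

// Forward computes out[j] = sum_i weights[j][i] * in[i] for one input pattern.
// Hypothetical helper: weights are only read, so sharing them is safe.
func Forward(weights [][]float32, in []float32) []float32 {
	out := make([]float32, len(weights))
	for j, row := range weights {
		var sum float32
		for i, w := range row {
			sum += w * in[i]
		}
		out[j] = sum
	}
	return out
}

// ForwardDataParallel runs Forward on every pattern concurrently.
// All goroutines share the same weights; each writes only to its own
// index of outs, so no locking is needed beyond the WaitGroup.
func ForwardDataParallel(weights [][]float32, patterns [][]float32) [][]float32 {
	outs := make([][]float32, len(patterns))
	var wg sync.WaitGroup
	for di := range patterns {
		wg.Add(1)
		go func(di int) {
			defer wg.Done()
			outs[di] = Forward(weights, patterns[di])
		}(di)
	}
	wg.Wait()
	return outs
}

func main() {
	weights := [][]float32{{1, 2}, {3, 4}}          // shared by all patterns
	patterns := [][]float32{{1, 0}, {0, 1}, {1, 1}} // N = 3 input patterns
	fmt.Println(ForwardDataParallel(weights, patterns))
}
```

Because per-pattern state (here, just the output slice) is separate while the weights are shared, the same structure should map onto a GPU dispatch with an extra data index, though how this PR actually lays that out in memory is not shown here.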

rcoreilly commented 1 year ago

Here's the list of parameter renames, which have been applied to all example models:

Layer.Act. -> Layer.Acts.
Layer.Acts.GABAB. -> Layer.Acts.GabaB.
Layer.Acts.Spike. -> Layer.Acts.Spikes.
Layer.Acts.Attn. -> Layer.Acts.AttnMod.
Layer.Learn.CaLrn. -> Layer.Learn.CaLearn.

Also, this will be version 1.8.0 due to incompatibilities.
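For anyone updating their own parameter sets, the renames listed above can be applied mechanically. The following is a rough sketch under assumptions, not a tool that ships with axon: `MigrateParamPath` is a hypothetical helper, and the trailing field names in `main` (`Gbar`, `Norm`, `Gi`) are only illustrative. The rules are applied in the listed order, since the later ones assume `Layer.Act.` has already become `Layer.Acts.`.

```go
package main

import (
	"fmt"
	"strings"
)

// renames holds the old -> new path prefixes from the list above,
// in the same order, because later rules build on the first one.
var renames = []struct{ Old, New string }{
	{"Layer.Act.", "Layer.Acts."},
	{"Layer.Acts.GABAB.", "Layer.Acts.GabaB."},
	{"Layer.Acts.Spike.", "Layer.Acts.Spikes."},
	{"Layer.Acts.Attn.", "Layer.Acts.AttnMod."},
	{"Layer.Learn.CaLrn.", "Layer.Learn.CaLearn."},
}

// MigrateParamPath (hypothetical) rewrites a pre-1.8.0 parameter path
// to its new name. Paths that match no rule are returned unchanged.
func MigrateParamPath(path string) string {
	for _, r := range renames {
		if strings.HasPrefix(path, r.Old) {
			path = r.New + strings.TrimPrefix(path, r.Old)
		}
	}
	return path
}

func main() {
	// Field names after the renamed prefix are illustrative only.
	for _, p := range []string{
		"Layer.Act.GABAB.Gbar",
		"Layer.Learn.CaLrn.Norm",
		"Layer.Inhib.Layer.Gi",
	} {
		fmt.Println(p, "->", MigrateParamPath(p))
	}
}
```

Each rule ends in a `.`, so already-migrated paths such as `Layer.Acts.Spikes.` do not match the old `Layer.Acts.Spike.` prefix, and running the helper twice leaves paths unchanged.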

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 41.32% and project coverage change: -3.31 :warning:

Comparison is base (29dc4e4) 36.03% compared to head (64a9c4e) 32.73%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #229      +/-   ##
==========================================
- Coverage   36.03%   32.73%   -3.31%
==========================================
  Files          63       76      +13
  Lines       11497    13977    +2480
==========================================
+ Hits         4143     4575     +432
- Misses       7113     9138    +2025
- Partials      241      264      +23
```

| [Impacted Files](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer) | Coverage Δ | |
|---|---|---|
| [axon/deep\_layers.go](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer#diff-YXhvbi9kZWVwX2xheWVycy5nbw==) | `13.76% <0.00%> (ø)` | |
| [axon/deep\_net.go](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer#diff-YXhvbi9kZWVwX25ldC5nbw==) | `0.00% <0.00%> (ø)` | |
| [axon/globals.go](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer#diff-YXhvbi9nbG9iYWxzLmdv) | `0.00% <0.00%> (ø)` | |
| [axon/globalvars\_string.go](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer#diff-YXhvbi9nbG9iYWx2YXJzX3N0cmluZy5nbw==) | `0.00% <0.00%> (ø)` | |
| [axon/globalvtatype\_string.go](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer#diff-YXhvbi9nbG9iYWx2dGF0eXBlX3N0cmluZy5nbw==) | `0.00% <0.00%> (ø)` | |
| [axon/helpers.go](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer#diff-YXhvbi9oZWxwZXJzLmdv) | `0.00% <ø> (ø)` | |
| [axon/layervals.go](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer#diff-YXhvbi9sYXllcnZhbHMuZ28=) | `100.00% <ø> (ø)` | |
| [axon/logging.go](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer#diff-YXhvbi9sb2dnaW5nLmdv) | `0.00% <0.00%> (ø)` | |
| [axon/looper.go](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer#diff-YXhvbi9sb29wZXIuZ28=) | `0.00% <0.00%> (ø)` | |
| [axon/neuromod.go](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer#diff-YXhvbi9uZXVyb21vZC5nbw==) | `48.27% <ø> (-2.33%)` | :arrow_down: |
| ... and [43 more](https://app.codecov.io/gh/emer/axon/pull/229?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=emer) | |
